Official statement
Other statements from this video (Google Search Central, published 03/02/2026)
- Should you really block faceted navigation in robots.txt?
- Are action parameters in your URLs sabotaging your crawl budget?
- Why does Google intervene directly in the code of WordPress plugins?
- Do short URL parameters really put your crawl budget at risk?
- Should you really get rid of session IDs in your URLs?
- Why are your WordPress calendar parameters sabotaging your crawl budget?
- Is double URL encoding really killing your crawl budget?
- Why does Googlebot have to crawl a new site massively before knowing whether it's worth it?
- Should you wait 24 hours for a robots.txt change to be taken into account?
- Should you abandon GET parameters to protect your crawl budget?
Gary Illyes reveals that faceted navigation accounts for nearly 50% of the crawl problem reports Google receives. The URL combinations generated by filters and sorting on e-commerce sites overwhelm servers and force Googlebot to crawl a massive volume of URLs just to determine which ones matter. It's a clear signal: the majority of e-commerce sites still haven't mastered this fundamental issue.
What you need to understand
What is faceted navigation and why is it such a problem?
Faceted navigation is the system of filters and sorting you find on every e-commerce site. Color, size, price, brand, customer rating: each combination generates a unique URL. On a catalog of 1,000 products with 5 filters of 3 options each, once filters can be combined and sorting and pagination multiply in, you can easily end up with hundreds of thousands of URLs.
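To get a feel for the scale, here is a back-of-envelope sketch; the facet counts are illustrative assumptions, not figures from the video:

```python
# Rough estimate of filter-combination explosion.
# Assumptions (illustrative): 5 facets with 3 options each, any subset
# of options selectable per facet, 4 sort orders, ~5 pages of pagination.

facets = 5
options_per_facet = 3
sort_orders = 4
pages_per_listing = 5

# Each facet contributes 2**options selectable subsets (incl. "no filter").
combinations = (2 ** options_per_facet) ** facets      # 8**5 = 32,768
urls = combinations * sort_orders * pages_per_listing  # 655,360

print(f"{combinations:,} filter combinations")
print(f"{urls:,} crawlable URLs once sorting and pagination multiply in")
```

Swap in your own facet counts: the growth is exponential in the number of facets, so the total crosses into the hundreds of thousands long before the catalog itself gets big.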
The problem? Googlebot doesn't know upfront which ones are useful. It must crawl massively to understand which pages deserve to be indexed and which are simply duplicates or empty combinations. Result: your server takes the hit with thousands of unnecessary requests.
Why does Google receive so many reports on this specific issue?
Because it's a recurring structural problem. Faceted navigation is technically simple to implement from a development perspective, but catastrophic for SEO if not properly managed. E-commerce platforms generate these URLs by default, without distinguishing between what's relevant and what isn't.
And let's be honest — many sites end up with wasted crawl budgets, server timeouts, or worse, relevant pages that never get crawled because Googlebot exhausts itself on absurd filter combinations.
What does this 50% figure really mean in concrete terms?
It means that among all crawl problems reported to Google, half involve faceted navigation. That's huge. It shows that despite years of articles and recommendations, the majority of e-commerce sites still haven't sorted out this fundamental issue.
It's also a clear indicator: if you manage a site with filters, chances are high that you have a latent crawl problem you haven't even detected yet.
- 50% of crawl reports involve facets — it's the number one structural problem in e-commerce
- URL combinations explode rapidly with multiple active filters
- Googlebot must crawl extensively to sort relevant from superfluous
- Servers can be overwhelmed if URL volume isn't controlled
- This isn't a new problem, but it remains largely unsolved
SEO expert opinion
Is this statement consistent with what we observe in the field?
Absolutely. Every SEO working in e-commerce knows this nightmare. You audit a site, you open Search Console, and you see thousands of crawled URLs corresponding to empty or redundant filter combinations. Crawl budget gets consumed by useless pages while strategic product pages remain overlooked.
What's interesting is that Google is saying this openly: it's their top reported issue. That validates what we've been repeating for years — faceted navigation, if not designed with SEO in mind, is a technical disaster.
What nuances should we add to this statement?
The 50% figure doesn't mean 50% of sites have this problem — but rather that 50% of crawl problem reports concern this issue. Important distinction. It can also reflect the fact that e-commerce sites are overrepresented in these reports, simply because they mechanically generate more URLs.
Another point: Google says Googlebot must crawl to determine relevance. Let's be clear, it's our job to make that easier for Googlebot. If you wait for Googlebot to figure out on its own which pages are relevant, you'll wait a long time and waste crawl budget along the way. Canonical tags, noindex, robots.txt: the tools to manage this properly exist (note that Search Console's dedicated URL Parameters tool was retired in 2022).
In what cases does this rule not apply?
If you manage a brochure site or a blog, this problem probably doesn't concern you. Faceted navigation is really an issue for e-commerce, marketplaces, and classified ad sites — anything that offers multiple filters across a large catalog.
Now, even on a medium-sized e-commerce site, if you've properly configured your canonical tags and parameter handling, you can limit the damage. The problem mainly affects those who've let the situation spiral, or who run a CMS that generates everything without safeguards.
Practical impact and recommendations
What concrete steps should you take on an e-commerce site?
First step: identify the relevant filter combinations. Not all filtered pages are equal. A page for "red shoes" might have search volume. A page for "red shoes size 42 leather with express shipping" probably has zero SEO value.
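One pragmatic way to encode that distinction is an allowlist heuristic. A minimal sketch, assuming hypothetical facet names and a site-specific allowlist:

```python
# Minimal indexability heuristic for filtered listings (illustrative).
# Assumption: only single-facet pages on an allowlist of facets with
# real search demand deserve an indexable, self-canonical URL.

INDEXABLE_FACETS = {"color", "brand"}  # hypothetical, site-specific

def should_index(active_filters: dict[str, str]) -> bool:
    """For a filtered listing: True means an indexable, self-canonical
    URL; False means canonical to the parent category (or noindex)."""
    if len(active_filters) != 1:
        return False  # multi-facet combos rarely match real queries
    (facet,) = active_filters
    return facet in INDEXABLE_FACETS

print(should_index({"color": "red"}))                # True
print(should_index({"color": "red", "size": "42"}))  # False
```

In practice the allowlist comes from keyword research: keep the facets whose single-filter pages match real queries, and send everything else to the parent page.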
Next, you need to block indexing of the useless combinations. Several methods: canonical to the parent page, noindex on filtered pages without potential, robots.txt to block specific URL parameters. The choice depends on your architecture and goals.
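For the first two methods, the signal is a single line in the page head. A sketch with placeholder URLs (example.com and the paths are illustrative):

```html
<!-- Option 1: consolidate a no-value combination onto its parent page -->
<link rel="canonical" href="https://example.com/shoes/red/" />

<!-- Option 2: keep the page crawlable (links still followed) but out of the index -->
<meta name="robots" content="noindex, follow" />
```

Pick one signal per page: Google has warned against combining a noindex with a canonical pointing elsewhere, because the two send conflicting messages.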
What mistakes should you avoid at all costs?
The classic mistake is leaving everything open. No canonicals, no noindex, all combinations crawlable and indexable. Result: your server slows down, your crawl budget evaporates, and you end up with massive duplicate content.
Another trap: blocking too aggressively in robots.txt. If you prevent Googlebot from crawling these pages, it can't follow internal links that pass through them, and it will never see a noindex placed on a blocked page, since that page is never fetched. You risk cutting off crawl paths to important pages. Better to allow crawling but control indexing with noindex or canonical.
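If you do reach for robots.txt, keep the patterns surgical and target only parameters that never carry SEO value. A minimal sketch, with hypothetical parameter names:

```
# robots.txt (wildcard syntax supported by Googlebot)
User-agent: *
# Block only parameters that never carry SEO value (hypothetical names):
Disallow: /*?*sort=
Disallow: /*?*sessionid=
# Caveat: a disallowed URL is never fetched, so Googlebot can neither
# see a noindex on that page nor follow the links it contains.
```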
How do you verify your site is properly configured?
Open Search Console and look at pages that are crawled but not indexed. If you see hundreds of URLs with filter parameters, that's a bad sign. Also check server logs: how many requests does Googlebot make on filtered URLs?
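For the log check, a few lines of scripting go a long way. A minimal sketch, assuming a combined-format access log at a hypothetical path and guessing at your filter parameter names:

```python
# Count Googlebot requests on filtered URLs from an access log.
# Assumptions (hypothetical): combined log format, filter parameters
# named color/size/sort/price, log file at ./access.log.
import re
from collections import Counter

FILTER_PARAMS = ("color=", "size=", "sort=", "price=")

hits: Counter[str] = Counter()
with open("access.log") as log:
    for line in log:
        if "Googlebot" not in line:
            continue
        m = re.search(r'"(?:GET|HEAD) (\S+) HTTP', line)
        if m and any(p in m.group(1) for p in FILTER_PARAMS):
            hits[m.group(1).split("?", 1)[0]] += 1  # group by path

for path, count in hits.most_common(10):
    print(f"{count:6d}  {path}")
```

If the top paths are category pages drowning in filtered variants, you've found where the crawl budget is going. (For rigor, verify Googlebot's IP ranges too: user-agent strings can be spoofed.)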
Use the URL inspection tool to test a few combinations. If Google says "canonical URL differs from requested URL," that's good — it means your canonical is working. If all combinations are treated as canonical, there's a problem.
- Audit filtered URLs crawled in Search Console
- Identify filter combinations with high SEO potential vs. useless ones
- Implement canonical tags to parent pages for combinations without value
- Use noindex on redundant or empty filtered pages
- Handle URL parameters with canonicals and robots.txt rules (Search Console's URL Parameters tool was retired in 2022)
- Avoid blocking crawl in robots.txt if it cuts internal navigation paths
- Regularly monitor server logs to detect excessive crawling
- Test filtered pages with the URL inspection tool to validate canonicals
❓ Frequently Asked Questions
Should I block all my filtered pages with noindex?
Canonical or noindex for managing facets?
Is robots.txt enough to manage facet crawling?
How do I know if my facets are causing problems?
Are URL parameters in Search Console still useful?