Official statement
Other statements from this same Google Search Central video (duration 1h00, published on 17/03/2020):
- 4:50 Why does your content disappear from search results despite flawless technical SEO?
- 10:32 Why does Google provide no Discover data in Analytics?
- 17:28 Should you still optimize your AMP pages under mobile-first indexing?
- 25:53 Can you migrate a multilingual site without implementing hreflang immediately?
- 29:05 How do you regain control of your Search Console after splitting with your SEO agency?
- 35:15 Should you really multiply or reduce your product pages for SEO?
- 35:20 Should you really create one page per product variant, or bet on consolidated pages?
- 44:07 Is loading speed really a decisive ranking factor?
- 47:08 Does Googlebot really keep cookies between crawl sessions?
Google recommends enabling the indexing of only one version of a category page (the one with the default sorting order) and setting all other variations to noindex. The objective is clear: to improve product discovery during crawling while avoiding the dilution of ranking signals across near-duplicate URLs. In practice, this directly impacts crawl budget management and facet architecture on high-volume e-commerce sites.
What you need to understand
Why does Google want to limit the indexing of category pages?
John Mueller's statement addresses a recurring issue on e-commerce sites: the proliferation of category URLs generated by filters and sorting options. Each variation (sorting by ascending price, descending, popularity, new arrivals) creates a distinct URL with almost identical content.
Google views these variations as internal duplicate content. Crawling and indexing all these versions dilutes the crawl budget and complicates the identification of the 'canonical' page to rank. The recommendation is to focus SEO juice on a single version — the one with the default sorting — to maximize its visibility.
What exactly is 'default' sorting?
Default sorting refers to the native display order of your products when a user arrives at a category without applying any filters. This could be sorted by algorithmic relevance, by new arrivals, or by best sellers — depending on your business logic.
The important thing is that this version is stable, consistent, and representative of the category. This is the version that Google should prioritize for indexing. All other variants (ascending price sort, descending, etc.) should have a meta robots noindex, follow tag so that Googlebot can follow the links to the products without indexing the page itself.
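For illustration, here is what the markup could look like on the two versions (the URLs and the explicit index directive are examples; omitting the tag entirely on the default version has the same effect):

```html
<!-- /category/shoes/ (default sort, the only indexable version) -->
<meta name="robots" content="index, follow">

<!-- /category/shoes/?sort=price_asc (sorting variant): Googlebot can
     follow the product links, but the page stays out of the index -->
<meta name="robots" content="noindex, follow">
```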
Does this rule apply to all types of sites?
No. The directive mainly targets large catalog e-commerce sites (thousands of products, hundreds of categories). For a site with 50 products and 10 categories, the issue does not even arise: the crawl budget is not a concern.
Conversely, on a marketplace with 100,000 listings and dozens of possible facets per category, the combinatorial explosion can generate millions of URLs: with 4 sort orders and just 10 binary filters, a single category already yields 4 × 2^10 ≈ 4,000 URL variants, so hundreds of categories quickly reach the millions. This is where selective noindex becomes strategic to avoid drowning Googlebot in redundant content.
- Allow one indexable version per category (default sort)
- Set all sorting variants to noindex (price, popularity, date, etc.)
- Keep follow so that links to the products are followed
- Use canonicals if variations are minor (but noindex is clearer)
- Monitor crawl budget via Search Console to measure impact
SEO Expert opinion
Is this recommendation consistent with real-world observations?
Yes, generally. Audits of e-commerce sites consistently show an explosion in the number of indexed URLs generated by sorting and filtering facets. Google crawls these pages and partially indexes them, which creates noise in the index: orphan pages, cannibalization, dilution of internal PageRank.
But be careful: default sorting is not always the best strategic choice. Some sites benefit from indexing the 'best sellers' or 'new arrivals' version depending on their business positioning. Google refers to 'default' but does not specify what that should be; choose it based on your own conversion and organic traffic data.
What nuances should be made to this directive?
First, Google does not say that other sorts are useless for crawling. The noindex, follow tag allows SEO juice to pass to product pages without indexing the intermediary page. This is a crucial distinction: we want Googlebot to follow the links, but not to index the page.
Next, this logic only works if your internal linking is solid. If products are only reachable via specific sorts (for example, if a product is visible only under 'ascending price sort'), setting those sort pages to noindex renders the product invisible to Google. Therefore, ensure all products are crawlable via the indexed version.
Finally, some sites have facets that generate pages with real editorial value: unique descriptions, enriched content, specific search intents. In this case, it may be legitimate to index multiple variants — but this is the exception, not the rule.
When does this rule not apply?
On low-volume sites, the crawl budget is not an issue. There is no need to fuss with noindex if you have a total of 200 URLs. Google crawls everything without concern.
Similarly, if you generate filter pages with unique and optimized content for long-tail queries (e.g., 'waterproof hiking shoes for women'), it may be pertinent to index them — as long as they provide real value and aren't pure duplicate content. But this is a judgment to make on a case-by-case basis, not a general rule.
Practical impact and recommendations
What should you do specifically on an e-commerce site?
First, identify all category URLs generated by sorting and filtering parameters. Use a crawler like Screaming Frog or Oncrawl to map the extent of the problem. Then check how many of these pages are indexed with a Google query such as site:example.com/category/ inurl:sort=.
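As a starting point, a short script can isolate those URLs from a crawl export. This is a minimal sketch assuming a CSV export (such as a Screaming Frog internal export) with an "Address" column; the column name and parameter list are assumptions to adapt to your tool and site:

```python
# Sketch: list crawled URLs that carry sorting/filter parameters.
# Assumes a CSV crawl export with an "Address" column (Screaming Frog
# style); adjust the column name and parameter set to your setup.
import csv
from urllib.parse import urlparse, parse_qs

SORT_FILTER_PARAMS = {"sort", "order", "filter", "price", "color"}  # adapt

def parameterized_urls(csv_path):
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            url = row.get("Address", "")
            # Keep the URL if its query string uses any watched parameter
            if SORT_FILTER_PARAMS & parse_qs(urlparse(url).query).keys():
                yield url

for url in parameterized_urls("crawl_export.csv"):
    print(url)
```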
Once the inventory is complete, define which version should remain indexable: the default sort. Technically, this means that the URL /category/shoes/ (without a parameter) is indexable, while /category/shoes/?sort=price_asc should be set to noindex.
Then implement the meta robots noindex, follow tag on all sorting variants. The 'follow' part is crucial: it allows Googlebot to crawl the links to the products without indexing the intermediary page. If you are using client-side JavaScript to handle sorts, ensure that the meta tag is present in the initial HTML, not injected afterward.
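As a sketch of that server-side logic (a hypothetical Flask app; the sort= parameter name and URL pattern are examples, not a prescribed implementation):

```python
# Hypothetical sketch: emit the robots directive in the initial HTML,
# server-side, so Googlebot sees it without executing any JavaScript.
from flask import Flask, request, render_template_string

app = Flask(__name__)

TEMPLATE = """<!doctype html>
<html>
  <head>
    <meta name="robots" content="{{ robots }}">
    <title>{{ category }}</title>
  </head>
  <body><!-- product grid --></body>
</html>"""

@app.route("/category/<category>/")
def category_page(category):
    # Any sort parameter turns the page into a non-indexable variant.
    robots = "noindex, follow" if "sort" in request.args else "index, follow"
    return render_template_string(TEMPLATE, robots=robots, category=category)
```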
What mistakes should be avoided during implementation?
Classic mistake #1: setting pages to noindex without checking that all products remain crawlable via the indexed version. If a product only appears in a specific sort (e.g., new arrivals), it becomes invisible to Google once that page is set to noindex. Make sure your default sort displays all the products in the category, or set up complete pagination.
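A quick diff can surface those at-risk products before you deploy. A minimal sketch, assuming two plain-text URL lists (file names are illustrative): the full catalog, and the product links found when crawling only the default-sort pages:

```python
# Sketch: flag products that would become unreachable for Googlebot
# once sorting variants are noindexed. Inputs are two text files with
# one URL per line (both file names are placeholders).
def read_urls(path):
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}

all_products = read_urls("all_product_urls.txt")
reachable = read_urls("products_linked_from_default_sort.txt")

for url in sorted(all_products - reachable):
    print("Not reachable via the indexable version:", url)
```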
Classic mistake #2: confusing noindex with disallow in the robots.txt. The robots.txt blocks crawling, so Google never sees the noindex directive. The page must be crawlable for the noindex to be taken into account. Do not block sorting URLs in the robots.txt — keep them crawlable with noindex.
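To make the distinction concrete, here is the anti-pattern next to the correct posture (the /*?sort= pattern is an example):

```
# WRONG: blocking crawl means Googlebot never reads the noindex tag
# User-agent: *
# Disallow: /*?sort=

# RIGHT: leave sorting URLs crawlable (no Disallow rule for them)
# and carry the noindex, follow directive in the page HTML instead
User-agent: *
Disallow:
```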
Finally, do not neglect tracking. After implementation, monitor the evolution of the number of indexed pages via Search Console, and ensure that the crawl budget is realigned towards high-value pages (product sheets, indexable categories). This process can take several weeks.
How can I check if my site complies with this recommendation?
Run a complete crawl and isolate all URLs containing sorting or filtering parameters. Check for the presence of the meta robots noindex, follow tag in the source code. Then compare with Google’s index via targeted site: queries.
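A lightweight spot check can confirm the tag ships in the raw server response rather than being injected by JavaScript. A minimal sketch with placeholder URLs; note the regex assumes the name attribute appears before content:

```python
# Sketch: fetch a few sorting URLs and verify the noindex directive is
# present in the initial HTML. URLs are placeholders; the regex is a
# rough check, not a full HTML parser.
import re
import urllib.request

URLS = [
    "https://example.com/category/shoes/?sort=price_asc",
    "https://example.com/category/shoes/?sort=price_desc",
]

NOINDEX_RE = re.compile(
    r'<meta[^>]+name=["\']robots["\'][^>]+content=["\'][^"\']*noindex',
    re.IGNORECASE,
)

for url in URLS:
    html = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")
    verdict = "OK, noindex found" if NOINDEX_RE.search(html) else "MISSING noindex"
    print(f"{verdict}: {url}")
```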
In Search Console, check the Coverage report and filter by status 'Excluded by noindex tag'. You should see all your sorting pages there. If they still appear in 'Indexed', it means the directive is not properly implemented or not yet taken into account.
- Map all category URLs with sorting/filter parameters
- Define which version remains indexable (default sort = URL without parameter)
- Implement meta robots noindex, follow on all variants
- Ensure all products remain crawlable via the indexed version
- Never block these URLs in the robots.txt (the noindex must be crawled)
- Monitor the evolution of the number of indexed pages in Search Console
❓ Frequently Asked Questions
Should I use canonical or noindex for sorting pages?
What happens if I block sorting URLs in robots.txt?
Should I also set price or color filters to noindex?
How long does it take to see the impact of noindex on indexing?
Should the default sort be parameter-free in the URL?