Official statement
Google recommends marking internal results pages as noindex while allowing follow links. The goal is to let the robot discover destination pages without wasting crawl budget on pages without editorial value. This strategy helps optimize the flow of internal PageRank while avoiding pollution of the index with thousands of filter pages.
What you need to understand
Why does Google differentiate between noindex and nofollow on internal results pages?
The confusion dates from a time when noindex and nofollow were often used together out of habit. However, the two directives serve different purposes. The noindex directive tells Google not to store the page in its index, while nofollow instructs it not to follow the links on that page.
On an internal search results page (filters, facets, sorting by price, etc.), the content of the page itself often has no value: it's a dynamic aggregation of products or articles already indexed elsewhere. But the links to product or article details that it contains are valuable for crawling.
What is the concrete mechanism of this strategy?
When you apply noindex, follow on an e-commerce filter page (e.g., "Red shoes size 42"), Googlebot reads the page, follows all the links it contains, but does not add this filter page to the index. Result: your product detail pages are discovered quickly, without thousands of filter combinations ending up in the index.
This is particularly powerful on large catalogs. A site with 500 products but 20,000 possible filter combinations can keep its index clean by only allowing indexing of high-value pages. Google reaches the products without the index getting bogged down in endless pagination.
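The decoupling described above can be illustrated with a small parser. This is a minimal sketch under stated assumptions, not Google's implementation: `RobotsMetaParser` and `robots_policy` are hypothetical names, and the code only reads `<meta name="robots">` tags, ignoring the X-Robots-Tag header and bot-specific meta tags.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            for token in content.split(","):
                if token.strip():
                    self.directives.add(token.strip().lower())


def robots_policy(html):
    """Return (indexable, links_followed) for a page's HTML (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    # "none" is shorthand for "noindex, nofollow".
    indexable = not (found & {"noindex", "none"})
    links_followed = not (found & {"nofollow", "none"})
    return indexable, links_followed
```

For a filter page carrying `noindex, follow`, the two answers differ: the page is not indexable, yet its links are still followed. A page with no robots meta tag defaults to both being allowed.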
Does this directive apply only to search results pages?
No. The logic applies to any type of intermediary page: pagination pages, sorting pages, pages with redundant tags, date archives on a blog. Whenever a page only acts as a navigation hub without providing unique content, noindex, follow is a relevant option.
However, caution is advised: if a category page contains original editorial content (optimized intro, FAQ, guides), then it probably deserves to be indexed. The rule is not absolute; it depends on the actual editorial value of each type of page.
- Noindex, follow allows for the decoupling of indexing and crawling: Google follows links without storing the page
- Particularly effective on low editorial value filter, facet, and pagination pages
- Preserves crawl budget by avoiding the indexing of thousands of redundant pages
- PageRank continues to flow to target pages via follow links
- Do not apply blindly: assess the editorial value of each type of page before deciding
SEO Expert opinion
Is this statement consistent with ground observations?
Yes, and it is even one of the few areas where Google is perfectly transparent. Tests conducted on medium-sized e-commerce sites (5,000 to 50,000 products) show that switching from noindex, nofollow to noindex, follow on filter pages accelerates the discovery of new products by 30 to 50% on average.
The historical trap was completely blocking crawl via robots.txt or nofollow, creating crawl orphans: products only discoverable via the XML sitemap, thus crawled with a significant delay. Allowing follow links resolves this issue without polluting the index.
In what cases does this rule not apply?
First exception: category pages with substantial editorial content. If you have written 800 words on "How to choose trail shoes", with well-structured internal linking and relevant external links, that page belongs in the index. Noindex would be a mistake.
Second exception: sites with an extremely generous crawl budget. On a blog with 200 articles, Google already crawls everything effortlessly. Complicating directive management to gain 0.2% more crawl budget makes no sense. The noindex/follow strategy is designed for catalogs with thousands of pages.
Third nuance: some poorly configured CMSs interpret noindex, follow as a contradictory signal, causing erratic behavior. Verify this on your technical stack before a large-scale deployment. Test first on a sample of 50 to 100 pages.
What common mistake does this directive help avoid?
The classic mistake: blocking facets in robots.txt to "save crawl budget", while hoping Google still finds the products via the sitemap. Result: products are indexed but with a delay of several weeks and without benefiting from the PageRank passed by the category pages.
With noindex, follow, facets become effective relay pages: they guide Googlebot to product details without consuming indexing quota. It's an optimal balance between quick discovery and index cleanliness. No magic, just logic.
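The robots.txt trap can be made concrete with Python's standard `urllib.robotparser`. This is only a sketch: the rules and the `shop.example` URLs are hypothetical, and the standard-library parser does simple prefix matching, which is an approximation of how Googlebot evaluates robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks internal search/facet URLs outright.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A blocked facet page is never fetched, so neither its meta robots tag
# nor its links to product pages can be read by the crawler.
facet_allowed = parser.can_fetch("Googlebot", "https://shop.example/search?color=red")

# Product pages stay crawlable, but in this scenario they are only
# discoverable through other routes such as the XML sitemap.
product_allowed = parser.can_fetch("Googlebot", "https://shop.example/product/trail-shoe")
```

With the rule removed and a noindex, follow meta tag on the facet page instead, the crawler can fetch the page and reach the product links it contains.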
Practical impact and recommendations
What should you actually do on an e-commerce site?
Start by mapping your types of pages: main categories, subcategories, price filters, color filters, size, brand, sorting pages, pagination. For each type, ask yourself: "Does this page provide unique content or is it just a recombination of elements already indexed elsewhere?"
Then, apply the meta robots noindex, follow directive on types of pages with low value. In practice, this is often done through rules in the CMS or the template. On Shopify, PrestaShop, or WooCommerce, most SEO modules allow you to define these rules by URL pattern (e.g., any URL containing "?sort=" or "?filter=").
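A URL-pattern rule of this kind can be sketched in a few lines. The function name `robots_directive` and the parameter set are assumptions for illustration: `sort` and `filter` come from the example above, while `page` is a hypothetical pagination parameter. A real CMS module will expose its own configuration instead.

```python
from urllib.parse import parse_qs, urlsplit

# Query parameters that mark a low-value page (assumed names, adapt to your site).
LOW_VALUE_PARAMS = {"sort", "filter", "page"}


def robots_directive(url):
    """Return the meta robots value to render for this URL (hypothetical rule)."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if not LOW_VALUE_PARAMS.isdisjoint(query):
        # Navigation hub: keep it out of the index, keep its links crawlable.
        return "noindex, follow"
    return "index, follow"
```

The template would then output the returned value in `<meta name="robots" content="...">`. Matching on parsed parameter names rather than raw substrings avoids false positives such as a product slug that happens to contain "sort".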
How can you verify that the strategy is working?
Monitor the Coverage section of Google Search Console (named "Page indexing" in current versions). Noindex pages should appear under “Excluded by ‘noindex’ tag.” If they persist under “Indexed,” Google has either not yet recrawled them or cannot read the directive (possible causes: aggressive caching, a misplaced directive in the HTML, or a conflict with the XML sitemap).
Also, check the server logs: Googlebot should continue to crawl noindex pages at a reasonable frequency (a sign that it is properly following the links) but not indexing them. If crawl of these pages drops sharply, it means Google is treating them as blocked content, which is not the desired effect.
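The log check can be sketched as a daily count of Googlebot requests to noindex pages. This assumes logs in the common "combined" access-log format and identifies Googlebot by user-agent string only, which can be spoofed; `googlebot_hits_per_day` and the `is_noindex_path` callback are hypothetical names.

```python
import re
from collections import Counter

# Matches a combined-format access log line: date, request path, status, user agent.
LOG_LINE = re.compile(
    r'\[(?P<day>[^:]+):[^\]]*\] '
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} .*"(?P<agent>[^"]*)"$'
)


def googlebot_hits_per_day(log_lines, is_noindex_path):
    """Count Googlebot requests to noindex pages, grouped by day."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line.strip())
        if match is None or "Googlebot" not in match.group("agent"):
            continue
        if is_noindex_path(match.group("path")):
            hits[match.group("day")] += 1
    return hits
```

A typical call passes an open log file and a predicate such as `lambda p: "?sort=" in p or "?filter=" in p`. A sharp, sustained drop in the daily counts is the signal worth investigating.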
What mistakes should you avoid during deployment?
Never apply noindex on a page that is already receiving qualified organic traffic. Before making any changes, export Search Console data to identify filter or pagination pages generating clicks. Some filter combinations capture profitable long-tail queries.
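This pre-deployment check can be sketched as a filter over a Search Console performance export. The column names `page` and `clicks` are assumptions (the real export headers may differ, so they are parameters), and `pages_to_keep_indexed` is a hypothetical helper.

```python
import csv


def pages_to_keep_indexed(csv_file, is_candidate, min_clicks=1,
                          url_col="page", clicks_col="clicks"):
    """List noindex candidates that already earn clicks, most clicked first.

    csv_file: an open text file (or any iterable of CSV lines) with a header row.
    is_candidate: predicate telling whether a URL is a filter/pagination page.
    """
    keep = []
    for row in csv.DictReader(csv_file):
        url = row[url_col]
        clicks = int(row[clicks_col])
        if is_candidate(url) and clicks >= min_clicks:
            keep.append((url, clicks))
    return sorted(keep, key=lambda item: item[1], reverse=True)
```

Every URL returned is a page to review by hand before it receives noindex, since it may be capturing a long-tail query.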
Also, avoid coupling noindex with a canonical pointing to another page. Google interprets this combination as contradictory: the canonical says, "this page is a duplicate of X," while noindex says, "do not store this page." Choose one or the other depending on context, never both simultaneously.
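The noindex-plus-canonical conflict is easy to detect automatically. The sketch below is an assumption-laden illustration: `HeadSignals` and `has_conflicting_signals` are hypothetical names, and the comparison is a plain string match between the canonical href and the page URL, without URL normalization.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Records whether a page declares noindex and which canonical URL it names."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            tokens = {t.strip().lower() for t in content.split(",")}
            if tokens & {"noindex", "none"}:
                self.noindex = True
        elif tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            self.canonical = attr.get("href")


def has_conflicting_signals(html, page_url):
    """True when the page is noindex AND canonicalizes to a different URL."""
    signals = HeadSignals()
    signals.feed(html)
    return signals.noindex and signals.canonical not in (None, page_url)
```

A self-referencing canonical on a noindex page is not flagged here, since it does not point to another page.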
- Audit the types of pages and identify those without unique editorial value
- Implement noindex, follow via the meta robots tag (rather than the X-Robots-Tag HTTP header if possible, for better control)
- Exclude these pages from the XML sitemap for consistency (Google generally does not index sitemap URLs that carry noindex, but it's best to avoid mixed signals)
- Check in Search Console that the pages are indeed marked as “Excluded by noindex” within 2 to 4 weeks
- Monitor logs to confirm that Googlebot continues to crawl these pages regularly
- Test first on a reduced sample (50-100 pages) before large-scale deployment
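The sitemap consistency item in the checklist above can be sketched as a scan for noindex URLs that are still listed. This assumes a standard `urlset` sitemap using the sitemaps.org namespace (not a sitemap index file); `noindex_urls_in_sitemap` and its predicate are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def noindex_urls_in_sitemap(sitemap_xml, is_noindex_url):
    """Return sitemap URLs that the site's own rules mark as noindex."""
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if is_noindex_url(url)]
```

An empty result means the sitemap and the noindex rules agree. Any URL returned should be dropped from the sitemap generator's output.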
❓ Frequently Asked Questions
Can noindex, follow be used on all pagination pages?
Should noindex pages be removed from the XML sitemap?
Does PageRank really flow through followed links on a noindex page?
What should you do if a filter page already generates organic traffic?
Does this strategy work on non-e-commerce sites?
Other SEO insights extracted from this same Google Search Central video · duration 57 min · published on 23/09/2016