Official statement
Other statements from this video
- 3:17 Is mobile speed really a game-changing ranking factor?
- 3:50 Why does PageSpeed Insights now include real user data alongside simulated scores?
- 12:33 Should you noindex your e-commerce site's empty cart pages?
- 14:35 Should you really mark up each customer review individually with structured data?
- 35:10 Can canonical tags block the indexing of your strategic pages?
- 65:00 How does Google really judge the quality of a multilingual site?
- 71:20 Can DMCA complaints really make your pages disappear from Google?
- 73:20 Google Search Console: why do 16 months of data really change the game for your SEO?
- 80:00 Does PageSpeed Insights really measure your site's actual performance?
Google claims that comment fields without added value dilute a page's relevance and should not be indexed. For SEOs, this necessitates a review of the technical management of comment sections via meta robots or JavaScript obfuscation. The real challenge lies in defining the threshold for 'added value' and automating the sorting between useful and spammy comments.
What you need to understand
Why does Google care about comments on your pages?
The crawl budget and algorithmic relevance are at the heart of this statement. When an indexed page contains 200 generic comments like 'Great article!' or 'Thanks for sharing', the signal-to-noise ratio collapses. Google's semantic algorithms struggle to extract the main topic of the page.
Dilution works both ways. On one hand, editorial content loses relative weight in the overall HTML analysis. On the other hand, spammy keywords from comments can cause the page to rank for completely irrelevant queries, undermining the site's thematic coherence.
What qualifies as a 'non-added-value' comment for Google?
The phrasing remains vague, which is a problem. Google does not provide any quantitative criteria: minimum word count, density of subject-related keywords, or grammatical structure. An 8-word comment can be highly relevant (a documented factual correction), while a 150-word block can be pure spam.
In practice, likely signals include: length under 10-15 words, absence of subject-specific vocabulary, a high duplication rate (copy-pasted phrasing), and the presence of unnatural outbound links. However, none of this is officially confirmed, which complicates any reliable automation of the sorting.
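Absent official criteria, any automation can only encode these suspected signals as heuristics. A minimal sketch, where the 10-word floor, the generic-phrase list, and the `topic_vocabulary` parameter are all illustrative assumptions, not Google rules:

```python
import re
from collections import Counter

# Heuristic sketch of the likely signals listed above. All thresholds and
# phrase lists are illustrative assumptions, not official Google criteria.
GENERIC_PHRASES = {"great article", "thanks for sharing", "nice post"}

def quality_signals(comment: str, topic_vocabulary: set[str]) -> dict:
    words = re.findall(r"[a-z']+", comment.lower())
    return {
        "too_short": len(words) < 10,                            # under the assumed 10-word floor
        "generic": comment.strip().lower().rstrip("!. ") in GENERIC_PHRASES,
        "off_topic": not topic_vocabulary.intersection(words),   # no subject-specific vocabulary
        "linky": comment.lower().count("http") > 1,              # multiple outbound links
    }

def looks_spammy(comment: str, topic_vocabulary: set[str]) -> bool:
    return any(quality_signals(comment, topic_vocabulary).values())

def duplication_rate(comments: list[str]) -> float:
    """Share of comments that are exact duplicates of another one (copy-paste detection)."""
    counts = Counter(c.strip().lower() for c in comments)
    duplicated = sum(n for n in counts.values() if n > 1)
    return duplicated / len(comments) if comments else 0.0
```

For example, `looks_spammy("Great article!", {"crawl", "indexing"})` trips both the length and generic-phrase signals, while a 16-word comment using on-topic vocabulary passes.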
How does this directive impact actual indexing?
Technically, Google does not automatically de-index pages with weak comments. The statement invites webmasters to voluntarily exclude these sections from the index via robots tags or asynchronous post-render loading, or at least to keep them out of search snippets with data-nosnippet attributes. It is an editorial responsibility, not a direct algorithmic penalty.
The main risk concerns sites with a high volume of comments (forums, media, e-commerce). A product page with 500 2-word reviews can see its internal PageRank diluted, its crawl time wasted, and its ability to rank for precise long-tail queries reduced. The engine spends more time analyzing noise rather than understanding the commercial signal.
- Semantic dilution: the ratio of editorial content to comments collapses, disrupting topical analysis
- Wasted crawl budget: on large sites, each URL with indexed comments consumes crawl resources
- Risk of off-topic ranking: spammy keywords in comments can attract unqualified traffic
- Internal PageRank impact: links in spam comments dilute link equity if not nofollow
- No official quantitative criteria: impossible to programmatically define a reliable 'added value' threshold
SEO Expert opinion
Does this statement align with real-world observations?
Yes, but with important sector-specific nuances. Niche sites with low comment volume generally see no negative impact, even with short contributions. Conversely, high-traffic media sites that have migrated their comments to iframes or asynchronous systems (Disqus, Coral, custom JavaScript solutions) often report an improvement in rankings for their main queries.
The problem particularly arises on aging e-commerce sites where 10 years of unmanaged product comments create an indexing burden. Purging or de-indexing these sections frees crawl budget and clarifies the semantic profile of product listings. But be cautious: abruptly removing hundreds of thousands of URLs can lead to massive 404 errors and temporarily destabilize the site.
In what cases does this rule not really apply?
Specialized forums and Q&A sites (Stack Overflow, Reddit, Quora) represent a significant exception. Here, 'comments' (answers, threads) are the main content, not an appendix. Google treats them differently: each answer is analyzed as its own semantic entity, potentially eligible for featured snippets.
Authentic review sites (Trustpilot, G2, Capterra) also largely escape this directive. Even short 15-word reviews bring content freshness, trust signals (E-E-A-T), and real user vocabulary that enriches long-tail content. De-indexing these sections would essentially undermine a major competitive advantage.
What contradictions or gray areas remain?
Google does not clarify how its algorithms automatically detect 'added value'. [To be verified]: no official confirmation on the existence of a comment quality score in the ranking algorithm. The use of NLP models (sentiment analysis, informational density) is suspected, but this is reverse engineering, not official documentation.
Another gray area: the directive does not mention mixed comments. On a page with 80% weak comments and 20% dense, relevant contributions, what approach should be taken? De-indexing the entire block removes good comments. Keeping everything indexed dilutes the page. Hybrid solutions (selective indexing via microdata, dynamic scoring) are complex to implement and never publicly validated by Google.
Practical impact and recommendations
What should you do concretely on your existing pages?
Start with a quantitative audit of your comment sections. Extract via crawl or database the number of comments per page, their average length, and duplication rate. Identify pages where the comments/editorial content ratio exceeds 2:1 in word volume. These are your priorities.
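The 2:1 ratio check above is easy to script once word counts are extracted. A minimal sketch, assuming hypothetical field names from your crawl or database export:

```python
# Illustrative audit: given per-page word counts from a crawl or the CMS
# database, flag pages where comments outweigh editorial content by more
# than 2:1. The dict keys are assumed field names; adapt to your export.
def priority_pages(pages: list[dict], max_ratio: float = 2.0) -> list[str]:
    flagged = []
    for page in pages:
        editorial = page["editorial_words"]
        comments = page["comment_words"]
        if editorial > 0 and comments / editorial > max_ratio:
            flagged.append(page["url"])
    return flagged

pages = [
    {"url": "/guide-crawl", "editorial_words": 1200, "comment_words": 900},
    {"url": "/old-news", "editorial_words": 300, "comment_words": 4000},
]
# /old-news, with a roughly 13:1 ratio, becomes a priority
```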
For new comments, implement threshold-based moderation. Comments under 20 words or without subject-related vocabulary are automatically excluded: either omitted from the indexable HTML rendering or marked with a data-nosnippet attribute (which keeps them out of search snippets, though not out of the index). Comments validated manually or through NLP scoring remain indexable. Test this logic on a sample before a global rollout.
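As one way to implement that gate, a sketch of the threshold logic; the 20-word minimum and the vocabulary set are assumptions to tune per site, and an NLP score could replace the vocabulary check in production:

```python
# Hypothetical indexability gate for new comments. min_words and
# topic_vocabulary are site-specific assumptions, not fixed rules.
def is_indexable(comment: str, topic_vocabulary: set[str], min_words: int = 20) -> bool:
    words = comment.lower().split()
    if len(words) < min_words:
        return False  # under the moderation threshold: exclude from indexable HTML
    return bool(topic_vocabulary.intersection(words))
```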
What technical errors should you absolutely avoid?
Never block comments via robots.txt. Google would no longer crawl those URLs to understand what they contain, and you would lose all granular control. Instead, use meta robots 'noindex' tags on paginated comment URLs, or data-nosnippet on HTML blocks to keep them out of search snippets (note that data-nosnippet does not de-index the page itself).
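For illustration, the two mechanisms could look like this; the URL pattern and class name are placeholders:

```html
<!-- On a paginated comment URL (e.g. /article/comments/page/3),
     keep the page crawlable but out of the index: -->
<meta name="robots" content="noindex, follow">

<!-- On the main article, exclude a low-value comment block from
     search snippets only (the page itself stays indexed): -->
<div class="comments" data-nosnippet>
  Great article! Thanks for sharing.
</div>
```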
Avoid setting all your comments to nofollow by default if you have an engaged community. Contextual internal links in quality comments (citations from other articles, cross-references) hold real value for internal linking. Only non-editorial external links should be systematically nofollowed to prevent spam.
How to check the impact after modification?
Monitor in Search Console the evolution of the number of indexed pages after de-indexing weak comments. A sharp drop is normal, but ensure that the mother pages (articles, product listings) remain well indexed. Also check the Coverage report for any 404 errors if you've removed comment URLs.
On the ranking side, track positions on your main queries before and after. Wait 4-6 weeks to observe a stabilized impact, time for Google to recrawl and reevaluate cleaned pages. If you notice a drop, audit quickly: you may have accidentally de-indexed useful content or created orphans in the internal linking.
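The before/after tracking can be as simple as diffing two exports from your rank tracker. A sketch assuming CSV files with `query` and `position` columns (adapt to your tool's actual export format):

```python
import csv

# Compare two ranking exports taken before and after the change.
# Assumed file layout: CSV with "query" and "position" columns.
def position_deltas(before_csv: str, after_csv: str) -> dict[str, float]:
    def load(path):
        with open(path, newline="") as f:
            return {row["query"]: float(row["position"]) for row in csv.DictReader(f)}
    before, after = load(before_csv), load(after_csv)
    # Negative delta means the query moved up (position 8 -> 5 is an improvement)
    return {q: after[q] - before[q] for q in before if q in after}
```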
- Audit the editorial content/comments ratio across the entire site (crawl + DB)
- Define a minimal quality threshold (length, vocabulary, NLP scoring) for indexing
- Implement data-nosnippet or noindex meta robots on comments below the threshold
- Test on a sample of pages before a global rollout
- Monitor indexing evolution in Search Console (coverage, 404 errors)
- Track positions on main queries for 6 weeks post-modification
❓ Frequently Asked Questions
Should you physically delete old irrelevant comments or just de-index them?
Are Facebook or Disqus comments embedded in iframes automatically excluded from indexing?
Does a comment with an internal link to another article on the site have value for internal linking?
How can you automate the detection of 'no-value' comments without intensive human moderation?
Should 2-3 word product reviews on an e-commerce site be de-indexed?
🎥 From the same video
Other SEO insights extracted from this same Google Search Central video · duration 1h04 · published on 26/01/2018