Official statement

The 'Cached' option shows a version of the page that we have cached, which can be controlled via the noarchive meta tag. 'Similar' shows other pages identified as similar by our algorithms.
2:19
🎥 Source video

Extracted from a Google Search Central video

⏱ 1h05 💬 EN 📅 20/10/2017 ✂ 29 statements
TL;DR

Google offers two distinct features: the 'Cached' option displays an archived copy of a page stored on Google's servers, while 'Similar' lists other pages that its algorithms judge to be close. The noarchive meta tag prevents the cached copy from being shown. These tools give SEOs direct control over the visibility of archived versions and insight into the thematic clustering perceived by Google.

What you need to understand

What sets 'Cached' apart from 'Similar'?

The 'Cached' option provides access to a copy of the page stored by Google's servers during the last crawl. This frozen version serves as a reference when the site is down or when a page has been modified. It's a technical snapshot, not a semantic analysis.

The 'Similar' option relies on Google's clustering algorithms. It identifies other web pages sharing thematic, structural, or semantic characteristics. It reveals how Google categorizes content within its index.

How does the noarchive meta tag work?

The noarchive meta tag is inserted into the <head> of the HTML page: <meta name="robots" content="noarchive">. It instructs Google not to provide the 'Cached' link in search results. The page is still crawled and indexed normally, but its cached copy is no longer publicly accessible.

This directive applies to all bots that respect the robots meta tag standard. Google consistently adheres to it, unlike some optional directives. It's a binary control: either the cache is visible, or it isn't.
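As an illustration, the check for this tag can be sketched in Python with the standard library's HTML parser. This is a minimal sketch, not an official tool: the names RobotsMetaParser, robots_directives, and sample_page are hypothetical, and only <meta name="robots"> tags are inspected.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set[str]:
    """Return the set of robots meta directives declared in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only for illustration.
sample_page = (
    '<html><head><meta name="robots" content="noarchive"></head>'
    "<body></body></html>"
)
print("noarchive" in robots_directives(sample_page))
```

In practice the HTML would come from fetching the live page; the parsing step stays the same.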

When should you use this feature?

Websites whose content changes frequently (prices, availability, news) have good reason to block the cache. Displaying outdated information can confuse users and degrade their experience. E-commerce platforms often block the cached copies of their product pages to prevent outdated pricing from circulating.

Sensitive pages containing personal or confidential data also justify this directive. Even if the content is removed or modified, the cached version remains accessible for several days. This is a frequently overlooked information leakage vector.

  • Direct control over the display of archived versions via noarchive
  • Algorithmic clustering revealed by the 'Similar' option without the possibility of disabling it
  • Unchanged indexing: blocking the cache does not affect SEO
  • The directive takes effect as soon as the page is next crawled
  • Free thematic diagnostics through analysis of suggested similar pages

SEO Expert opinion

Is this feature still relevant?

The removal of the 'Cached' link from Google's public interfaces in 2024 makes this statement partially obsolete. The cache: operator kept working for a while for those in the know, but Google indicated at the time that it would be retired as well.

The noarchive directive remains active and respected, even as its practical utility diminishes. For sites already using it, there’s no reason to remove it. For new projects, the decision becomes less clear. [To be verified]: Will Google officially communicate about the obsolescence of this tag?

Does the 'Similar' option truly reveal Google's clustering?

Yes, but with significant limitations. The suggestions reflect a simplified calculation, not the full clustering used for ranking. It's an indicator of thematic proximity, not an exhaustive mapping of competition.

Results can vary based on geographic context and the language of the interface. The same page may display different suggestions based on these parameters. Therefore, using this tool for competitive analysis demands stringent methodological precautions.

Can the 'Similar' option be disabled?

No. Unlike with the cache, no robots.txt directive or meta tag blocks this feature. Google decides unilaterally which pages are similar, with no opt-out.

This aligns with Google's logic: the cache is a replica of your content (thus controllable), while 'Similar' is an external analysis (therefore beyond your authority). This frustrating asymmetry reflects the engine's philosophy: you control your data, not its interpretation.

The reduced accessibility of the cache makes it difficult to verify compliance with noarchive. Where the cache: operator is still available, test by entering cache:yoururl.com in the search bar. If a cached copy appears despite the directive, a crawl issue or an HTML syntax error is the likely cause.

Practical impact and recommendations

Should you always implement noarchive?

No. The majority of sites have no reason to block the cache. It can even be counterproductive: in the event of a server outage, users lose access to your content through the archived version. It's a safety net that you destroy without benefit.

Reserve this directive for time-sensitive content (news, pricing, events) or content requiring enhanced confidentiality. For everything else, let Google do its job. Cache visibility does not influence ranking or traffic.

How do you audit the similar pages suggested by Google?

Manually inspect your strategic pages by searching for their exact URL on Google, then clicking on 'Similar'. Note the patterns: direct competitors, affiliate sites, content aggregators. If low-quality pages appear, it's a signal that your thematic positioning lacks clarity.

Compile this data into a monthly tracking file. A sudden change in suggestions may indicate an algorithmic shift or editorial drift on your part. It's a free KPI, underutilized, to measure the semantic consistency perceived by Google.
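One possible way to quantify a "sudden change" between two monthly snapshots is a set-overlap score. The sketch below uses Jaccard similarity, which is a choice made here for illustration and not something Google documents; the function name suggestion_overlap, the example domains, and the 0.5 threshold are all hypothetical.

```python
def suggestion_overlap(previous: set[str], current: set[str]) -> float:
    """Jaccard similarity between two sets of suggested domains (1.0 = identical)."""
    if not previous and not current:
        return 1.0
    return len(previous & current) / len(previous | current)


# Hypothetical data, noted by hand from the 'Similar' results for one page.
january = {"competitor-a.example", "competitor-b.example", "aggregator.example"}
february = {"competitor-a.example", "affiliate.example", "forum.example"}

overlap = suggestion_overlap(january, february)
if overlap < 0.5:  # arbitrary threshold, to be tuned against your own history
    print(f"Suggestions shifted noticeably (overlap {overlap:.2f})")
```

A low score alone does not say whether the cause is an algorithmic shift or a change in your own content; it only flags the pages worth reviewing by hand.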

What mistakes should you avoid with noarchive?

Do not confuse noarchive with noindex. The former hides the cache, while the latter removes the page from the index. Combining the two by mistake deindexes entire pages. Always check the syntax: use content="noarchive" and not content="noarchive, noindex" if you only want to block the cache.
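This syntax check lends itself to a small automated guard. The following is a rough sketch with a hypothetical function name, check_robots_content; it takes the value of the content attribute and assumes the intent is to block the cache only. It also treats the "none" value as a problem, since Google documents it as equivalent to noindex, nofollow.

```python
def check_robots_content(content: str) -> list[str]:
    """Return warnings for a robots 'content' value meant to block only the cache."""
    directives = {part.strip().lower() for part in content.split(",") if part.strip()}
    warnings = []
    if "noarchive" not in directives:
        warnings.append("noarchive is missing: the cached copy is not blocked")
    # 'none' is shorthand for 'noindex, nofollow', so it deindexes the page too.
    if "noindex" in directives or "none" in directives:
        warnings.append("noindex is present: the page will be removed from the index")
    return warnings


print(check_robots_content("noarchive"))           # no warnings expected
print(check_robots_content("noarchive, noindex"))  # flags the unwanted noindex
```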

Avoid applying noarchive via robots.txt. This file controls crawling, not cache display. The directive must be in the HTML or HTTP headers (X-Robots-Tag: noarchive). This is a common mistake on multilingual sites where tags are duplicated without adaptation.
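To compare the two valid locations for the directive, the HTTP header value and the meta tag value can be parsed the same way and checked against each other. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it assumes simple comma-separated values with no user-agent prefix (such as "googlebot: ...") in the header.

```python
from typing import Optional


def parse_directives(value: Optional[str]) -> set[str]:
    """Split a value such as 'noarchive, nofollow' into lowercase directives.

    Assumption: no user-agent prefix and no directive containing a comma.
    """
    return {part.strip().lower() for part in (value or "").split(",") if part.strip()}


def cache_blocked(header_value: Optional[str], meta_content: Optional[str]) -> bool:
    """True if noarchive is set in either the X-Robots-Tag header or the meta tag."""
    return "noarchive" in (parse_directives(header_value) | parse_directives(meta_content))


def mismatched_directives(header_value: Optional[str], meta_content: Optional[str]) -> set[str]:
    """Directives declared in only one of the two places, worth reviewing by hand."""
    return parse_directives(header_value) ^ parse_directives(meta_content)


# Hypothetical values, as they might be read from a response and its HTML.
print(cache_blocked("noarchive", None))
print(mismatched_directives("noarchive", "noarchive, nofollow"))
```

A non-empty mismatch is not necessarily an error, but it shows where the header and the tag were configured independently.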

  • Check the HTML syntax of the noarchive meta tag in the <head>
  • Test with the cache: operator, where it is still available, after allowing enough time for a recrawl
  • Document affected pages in an SEO specifications file
  • Monitor 'Similar' suggestions on a sample of key pages monthly
  • Never apply noarchive by default across the site without justification
  • Ensure that HTTP headers do not conflict with meta tags
Managing cached versions and similar pages requires sharp technical expertise, especially on large-scale sites or complex architectures. If these optimizations exceed your internal resources or require an in-depth audit, a specialized SEO agency can provide an external perspective and tailored recommendations suited to your ecosystem.

❓ Frequently Asked Questions

Does the noarchive tag affect organic rankings?
No, it has no effect on indexing, crawling, or ranking. It only controls whether the 'Cached' link is displayed in search results. Your rankings remain unchanged.
Can noarchive be applied to only certain sections of a page?
No, the directive applies to the entire page. There is no HTML tag for selectively hiding blocks of content from the cache. It's all or nothing.
Do the suggested similar pages change frequently?
Yes, they evolve with algorithm updates and changes on the web. A page's suggestions may vary from month to month depending on newly indexed content and changes to its own content.
How do you force Google to update a page's cache?
Request reindexing via Search Console (URL Inspection tool). The cache is refreshed at the next crawl, usually within 24-48 hours for active sites. There is no guaranteed timeframe, however.
Can the 'Similar' option reveal duplicate content?
Sometimes, but that is not its primary purpose. It identifies thematic proximity, not necessarily duplicate content. If exact copies of your content appear, treat it as a warning sign to investigate.