Official statement

For an established site growing slowly, it's normal for 97% of crawl to be refresh and only 3% discovery. This is different for news sites or classifieds with lots of new content.
🎥 Source video

Extracted from a Google Search Central video

💬 EN 📅 18/02/2022 ✂ 24 statements
Other statements from this video (23)
  1. Does Google really count all the links shown in Search Console?
  2. Should you really concentrate your content on fewer pages to rank?
  3. Do Google's product review criteria apply even if your site isn't classified as a review site?
  4. Does Google's Indexing API really work for all content?
  5. Does E-A-T really influence Google rankings, or is it just a myth?
  6. Do unlinked brand mentions have an impact on your SEO?
  7. Do user comments really improve rankings in Google?
  8. Do premium SSL certificates really influence Google rankings?
  9. PDF and HTML with the same content: should you worry about cannibalization in the SERPs?
  10. Can you really control PDF indexing via HTTP headers?
  11. Should you still use rel=next and rel=prev for pagination?
  12. Can Googlebot really index your infinite-scroll content?
  13. Should you really index every page of your site?
  14. Should you worry about the referring page shown in Google Search Console?
  15. Should you really 301-redirect the old sitemap, or submit the new one directly?
  16. How does Google actually determine your site's crawl rate?
  17. Crawl speed and Core Web Vitals: why does Google draw a distinction?
  18. Why does Google slow its crawl after a hosting change?
  19. Is the crawl rate setting really a ceiling and not a target?
  20. Can CTR really penalize the rest of your site?
  21. Is internal linking really the most decisive factor for SEO?
  22. Does internal linking really take effect immediately after a recrawl?
  23. Should you worry if Google doesn't crawl all of your pages?
📅 Official statement (4 years ago)
TL;DR

Google confirms that an established site with normal organic growth typically displays 97% crawl refresh versus only 3% discovery. This distribution is not a problem — it's a sign of maturity. News sites or classifieds with massive streams of new content show inverted ratios, but this isn't the norm for most websites.

What you need to understand

What's the difference between crawl refresh and discovery crawl?

Refresh crawl is Googlebot re-fetching URLs it already knows and has indexed, to check whether their content has changed. Discovery crawl, by contrast, covers new URLs that Googlebot has never visited before.

John Mueller clarifies that a 97/3 ratio is normal for an established site that evolves gradually. Many SEOs panic at these figures in Search Console, thinking Google is ignoring their new content. This is a misreading.

Why does this ratio vary depending on site type?

A news or classifieds site generates hundreds or thousands of URLs daily. The ratio then tilts toward discovery, which is logical given the volume of new content.

A corporate site, standard e-commerce platform, or blog adding 10-20 pages per month? The bulk of crawl budget goes to refreshing existing content. The arithmetic is simple: on a site with 500 established pages, 10-20 new URLs per month amount to only about 2-4% of the total, so nearly every crawl request is a refresh.

Does this distribution mean my new content is being ignored?

No. 3% discovery on a site crawled 10,000 times per week = 300 crawls of new URLs. More than enough to index your new articles or product pages.
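The arithmetic above can be reproduced in a few lines. This is only a sketch: the function names are made up for illustration, and the request counts are whatever you read off your own crawl reports.

```python
def crawl_split(refresh_requests: int, discovery_requests: int) -> tuple[float, float]:
    """Return (refresh share, discovery share) as percentages of total requests."""
    total = refresh_requests + discovery_requests
    if total == 0:
        raise ValueError("no crawl requests recorded")
    return 100 * refresh_requests / total, 100 * discovery_requests / total


def discovery_crawls(total_requests: int, discovery_share_pct: float) -> int:
    """Estimate how many crawl requests went to new URLs."""
    return round(total_requests * discovery_share_pct / 100)


# Figures from the paragraph above: 10,000 crawls per week, 3% discovery.
print(discovery_crawls(10_000, 3))  # 300
print(crawl_split(9_700, 300))      # (97.0, 3.0)
```

Reading the ratio as an absolute number of discovery requests, rather than as a percentage, makes it easier to compare against how many pages you actually published that week.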

The problem doesn't come from the ratio but from signal quality: poor internal linking, outdated XML sitemap, orphaned content. If your new pages take 3 weeks to index, it's not because of the refresh/discovery ratio.

  • A 97/3 ratio is normal and healthy for an established site with moderate growth
  • High-volume sites (news, classifieds) show inverted ratios — but these are special cases
  • Crawl refresh allows Google to detect content updates on existing pages, which remains critical for freshness
  • A refresh-heavy ratio doesn't prevent rapid indexing if your site architecture is clean

SEO Expert opinion

Is this statement consistent with what we observe in the field?

Yes, completely. Clients who contact me in panic with a 95/5 ratio often have a 200-300 page site publishing 2 articles per week. They compare their metrics to a media outlet pushing 50 pieces daily — an absurd comparison.

The problem is that Google provides no benchmarked figures in Search Console. SEOs interpret this data without context and jump to hasty conclusions. Mueller finally provides a frame of reference.

What nuances should we apply to this rule?

An established site that suddenly adds 500 pages (redesign, new section, migration) should see its ratio temporarily shift. If it doesn't budge, that's where you have a problem — poor linking, unsubmitted sitemap, crawl budget saturated by useless facets.

Another point: Mueller talks about a "slowly growing site." What does "slowly" mean? [To verify] — no precise metric. 10 pages/month? 50? The wording remains vague.

Finally, this statement says nothing about refresh frequency. A site crawled 50 times/day for refresh is excellent. A site crawled twice per week with 97% refresh might signal Google's disinterest in your content. The ratio alone isn't enough.

In which cases should this ratio alert us?

If you launch a major content initiative (100+ new pages) and the ratio stays stuck at 97/3 after 4 weeks, dig deeper. Either your internal linking fails or your new URLs are technical (filters, pagination) and Google marks them as low-value.

Warning: A low discovery ratio combined with declining total crawl volume may signal a perceived quality issue with your site. Analyze the total crawl volume in parallel, not just the distribution.
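One way to watch the ratio and the total volume together is sketched below. The `CrawlWeek` class, the 100-page launch threshold, and the 30% volume drop are all illustrative assumptions, not values given by Google; adjust them to your own site.

```python
from dataclasses import dataclass


@dataclass
class CrawlWeek:
    refresh: int
    discovery: int

    @property
    def total(self) -> int:
        return self.refresh + self.discovery


def discovery_share(week: CrawlWeek) -> float:
    """Fraction of the week's crawl requests that hit new URLs."""
    return week.discovery / week.total if week.total else 0.0


def crawl_warnings(before: CrawlWeek, after: CrawlWeek, pages_added: int,
                   big_launch: int = 100, volume_drop: float = 0.3) -> list[str]:
    """Compare two weeks and list the warning signs described above."""
    warnings = []
    # A large content launch should push the discovery share up.
    if pages_added >= big_launch and discovery_share(after) <= discovery_share(before):
        warnings.append("discovery share did not rise after a large launch")
    # A falling total matters regardless of how the split looks.
    if after.total < before.total * (1 - volume_drop):
        warnings.append("total crawl volume dropped sharply")
    return warnings


print(crawl_warnings(CrawlWeek(9_700, 300), CrawlWeek(5_820, 180), pages_added=150))
```

The two checks are kept separate on purpose: a stable 97/3 split can hide a shrinking total, and a healthy total can hide a launch that Google is not picking up.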

Practical impact and recommendations

What should you concretely do with this information?

First, stop panicking over a 95/5 ratio if your site publishes 3 pages per week. It's normal. Focus instead on the average indexation lag of your new URLs — a far more actionable metric.

Next, verify that your new content is linked from frequently crawled pages (homepage, category hubs). An orphaned page that also has no sitemap entry or external links gives Googlebot no path to discover it, no matter its quality.
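A minimal sketch of that check, assuming you already have the HTML of your hub pages (fetched however you like) and a list of new URLs. The class and function names are hypothetical, and URLs are compared as exact strings, so trailing slashes or tracking parameters would need normalizing first.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute href targets of <a> tags in one HTML document."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the hub page URL.
                    self.links.add(urljoin(self.base_url, value))


def unlinked_urls(new_urls: list[str], hub_pages: dict[str, str]) -> list[str]:
    """Return the new URLs that no hub page links to.

    hub_pages maps each hub URL to its HTML source.
    """
    linked: set[str] = set()
    for hub_url, html in hub_pages.items():
        collector = LinkCollector(hub_url)
        collector.feed(html)
        linked |= collector.links
    return [url for url in new_urls if url not in linked]


hubs = {"https://example.com/": '<a href="/blog/new-post">New post</a>'}
print(unlinked_urls(
    ["https://example.com/blog/new-post", "https://example.com/blog/orphan"], hubs))
```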

What mistakes should you avoid after this statement?

Don't try to "force" a higher discovery ratio by publishing low-grade content just to drive volume. Google will lower its crawl budget allocation if you flood the index with weak pages.

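Also avoid disallowing established pages in robots.txt, or adding meta noindex to them, to "free up budget." Refresh lets Google catch your updates, which makes it an essential freshness signal, especially on commercial pages. Note that noindex does not stop crawling anyway: Googlebot has to fetch a page to see the tag.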
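To confirm that a robots.txt file is not accidentally blocking established pages, a rough check with Python's standard-library parser might look like this. The rules and URLs are invented for the example, and the standard-library parser may not interpret wildcard rules the way Googlebot does, so treat the result as a first pass only.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: list[str],
                 user_agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules: block internal search and cart, nothing else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /cart
"""

ESTABLISHED = [
    "https://example.com/",
    "https://example.com/products/widget",
    "https://example.com/search?q=widget",
]

print(blocked_urls(ROBOTS_TXT, ESTABLISHED))
```

Any established, commercially important URL that shows up in the output is being cut off from refresh crawl.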

How do you optimize crawl to maximize useful discovery?

Submit your new content via the Indexing API if eligible (Google documents it only for job posting and livestream pages), or at least via an XML sitemap that is updated daily. Add a "Latest articles" block to your homepage or main menu to guarantee a direct link.
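Regenerating the sitemap whenever a page is published can be as simple as the sketch below. It assumes you can pull a mapping of URL to last-modified date out of your CMS; the `build_sitemap` name and the example URLs are illustrative.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: dict[str, date]) -> bytes:
    """Build a UTF-8 XML sitemap from a mapping of URL -> last modification date."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_modified in sorted(pages.items()):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod uses the ISO date format, e.g. 2022-02-18.
        ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sitemap = build_sitemap({
    "https://example.com/blog/new-post": date(2022, 2, 18),
    "https://example.com/": date(2022, 2, 1),
})
print(sitemap.decode("utf-8"))
```

Hooking this into the publish step, rather than a manual export, is what keeps the sitemap from going stale.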

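Monitor server logs to spot URLs that are crawled repeatedly but add no value (facets, session parameters, duplicates). Handle them deliberately, for example by disallowing facet and session URLs in robots.txt and canonicalizing duplicates, to concentrate budget on strategic content.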
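A first pass over the logs could count which query parameters Googlebot keeps requesting. The sketch assumes access logs in the common "combined" format and uses invented sample lines; it also trusts the user-agent string, which can be spoofed, so a real audit would verify Googlebot by reverse DNS.

```python
import re
from collections import Counter
from urllib.parse import parse_qsl, urlsplit

# Matches the request, status, size, referrer and user-agent fields of a combined log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<target>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def parameter_crawl_counts(log_lines: list[str]) -> Counter:
    """Count Googlebot requests per query-parameter name."""
    counts: Counter = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match["agent"]:
            continue
        query = urlsplit(match["target"]).query
        for name, _value in parse_qsl(query, keep_blank_values=True):
            counts[name] += 1
    return counts


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE = [  # made-up log lines for illustration
    f'66.249.66.1 - - [18/Feb/2022:10:00:00 +0000] "GET /shoes?color=red&sessionid=abc HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [18/Feb/2022:10:00:05 +0000] "GET /shoes?color=blue HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    '203.0.113.5 - - [18/Feb/2022:10:00:09 +0000] "GET /shoes?color=red HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    f'66.249.66.1 - - [18/Feb/2022:10:00:12 +0000] "GET /blog/new-post HTTP/1.1" 200 8192 "-" "{GOOGLEBOT}"',
]

print(parameter_crawl_counts(SAMPLE).most_common())
```

Parameters that dominate the count without corresponding to pages you want indexed are the candidates for cleanup.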

  • Check your site's refresh/discovery ratio in Search Console
  • Compare this ratio to your actual publishing pace (pages/month)
  • Analyze average indexation lag for new URLs — if >7 days, investigate internal linking
  • Ensure your XML sitemap updates automatically with each new page
  • Audit server logs to spot useless crawls (facets, duplicates, parameters)
  • Deploy internal links from high-crawl pages to new content
  • Avoid blocking refresh on established pages — it's counterproductive
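
The indexation-lag item in the checklist above can be tracked with a small script. This sketch assumes you record, by whatever means you have, the publication date and the date each URL was first seen indexed; the function names are hypothetical and the 7-day threshold is the one used in the checklist.

```python
from datetime import date
from statistics import mean


def indexation_lags(published: dict[str, date],
                    first_indexed: dict[str, date]) -> dict[str, int]:
    """Days between publication and first indexation, for URLs that are indexed."""
    return {
        url: (first_indexed[url] - published_on).days
        for url, published_on in published.items()
        if url in first_indexed
    }


def needs_investigation(lags: dict[str, int], max_avg_days: float = 7.0) -> bool:
    """True when the average lag exceeds the threshold."""
    return bool(lags) and mean(lags.values()) > max_avg_days


published = {
    "/blog/a": date(2022, 2, 1),
    "/blog/b": date(2022, 2, 3),
    "/blog/c": date(2022, 2, 5),  # not indexed yet, so excluded from the average
}
first_indexed = {"/blog/a": date(2022, 2, 4), "/blog/b": date(2022, 2, 18)}

lags = indexation_lags(published, first_indexed)
print(lags, needs_investigation(lags))
```

URLs that never show up in `first_indexed` are worth listing separately, since they do not pull the average up but may be the real problem.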

A 97% refresh / 3% discovery ratio is normal for an established site with moderate publishing. Rather than trying to artificially flip this ratio, focus on crawl efficiency: solid internal linking, fresh sitemap, cleaned logs. If orchestrating this technical optimization alone feels overwhelming (log analysis, useless URL cleanup, linking architecture), partnering with a specialized SEO agency can save you months and prevent costly crawl budget mistakes.

❓ Frequently Asked Questions

Is a 99% refresh / 1% discovery ratio a cause for concern?
Not necessarily. If your site publishes 1-2 pages per month on top of 500 established pages, this ratio is consistent. Check instead that those new pages are indexed within 7 days.
How do I calculate my site's refresh/discovery ratio?
In Search Console, open the Crawl stats report and look at the "By purpose" breakdown, which splits crawl requests into "Refresh" and "Discovery". The ratio is calculated over total requests.
Should I block crawl refresh to free up discovery budget?
No, that's counterproductive. Refresh lets Google pick up your content updates and freshness signals. Blocking refresh degrades how Google perceives your site.
Should e-commerce sites have a different ratio?
It depends. A stable catalog with few new products will show a ratio close to 97/3. A site launching new products daily may see 70/30 or even 50/50. It all depends on context.
How do I get Google to crawl my new pages faster?
Internal links from the homepage or main categories, an up-to-date XML sitemap, and submission via the Indexing API if eligible. The refresh/discovery ratio will adjust naturally if the architecture is clean.