Official statement
Google confirms that an established site with normal organic growth typically displays 97% crawl refresh versus only 3% discovery. This distribution is not a problem — it's a sign of maturity. News sites or classifieds with massive streams of new content show inverted ratios, but this isn't the norm for most websites.
What you need to understand
What's the difference between crawl refresh and discovery crawl?
Crawl refresh refers to Googlebot revisiting URLs it already knows and has indexed, to check whether their content has changed. Discovery crawl, by contrast, targets new URLs that have never been visited before.
John Mueller clarifies that a 97/3 ratio is normal for an established site that evolves gradually. Many SEOs panic at these figures in Search Console, thinking Google is ignoring their new content. This is a misreading.
Why does this ratio vary depending on site type?
A news or classifieds site generates hundreds or thousands of new URLs daily. The ratio then logically tilts toward discovery, since the volume of new content explodes.
A corporate site, standard e-commerce platform, or blog adding 10-20 pages per month? The bulk of crawl budget goes to refreshing existing content. It's simple arithmetic: 500 established pages recrawled over 30 days against a handful of new URLs means lots of refresh and very little discovery.
Does this distribution mean my new content is being ignored?
No. 3% discovery on a site crawled 10,000 times per week = 300 crawls of new URLs. More than enough to index your new articles or product pages.
The problem doesn't come from the ratio but from signal quality: poor internal linking, outdated XML sitemap, orphaned content. If your new pages take 3 weeks to index, it's not because of the refresh/discovery ratio.
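To make the arithmetic concrete, here is a minimal sketch with purely hypothetical numbers (weekly crawl totals are assumed to be read manually from Search Console's Crawl stats report; they are not available via an API) that converts a refresh/discovery split into absolute discovery crawls:

```python
# Minimal sketch: turn a refresh/discovery percentage into absolute crawl counts.
# All figures below are hypothetical examples, not values from any real site.

total_crawls_per_week = 10_000   # read manually from Search Console > Settings > Crawl stats
discovery_share = 0.03           # 3% of crawls tagged "Discovery" (new URLs)
new_pages_per_week = 5           # your actual publishing pace

discovery_crawls = total_crawls_per_week * discovery_share
print(f"Discovery crawls per week: {discovery_crawls:.0f}")                            # 300
print(f"Discovery crawls per new page: {discovery_crawls / new_pages_per_week:.0f}")   # 60
```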
- A 97/3 ratio is normal and healthy for an established site with moderate growth
- High-volume sites (news, classifieds) show inverted ratios — but these are special cases
- Crawl refresh allows Google to detect content updates on existing pages, which remains critical for freshness
- A refresh-heavy ratio doesn't prevent rapid indexing if your site architecture is clean
SEO expert opinion
Is this statement consistent with what we observe in the field?
Yes, completely. Clients who contact me in panic with a 95/5 ratio often have a 200-300 page site publishing 2 articles per week. They compare their metrics to a media outlet pushing 50 pieces daily — an absurd comparison.
The problem is that Google provides no benchmarked figures in Search Console. SEOs interpret this data without context and jump to hasty conclusions. Mueller finally provides a frame of reference.
What nuances should we apply to this rule?
An established site that suddenly adds 500 pages (redesign, new section, migration) should see its ratio temporarily shift. If it doesn't budge, that's where you have a problem — poor linking, unsubmitted sitemap, crawl budget saturated by useless facets.
Another point: Mueller talks about a "slowly growing site," but gives no precise metric for what "slowly" means. 10 pages a month? 50? The wording remains vague.
Finally, this statement says nothing about refresh frequency. A site crawled 50 times/day for refresh is excellent. A site crawled twice per week with 97% refresh might signal Google's disinterest in your content. The ratio alone isn't enough.
In which cases should this ratio alert us?
If you launch a major content initiative (100+ new pages) and the ratio stays stuck at 97/3 after 4 weeks, dig deeper. Either your internal linking fails or your new URLs are technical (filters, pagination) and Google marks them as low-value.
Practical impact and recommendations
What should you concretely do with this information?
First, stop panicking over a 95/5 ratio if your site publishes 3 pages per week. It's normal. Focus instead on the average indexation lag of your new URLs — a far more actionable metric.
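If you want to track that lag systematically, a minimal sketch might look like the following. It assumes a hypothetical CSV you maintain yourself, with one row per new URL and its publish and first-indexed dates (checked, for example, with the URL Inspection tool):

```python
# Minimal sketch: average indexation lag from a hand-maintained CSV.
# Assumed (hypothetical) file "new_urls.csv" with columns: url,published,indexed
# where dates are ISO formatted (YYYY-MM-DD) and "indexed" is the date the URL
# was first seen as indexed.
import csv
from datetime import date

def average_indexation_lag(path: str) -> float:
    lags = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if not row["indexed"]:
                continue  # still not indexed: skip here, or treat as an alert
            published = date.fromisoformat(row["published"])
            indexed = date.fromisoformat(row["indexed"])
            lags.append((indexed - published).days)
    return sum(lags) / len(lags) if lags else float("nan")

lag = average_indexation_lag("new_urls.csv")
print(f"Average indexation lag: {lag:.1f} days")
if lag > 7:
    print("Lag above 7 days: investigate internal linking and sitemap freshness.")
```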
Next, verify your new content is properly linked from frequently crawled pages (homepage, category hubs). An orphaned page will never appear in discovery crawl, no matter its quality.
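A quick way to sanity-check this is to compare your sitemap against the links actually reachable from the homepage. The sketch below (stdlib only, hypothetical domain, deliberately limited to a single level of internal links) flags sitemap URLs that none of the crawled pages link to:

```python
# Minimal sketch: flag sitemap URLs not linked from the homepage or its
# first-level internal pages. Domain and sitemap path are hypothetical placeholders.
import urllib.request
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin

SITE = "https://www.example.com"

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = set()
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.add(urljoin(SITE, href).split("#")[0])

def fetch(url: str) -> str:
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

# 1. URLs declared in the sitemap (namespace-agnostic lookup of <loc> tags).
sitemap = ET.fromstring(fetch(f"{SITE}/sitemap.xml"))
declared = {loc.text.strip() for loc in sitemap.iter() if loc.tag.endswith("loc")}

# 2. URLs linked from the homepage and its first-level internal pages.
collector = LinkCollector()
collector.feed(fetch(SITE))
for link in list(collector.links):          # snapshot: first-level links only
    if link.startswith(SITE):
        try:
            collector.feed(fetch(link))
        except OSError:
            pass

# 3. Sitemap URLs that nothing links to at this (shallow) depth.
for url in sorted(declared - collector.links):
    print("Potential orphan:", url)
```

This is only a shallow check; a real crawl tool goes deeper, but an important new page that fails even this test is almost certainly under-linked.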
What mistakes should you avoid after this statement?
Don't try to "force" a higher discovery ratio by publishing low-grade content just to drive volume. Google will lower its crawl budget allocation if you flood the index with weak pages.
Also avoid blocking crawl refresh via robots.txt or meta noindex on established pages to "free up budget." Refresh lets Google catch your updates; it's an essential freshness signal, especially on commercial pages.
How do you optimize crawl to maximize useful discovery?
Submit your new content via the Indexing API (if eligible) or at least via a daily-updated XML sitemap. Add a "Latest articles" block on your homepage or main menu to guarantee a direct link.
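On the sitemap side, a minimal sketch (hypothetical page list, stdlib only) that regenerates sitemap.xml with an up-to-date lastmod for each URL could look like this; wiring it to your CMS or build pipeline is the part that varies:

```python
# Minimal sketch: regenerate sitemap.xml with fresh <lastmod> values.
# The pages list is a hypothetical stand-in for data pulled from your CMS.
import xml.etree.ElementTree as ET
from datetime import date

pages = [
    ("https://www.example.com/", date(2022, 2, 18)),
    ("https://www.example.com/blog/new-article", date.today()),  # new content
]

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=NS)
for loc, lastmod in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = loc
    ET.SubElement(url, "lastmod").text = lastmod.isoformat()

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
print("Wrote sitemap.xml with", len(pages), "URLs")
```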
Monitor server logs to spot URLs crawled in loops with no value (facets, session parameters, duplicates). Block them properly to concentrate budget on strategic content.
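For the log side, a minimal sketch that surfaces the most-crawled parameterized URLs might look like this. The log path and combined Apache/Nginx format are assumptions, and matching on the user-agent string alone does not verify Googlebot (a reverse DNS check would be needed for that):

```python
# Minimal sketch: count Googlebot hits on parameterized URLs in an access log.
# Path and log format are assumptions; user-agent matching alone is not
# Googlebot verification.
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

LOG_PATH = "/var/log/nginx/access.log"   # hypothetical path
request_re = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[^"]*"')

hits = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        if "Googlebot" not in line:
            continue
        match = request_re.search(line)
        if not match:
            continue
        url = urlsplit(match.group(1))
        if url.query:  # only parameterized URLs: facets, sessions, sorts...
            params = ",".join(sorted(parse_qs(url.query)))
            hits[f"{url.path}?{params}"] += 1

for pattern, count in hits.most_common(20):
    print(f"{count:6d}  {pattern}")
```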
- Check your site's refresh/discovery ratio in Search Console
- Compare this ratio to your actual publishing pace (pages/month)
- Analyze average indexation lag for new URLs — if >7 days, investigate internal linking
- Ensure your XML sitemap updates automatically with each new page
- Audit server logs to spot useless crawls (facets, duplicates, parameters)
- Deploy internal links from high-crawl pages to new content
- Avoid blocking refresh on established pages — it's counterproductive
❓ Frequently Asked Questions
Is a 99% refresh / 1% discovery ratio a cause for concern?
How do you calculate your site's refresh/discovery ratio?
Should you block crawl refresh to free up discovery budget?
Should e-commerce sites have a different ratio?
How can you get Google to crawl your new pages faster?
Source: Google Search Central video published on 18/02/2022.