Official statement
Google classifies URLs based on the presumed frequency of their changes and adjusts its crawl accordingly. As a result, some pages deemed stable may only be visited once every six months. For an SEO, this means it is essential to actively prompt Google to recrawl content that evolves regularly, or risk having updates ignored for months.
What you need to understand
Does Google crawl all pages at the same frequency?
No. Google adjusts its crawl frequency based on what it assumes to be a URL's stability. If a page is perceived as rarely modified, it will be visited less frequently — sometimes only once every six months.
This classification relies on predictive signals: modification history, type of content, position in the site hierarchy, freshness signals. Google does not guess randomly — it infers from its previous visits and behavioral analysis of the site.
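To make the mechanism concrete, here is a deliberately simplified Python sketch of adaptive recrawl scheduling: the interval stretches when a page is found unchanged and shrinks when a change is detected. This is an illustrative toy model, not Google's actual algorithm; every name and threshold in it is an assumption.

```python
from dataclasses import dataclass

# Toy model of adaptive recrawl scheduling (illustrative only, not Google's
# actual algorithm). The interval grows when a URL looks stable and shrinks
# when a change is observed between two visits.

MIN_INTERVAL_DAYS = 1      # floor for very dynamic pages (assumed value)
MAX_INTERVAL_DAYS = 180    # ceiling, i.e. the "once every six months" case

@dataclass
class CrawlState:
    interval_days: float = 7.0   # start from a weekly visit (assumption)
    last_content_hash: str = ""

def next_interval(state: CrawlState, content_hash: str) -> CrawlState:
    """Update the recrawl interval after a visit."""
    if content_hash == state.last_content_hash:
        # Page unchanged: assume it is stable and back off (doubling here).
        interval = min(state.interval_days * 2, MAX_INTERVAL_DAYS)
    else:
        # Page changed: come back much sooner.
        interval = max(state.interval_days / 4, MIN_INTERVAL_DAYS)
    return CrawlState(interval_days=interval, last_content_hash=content_hash)

# Example: a page that never changes drifts toward the 180-day ceiling.
state = CrawlState()
for _ in range(6):
    state = next_interval(state, content_hash="same-hash")
print(state.interval_days)  # 180.0
```

The exponential back-off is why a page that stops changing can quickly fall into a semi-annual rhythm, and why a single fresh update is not always enough to pull it back on its own.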
What triggers more frequent recrawling?
URLs that change regularly, generate traffic, or belong to strategic sections (blog, news, updated product pages) are crawled more often. Google also monitors freshness signals: lastmod tag in the XML sitemap, date mentions in the content, update patterns.
Conversely, a technical page buried in the hierarchy, with no traffic or changes for months, will be relegated to the bottom of the pile. Google conserves its crawl budget by prioritizing what moves and what matters.
Why does this logic pose problems for some sites?
Because it creates a vicious cycle: a poorly crawled page does not get updated in the index, so it does not rise in search results, thus it generates no traffic, leading Google to crawl it even less.
This is particularly problematic for sites that update existing content without creating new URLs. If Google does not come back, the SEO improvements are never indexed. The result: wasted effort.
- Google classifies URLs based on their presumed modification frequency, not their actual importance
- Some pages may be crawled only once every six months if they are deemed stable
- Freshness signals (sitemap, lastmod, dated content) directly influence crawl frequency
- A poorly crawled page enters a vicious cycle: no indexing = no traffic = less crawling
- The crawl budget is a limited resource: Google optimizes its visits based on its predictions
SEO Expert opinion
Is this statement consistent with field observations?
Yes, and it's even an understatement. On sites with deep hierarchies or low-dynamic product catalogs, we regularly observe abnormally long indexing delays, sometimes several months for a simple price or description update.
John Mueller is stating the obvious here, but he deserves credit for saying it clearly: Google does not crawl everything, all the time. What is missing from this statement is a precise definition of the classification criteria. What causes a URL to switch from weekly to semi-annual crawling? That point remains to be verified, given the lack of publicly available actionable data.
What nuances should be added to this rule?
First, this logic mainly applies to sites with a tight crawl budget, which means most sites. An authoritative site with good internal linking and regular backlinks is less affected: Google allocates its crawl resources more generously there.
Second, there are levers for forcing the issue: submitting the URL via Search Console, including a recent lastmod in the XML sitemap, generating organic or referral traffic to the page. But the effect of these maneuvers is not officially documented, so they rely on field experience and empirical observation.
In which cases does this rule not apply?
For news sites or fresh content platforms (media, active blogs), Google automatically increases its crawl frequency. The signal of editorial freshness is detected and rewarded.
Similarly, sites with a high rate of technical updates (e-commerce with stock rotation, classified ad aggregators) benefit from more frequent crawling — provided that the XML sitemap is well configured and freshness signals are sent correctly. Otherwise, even a dynamic site can end up being under-crawled.
Practical impact and recommendations
What should be done concretely to speed up crawling?
First action: submit modified URLs via the URL Inspection tool in Search Console. It is the most direct way to request an immediate recrawl. Don't just wait for Google to come back on its own.
Next, optimize your XML sitemap by including only strategic URLs, with up-to-date lastmod tags. A sitemap polluted by thousands of outdated URLs dilutes the signal and slows down crawling. The fewer URLs there are, the more responsive Google is to those that remain.
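As a quick way to audit this, the sketch below fetches a sitemap and flags entries whose lastmod is missing or older than a chosen threshold. It uses only the Python standard library; the sitemap URL and the 90-day threshold are placeholder assumptions, and the sketch assumes a URL sitemap rather than a sitemap index.

```python
import urllib.request
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# Flag sitemap entries with a missing or stale <lastmod>.
# URL and threshold are examples; adapt them to your site.
SITEMAP_URL = "https://www.example.com/sitemap.xml"
STALE_AFTER = timedelta(days=90)
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

with urllib.request.urlopen(SITEMAP_URL) as resp:
    root = ET.fromstring(resp.read())

now = datetime.now(timezone.utc)
for url_node in root.findall("sm:url", NS):   # assumes a URL sitemap, not an index
    loc = url_node.findtext("sm:loc", default="", namespaces=NS)
    lastmod = url_node.findtext("sm:lastmod", default="", namespaces=NS)
    if not lastmod:
        print(f"NO LASTMOD  {loc}")
        continue
    # lastmod may be a plain date (2019-07-11) or a full W3C datetime.
    modified = datetime.fromisoformat(lastmod.replace("Z", "+00:00"))
    if modified.tzinfo is None:
        modified = modified.replace(tzinfo=timezone.utc)
    if now - modified > STALE_AFTER:
        print(f"STALE       {loc} (lastmod {lastmod})")
```

Entries flagged here are either genuinely stale content or pages whose lastmod is simply never updated; both dilute the freshness signal the sitemap is supposed to send.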
What mistakes should be avoided to prevent worsening the problem?
Do not multiply unnecessary URLs: sorting parameters, filters, session variations. Every URL consumes crawl budget. If Google spends its time on duplicates or pages without value, it crawls your strategic content less often.
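One way to quantify that waste, assuming you have access logs in the usual combined format, is to measure the share of Googlebot hits landing on parameterized URLs. A minimal sketch, where the log path is a placeholder and the user-agent match is naive (spoofed bots are not filtered out):

```python
import re

# Share of Googlebot hits spent on parameterized URLs (filters, sorting, sessions).
# Assumes an Apache/Nginx combined log format; the path is a placeholder.
LOG_PATH = "/var/log/nginx/access.log"
LINE_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*"')

total = with_params = 0
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        if "Googlebot" not in line:   # user-agent match only, not verified by DNS
            continue
        match = LINE_RE.search(line)
        if not match:
            continue
        total += 1
        if "?" in match.group("path"):
            with_params += 1

if total:
    print(f"{with_params}/{total} Googlebot hits "
          f"({100 * with_params / total:.1f}%) went to parameterized URLs")
```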
Also avoid leaving orphan pages without internal linking. A URL that is invisible in the hierarchy will only be visited if Google finds it elsewhere (sitemap, external backlink). This can take months. Always link your updated content from active pages.
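To spot orphan candidates, a minimal sketch below compares two one-URL-per-line exports: the URLs declared in your sitemap and the URLs actually reached by an internal-link crawl (for example a crawler export). The file names are placeholders.

```python
# URLs present in the sitemap but never reached by an internal-link crawl
# are orphan candidates. Both files are assumed one-URL-per-line exports.
def load_urls(path: str) -> set[str]:
    with open(path, encoding="utf-8") as f:
        return {line.strip().rstrip("/") for line in f if line.strip()}

sitemap_urls = load_urls("sitemap_urls.txt")        # placeholder file names
crawled_urls = load_urls("internal_crawl_urls.txt")

orphans = sorted(sitemap_urls - crawled_urls)
print(f"{len(orphans)} URLs in the sitemap are not linked internally:")
for url in orphans[:20]:
    print(" ", url)
```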
How can I check if my site is being crawled correctly?
Check your server logs: identify the URLs crawled by Googlebot, their crawl frequency and their response codes. A tool like Screaming Frog Log File Analyser or OnCrawl lets you cross-reference crawl data with your site hierarchy and sitemap.
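For a first look without a dedicated tool, the sketch below counts Googlebot hits per URL and keeps the date of the last visit, again assuming a combined log format and a placeholder path; a log analyser remains more practical at scale.

```python
import re
from collections import defaultdict

# Googlebot hits per URL with the date of the last visit, from a combined log.
# The log path and date layout ("10/Jul/2019:...") are assumptions to adapt.
LOG_PATH = "/var/log/nginx/access.log"
LINE_RE = re.compile(r'\[(?P<date>[^\]:]+)[^\]]*\] "(?:GET|HEAD) (?P<path>\S+) ')

hits = defaultdict(int)
last_seen = {}

with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        if "Googlebot" not in line:
            continue
        match = LINE_RE.search(line)
        if not match:
            continue
        path = match.group("path").split("?")[0]
        hits[path] += 1
        last_seen[path] = match.group("date")   # e.g. 10/Jul/2019

# Least-crawled URLs first: good candidates for internal linking or sitemap work.
for path in sorted(hits, key=hits.get)[:20]:
    print(f"{hits[path]:>5} hits  last seen {last_seen[path]}  {path}")
```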
In Search Console, monitor the Coverage report and the Crawl Stats report. If you see URLs stuck at “discovered but not crawled” for weeks, it's a sign that Google has deprioritized them. Act accordingly.
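As a programmatic complement to the report, the Search Console URL Inspection API exposes fields such as coverageState and lastCrawlTime. The sketch below assumes the google-api-python-client and google-auth packages, a service account added as a user on the verified property, and placeholder property and URL values.

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

# Check coverageState and lastCrawlTime through the URL Inspection API.
# The key file, property and URLs are placeholders; the service account
# must have been granted access to the Search Console property.
creds = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
service = build("searchconsole", "v1", credentials=creds)

SITE = "https://www.example.com/"
URLS = ["https://www.example.com/produit-a/", "https://www.example.com/blog/"]

for url in URLS:
    body = {"inspectionUrl": url, "siteUrl": SITE}
    result = service.urlInspection().index().inspect(body=body).execute()
    status = result["inspectionResult"]["indexStatusResult"]
    print(url)
    print("  coverage:", status.get("coverageState"))
    print("  last crawl:", status.get("lastCrawlTime"))
```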
- Manually submit modified URLs via the Search Console inspection tool
- Optimize the XML sitemap: only include strategic URLs, with up-to-date lastmod tags
- Strengthen internal linking to updated content to increase their visibility in the hierarchy
- Analyze server logs to identify under-crawled URLs and adjust your strategy
- Avoid multiplying unnecessary URLs (parameters, filters, duplicates) that dilute crawl budget
- Monitor the coverage report in Search Console to detect “discovered but not crawled” URLs
❓ Frequently Asked Questions
Why doesn't Google crawl all my pages every day?
How can I tell whether a URL is under-crawled?
Does the lastmod tag in the XML sitemap really have an impact?
Does manually submitting a URL in Search Console force an immediate recrawl?
Is a site with many useless URLs penalized in terms of crawl?
🎥 Source: Google Search Central video · duration 52 min · published on 11/07/2019