Official statement
Google does not adhere to the crawl-delay directive in the robots.txt file, unlike Bing and other search engines. This directive theoretically allows for slowing down the crawl rate, but Googlebot completely disregards it. SEOs relying on crawl-delay to manage server load or control crawl budget are missing out on more effective tools available to them.
What you need to understand
What is the crawl-delay directive and what is its purpose?
The crawl-delay directive lives in the robots.txt file and sets a minimum delay, in seconds, between two consecutive requests from a crawler. It was introduced so that site owners could regulate the server load generated by bot visits.
Specifically, a line like "Crawl-delay: 10" asks the bot to wait 10 seconds between requests. This is a protective mechanism against server overload caused by overly aggressive crawling, especially on older architectures or limited hosting.
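To make the syntax concrete, here is a minimal sketch that reads a crawl-delay value with Python's standard urllib.robotparser module. The robots.txt content and the bot name "SomeOtherBot" are made up for illustration. The parser only reports what the file declares; whether a given crawler honors the value is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay declared for the bingbot group, in seconds.
bing_delay = parser.crawl_delay("bingbot")

# No Crawl-delay line applies to this agent, so the parser returns None.
other_delay = parser.crawl_delay("SomeOtherBot")

print(bing_delay, other_delay)
```

A polite crawler would read this value and sleep for that long between requests. Nothing in the file format forces it to do so.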
Why do some search engines respect this directive while Google does not?
Bing, Yandex, and a few other engines honor the crawl-delay directive because they have historically adopted this specification. Bing officially documents it and adjusts its behavior accordingly.
Google has chosen a different path. It considers that Search Console provides more granular and effective tools for managing crawl rate. Google's John Mueller is clear on this point: crawl-delay has never been part of the robots.txt standard supported by Googlebot, and that won't change.
How does Google then manage crawl speed?
Googlebot automatically regulates its crawl rate based on server health. If the site responds slowly or returns 5xx errors, the bot slows down on its own. It is an adaptive system that monitors response times and availability.
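The general idea of health-based throttling can be sketched as a toy model. This is not Googlebot's actual algorithm, which is not public; the class name, thresholds, and multipliers below are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveThrottle:
    """Toy crawler throttle that adapts its pause to server health (hypothetical)."""

    delay: float = 1.0           # current pause between requests, in seconds
    min_delay: float = 0.5       # never go faster than this
    max_delay: float = 60.0      # never go slower than this
    slow_threshold: float = 2.0  # response time treated as "slow", in seconds

    def record(self, status: int, response_time: float) -> float:
        """Update the pause after one response and return the new value."""
        if status >= 500 or response_time > self.slow_threshold:
            # Server looks unhealthy: back off by doubling the pause.
            self.delay = min(self.delay * 2, self.max_delay)
        else:
            # Server looks healthy: speed up gradually.
            self.delay = max(self.delay * 0.9, self.min_delay)
        return self.delay


throttle = AdaptiveThrottle()
after_ok = throttle.record(200, 0.3)     # fast, healthy response
after_error = throttle.record(503, 0.1)  # server error
after_slow = throttle.record(200, 5.0)   # slow response
```

In this sketch, errors and slow responses lengthen the pause quickly while healthy responses shorten it slowly. That asymmetry is a common design choice for back-off schemes and is assumed here, not confirmed for Google.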
For sites wanting more control, Google provides a crawl rate limiting tool in Search Console. This control allows for setting an explicit limit, but Google recommends using it only in case of actual problems, not as a default setting.
- Crawl-delay has always been ignored by Googlebot; this is not a recent bug
- Bing and Yandex honor this directive, but not Google
- Google offers alternatives through Search Console and an automatic regulation system based on server health
- Placing crawl-delay in robots.txt for Google is useless and may create a false sense of security
- The directive has never been part of the official robots.txt standard supported by Google
SEO Expert opinion
Does Google's position hold up against real-world observations?
Yes, and it has been consistent for years. Empirical tests show that Googlebot has never slowed down its crawling in the presence of a crawl-delay directive, regardless of the specified value. No documented cases prove otherwise.
The real problem is that many SEOs still deploy this directive thinking they are protecting their server. As a result, they slow down crawling on Bing and Yandex without gaining anything on Google's side. It is a setting inherited from outdated practices that has no place in a modern stack.
What limitations should we point out in this statement?
Mueller does not specify how effective Googlebot's adaptive system actually is. On complex infrastructures with load balancing or CDNs, Google's view of server load may be misleading: the bot sees a quick response from the CDN while the origin server is struggling. [To be verified] How Googlebot behaves on these distributed architectures remains unconfirmed.
Another blind spot: sites with millions of crawlable pages but limited server budget. The Search Console tool does allow for limiting crawl, but it lacks granularity. It is impossible to say "crawl this section quickly, that one slowly." You can only throttle everything or nothing, which is frustrating for segmented architectures.
Are there situations where crawl-delay remains relevant?
Yes, if your Bing or Yandex traffic is significant. A site with a strong presence in Russia or Asia should calibrate crawl-delay for these engines. Ignoring this directive just because Google doesn't consider it would be a strategic mistake.
For sites undergoing migration or facing scheduled load peaks, it is better to temporarily block Googlebot via robots.txt (Disallow) or completely disable crawling in Search Console. Crawl-delay has no effect on Googlebot, so it is better to stop the crawl cleanly.
Practical impact and recommendations
What should you do if you have crawl-delay in your robots.txt?
The first step is to audit your robots.txt file and identify the presence of crawl-delay. If you are exclusively targeting Google, remove this directive immediately. It clutters the file without providing any benefit.
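As a starting point for that audit, here is a minimal sketch that lists which user-agent groups in a robots.txt file carry a Crawl-delay line. The function name and the simplified grouping logic are assumptions; it is not a full robots.txt parser.

```python
def find_crawl_delays(robots_txt: str) -> list[tuple[list[str], str]]:
    """Return (user_agents, value) for each group that sets Crawl-delay."""
    findings = []
    agents: list[str] = []
    in_rules = False  # True once the current group has at least one rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "sitemap":
            continue  # Sitemap lines do not belong to any group
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value)
        else:
            in_rules = True
            if field == "crawl-delay":
                findings.append((list(agents), value))
    return findings


# Hypothetical file content, for illustration only.
sample = """
User-agent: *
Disallow: /tmp/
Crawl-delay: 10

User-agent: bingbot
User-agent: Yandex
Crawl-delay: 2  # gentle
"""

report = find_crawl_delays(sample)
for user_agents, value in report:
    print(", ".join(user_agents), "->", value)
```

A Crawl-delay found under "User-agent: *" deserves particular attention: it does nothing for Googlebot, but engines that honor the directive will apply it.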
If you have significant traffic from Bing, Yandex, or other engines that respect this directive, calibrate the value based on your actual server capacity. A value of 1 to 5 seconds is generally sufficient to smooth out the load without sacrificing indexing speed too much.
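To see what a given value costs in crawl volume, a rough calculation helps. The sketch below assumes a bot treats the value as a strict pause between consecutive requests, as described above, and ignores response time. Real engines may interpret the directive differently, so treat the result as an upper bound under that assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_urls_per_day(crawl_delay_seconds: float) -> int:
    """Upper bound on daily fetches for one bot pausing this long between requests."""
    if crawl_delay_seconds <= 0:
        raise ValueError("crawl delay must be positive")
    return int(SECONDS_PER_DAY // crawl_delay_seconds)


for delay in (1, 5, 10):
    print(f"Crawl-delay: {delay} -> at most {max_urls_per_day(delay)} URLs/day")
# Crawl-delay: 1 -> at most 86400 URLs/day
# Crawl-delay: 5 -> at most 17280 URLs/day
# Crawl-delay: 10 -> at most 8640 URLs/day
```

On a site with several hundred thousand URLs, a 10-second delay would mean weeks for a single full pass, which is why low values are usually preferable.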
How to effectively control Googlebot's crawl?
Use the crawl rate limiting tool in Search Console (Settings > Crawl rate). Google discourages adjusting it without good reason, but if your server shows signs of overload during crawl peaks, this is the place to act.
Monitor your server logs: analyze the frequency of Googlebot hits, returned HTTP codes, and response times. If you see 503 errors or timeouts coinciding with the bot’s visits, it’s a signal to intervene. But be careful; throttling the crawl can mechanically slow down the discovery of your new pages.
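A minimal sketch of that log analysis, assuming the common "combined" access-log layout used by Apache and Nginx. The regular expression and the function name are illustrative and will need adjusting to your own log format. Response times are not part of the default combined format, so this sketch covers hit frequency and status codes only.

```python
import re
from collections import Counter

# Matches the usual "combined" access-log layout; adjust to your server's format.
LOG_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_googlebot(lines):
    """Count hits claiming to be Googlebot, per minute and per HTTP status."""
    per_minute = Counter()
    per_status = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match["agent"]:
            continue
        # Timestamp looks like 10/Oct/2023:13:55:36 +0000; keep it up to the minute.
        per_minute[match["time"][:17]] += 1
        per_status[match["status"]] += 1
    return per_minute, per_status


# Hypothetical log lines, for illustration only.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_lines = [
    f'66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /a HTTP/1.1" 200 2326 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2023:13:55:40 +0000] "GET /b HTTP/1.1" 503 0 "-" "{GOOGLEBOT_UA}"',
    '1.2.3.4 - - [10/Oct/2023:13:55:41 +0000] "GET /c HTTP/1.1" 200 10 "-" "Mozilla/5.0"',
]

hits_per_minute, hits_per_status = summarize_googlebot(sample_lines)
print(hits_per_minute.most_common(5))
print(hits_per_status)
```

The user-agent string can be spoofed, so counts based on it alone may include impostors. Spikes in 5xx codes within the busiest minutes are the pattern to look for before touching any crawl limit.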
What mistakes to avoid in crawl management?
Never rely on crawl-delay to protect your server from Google. It gives a false sense of security that can mask real infrastructure problems. If your site cannot handle Googlebot's natural rhythm, that is a symptom of a technical weakness that needs to be addressed as a priority.
Avoid throttling crawl by default "just in case". Google already optimizes its behavior, and artificially limiting the crawl rate may delay the indexing of strategic content. Only act if the logs show a proven problem, not based on intuition.
- Remove crawl-delay from robots.txt if you are only targeting Google
- Keep and calibrate crawl-delay only if Bing/Yandex represent a significant volume
- Enable server log monitoring to detect crawl-related overloads
- Use the Search Console tool to limit crawl as a last resort, not as a preventive measure
- Check that your SEO audit tools (Screaming Frog, Botify) are not applying crawl-delay without your knowledge
- Address underlying infrastructure problems instead of masking them with crawl limitations
❓ Frequently Asked Questions
Has Googlebot ever respected crawl-delay in the past?
If I remove crawl-delay, will Googlebot overload my server?
Does Bing really respect crawl-delay, or is that just theoretical?
Can crawl-delay be combined with the rate-limiting tool in Search Console?
How can I tell whether my server is really suffering from Google's crawl?
🎥 From the same video 8
Other SEO insights extracted from this same Google Search Central video · duration 55 min · published on 25/08/2015