Official statement
Other statements from this video (12)
- 1:03 Is the first wave / second wave model of JavaScript rendering still relevant?
- 3:42 Is rendered JavaScript content really indexable by Google without friction?
- 4:46 Is dynamic rendering with expanded accordions cloaking according to Google?
- 6:56 Should you really abandon dynamic rendering in favor of server-side rendering?
- 12:05 Is content hidden behind an accordion or a tab really taken into account by Google?
- 13:07 Do JavaScript links really need to be <a> elements with an href to be crawled?
- 14:11 Do PWAs really get the same SEO treatment as classic sites?
- 17:54 Should you stop using Google Cache to diagnose your indexing problems?
- 21:07 Can Google really ignore part of your site without warning?
- 23:14 Should you really worry about a low crawl rate?
- 26:52 Why does Googlebot still crawl over HTTP/1.1 and not HTTP/2?
- 27:23 Should you really split your JavaScript bundles by site section for SEO?
Google generally ignores Cache-Control directives because too many sites over-cache their resources, preventing the bot from detecting updates. The recommended solution: abandon cache management through HTTP headers and switch to versioned URLs carrying a content hash (e.g., app.abc123.js). This is a paradigm shift that calls into question a common practice in traditional web development.
What you need to understand
Why does Google ignore Cache-Control headers?
Martin Splitt's statement shakes a long-held belief: that Cache-Control directives would suffice to control when Googlebot should re-download a resource. The issue? A majority of sites set cache durations that are too long for their assets (CSS, JS, images).
Specifically, when a developer pushes an update to a file app.js but the header says max-age=31536000 (one year), Googlebot would in theory respect the directive and keep the old version in its cache. The result: the bot crawls an outdated version of the site, misses content generated by JavaScript, or misinterprets the rendering.
Google has evidently noticed this pattern on a large scale. Rather than battling against faulty configurations, the team decided to simply ignore these headers in most cases. This is an indirect admission: we cannot trust webmasters to properly manage their caching policy.
What is hash versioning and why is it the solution?
Hash versioning (or fingerprinting) involves including a unique identifier in the filename: app.abc123.js where abc123 is a hash of the content. Each modification of the file generates a new hash, thus a new URL.
This approach solves the problem at its root. Since the URL changes with each deployment, there’s no ambiguity: Googlebot sees a new resource and downloads it. No need to check a Cache-Control header — the URL itself becomes the freshness signal.
Modern bundlers and frameworks (Webpack, Vite, Next.js, etc.) implement this mechanism by default. However, many legacy sites or custom CMS have never set it up, continuing to serve files with static names and poorly calibrated cache headers.
What is the impact on JavaScript rendering and indexing?
If Googlebot crawls an outdated version of your scripts, it may miss critical content generated client-side. A module that dynamically loads blocks of text, lazy-loading images, a navigation menu rendered in React — all of this may never be seen by the bot.
The risk is particularly high on JavaScript-heavy sites (SPAs, React/Vue/Angular applications). If the main bundle file remains cached for weeks while you’ve deployed a redesign, Googlebot indexes the old version. Content changes, performance optimizations, SEO bug fixes — all get ignored.
This also applies to Core Web Vitals. An update that improves LCP or reduces unused JS will only be measured by Google if the bot actually loads the new version. Poorly managed cache can delay the recognition of your optimizations by several weeks.
- Google ignores Cache-Control to avoid crawling outdated versions of resources
- Hash versioning (e.g., app.abc123.js) is the recommended method to force re-downloads
- Without versioning, the bot may index an outdated version of your site for weeks
- Critical impact on JavaScript-heavy sites where the content relies on up-to-date bundles
- Core Web Vitals may stagnate if Google doesn’t detect your performance improvements
SEO Expert opinion
Is this approach consistent with what we observe in the field?
Yes and no. On sites that already use automatic versioning (typically modern apps), we’ve never really had cache issues on Googlebot’s side. The problem mostly arises on legacy sites, custom CMS, or projects where the dev team isn’t trained in best practices.
Several empirical tests show that Googlebot can indeed crawl an outdated version of a JS file for several days after deployment if the URL hasn’t changed. The Mobile-Friendly Test or the URL Inspection tool sometimes display a rendering different from the live site — a classic symptom of this cache issue.
What’s less clear: Google says it generally ignores "Cache-Control", but doesn’t specify exceptions. Are there cases where Cache-Control is respected? What criteria are used? [To be verified] — the team doesn’t provide thresholds or specific rules.
Is hash versioning really the only solution?
It’s the most reliable solution, but not the only one. You can also manipulate query strings (app.js?v=123), although this is less elegant and some proxies/CDNs ignore them for caching. Another approach: force a re-crawl via the Indexing API (but limited to certain types of content).
The real issue is that many sites have no versioning system in place. Migrating to automatic hash requires a review of the build chain: integrating a modern bundler, modifying templates to inject hash URLs, configuring the CDN to serve the correct versions. On a 10-year-old legacy site, this can be a colossal task.
And then there's a blind spot: third-party resources. You don’t control the versioning of a Google Tag Manager script, an embedded video player, or a chat widget. If these resources are poorly cached, Googlebot might also crawl an outdated version — and here, you have no leverage.
What are the limits of this recommendation?
Splitt only addresses crawling, but what about client-side caching for real users? If you set max-age=0 to force Google to re-download, you degrade performance for your visitors. Hash versioning resolves this tension: a long cache for users, plus instant invalidation through the URL change.
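That policy can be expressed as a simple routing rule. The helper below is a hypothetical sketch — the hash pattern and header values are assumptions, not from the video:

```javascript
// Hypothetical helper expressing the "long cache + hashed URLs" policy.
function cacheHeaderFor(url) {
  // Fingerprinted assets never change at a given URL: cache for a year
  if (/\.[0-9a-f]{6,}\.(?:js|css)$/.test(url)) {
    return 'public, max-age=31536000, immutable';
  }
  // HTML and unversioned URLs must revalidate on every request, so
  // they always reference the latest hashed bundles
  return 'no-cache';
}

console.log(cacheHeaderFor('/assets/app.3f2a1b4c.js'));
console.log(cacheHeaderFor('/index.html'));
```

The same split applies regardless of server: aggressive caching only on URLs that are guaranteed to change when their content does.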
Another limit: this statement doesn’t mention images. We talk about JS and CSS, but what about graphic assets? Many sites serve images without versioning, using static URLs. If Google also ignores Cache-Control on these files, it could explain some delays in recognizing new images in Google Images.
Practical impact and recommendations
What should be done concretely to implement hash versioning?
The first step: audit your build chain. If you’re using Webpack, Vite, Parcel, Rollup, or Next.js, hash versioning is probably already enabled by default in production mode. Check your page source code to ensure that your asset URLs indeed contain a hash (e.g., app.3f2a1b.js).
If not, enable the option in your bundler configuration. In Webpack, it's output.filename: '[name].[contenthash].js'. In Vite, it’s native. Also, ensure your HTML (or your PHP/JSX/etc. templates) dynamically injects these hashed URLs — otherwise, you will be serving outdated paths.
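For reference, a minimal production Webpack configuration fragment might look like this (entry names and the chunk setting are illustrative):

```javascript
// webpack.config.js — fragment enabling content hashing (illustrative)
module.exports = {
  mode: 'production',
  output: {
    filename: '[name].[contenthash].js',      // e.g. main.3f2a1b4c.js
    chunkFilename: '[name].[contenthash].js', // lazy-loaded chunks too
  },
};
```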
For sites without a modern bundler (custom WordPress, proprietary CMS, static HTML), you’ll need to script hash generation. A Node.js script can calculate the MD5 of each file and rename the assets at deployment time. It’s more artisanal, but it works.
How to verify that Googlebot is crawling the correct versions?
Use the URL Inspection tool in Search Console. Request a live test, then examine the loaded resources in the "More info > Downloaded resources" tab. Compare the URLs and timestamps with what is deployed in production.
Another signal: server logs. Filter Googlebot requests and check which JS/CSS URLs are called. If you see paths without hashes or versions several weeks old, that’s a red flag. The bot may be using its internal cache despite your deployments.
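A minimal sketch of that log check, assuming access-log lines that contain the user agent and a standard `GET <path>` request (the line format and hash pattern are assumptions about your setup):

```javascript
// Sketch: from raw access-log lines, keep the JS/CSS paths requested
// by Googlebot that carry no content hash — candidates for stale crawls.
function unhashedGooglebotHits(logLines) {
  return logLines
    .filter((line) => line.includes('Googlebot'))
    .map((line) => (line.match(/GET (\S+\.(?:js|css))/) || [])[1])
    .filter((url) => url !== undefined
      && !/\.[0-9a-f]{6,}\.(?:js|css)$/.test(url));
}
```

Any path this returns is worth investigating: either it should be versioned, or Googlebot may be serving itself from its internal cache.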
Finally, test with the Mobile-Friendly Test or the Rich Results Test. These tools display the rendering as Googlebot sees it. If content generated by JavaScript is absent or outdated, it’s likely a cache issue with your bundles.
What mistakes to avoid when migrating to versioning?
A classic mistake: forgetting to update the paths in HTML. You generate app.abc123.js but your template continues to call app.js — resulting in a 404. Modern bundlers handle this automatically through HTML injection plugins, but in a custom setup, it must be done manually.
Another trap: CDN cache. If your CDN (Cloudflare, Fastly, etc.) aggressively caches the HTML pages themselves, users may receive outdated HTML referencing hashed assets that have already been removed from the server. Ensure that HTML cache is invalidated on each deployment, or use a short TTL.
Lastly, beware of Service Workers. If you’ve implemented a SW for offline mode, it needs to handle the new hashed URLs. Otherwise, returning users will be stuck on the cached old version, even if the server serves the new one.
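The cleanup side of that Service Worker update can be sketched as a pure helper; CACHE_NAME and the names below are illustrative (in practice you would derive the name from the build hash):

```javascript
// Sketch of the cache-cleanup decision for a Service Worker's
// 'activate' step: purge caches from previous deployments so returning
// users get the new hashed bundles.
const CACHE_NAME = 'assets-abc123'; // bump (or hash) per deployment

function staleCaches(existingNames, currentName) {
  // Everything except the current deployment's cache should be purged
  return existingNames.filter((name) => name !== currentName);
}

// Inside the worker, roughly:
//   self.addEventListener('activate', (event) => {
//     event.waitUntil(caches.keys().then((names) => Promise.all(
//       staleCaches(names, CACHE_NAME).map((n) => caches.delete(n)))));
//   });

console.log(staleCaches(['assets-old111', 'assets-abc123'], CACHE_NAME));
```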
- Enable hash versioning in your bundler (Webpack, Vite, Rollup, etc.)
- Ensure HTML templates dynamically inject hashed URLs
- Test with the URL Inspection tool from Search Console after each deployment
- Analyze server logs to detect if Googlebot loads outdated versions
- Invalidate CDN cache on HTML pages with each update
- Update Service Workers logic to handle new URLs
❓ Frequently Asked Questions
Does Google also ignore Cache-Control on images and CSS?
Is query-string versioning (app.js?v=123) as effective as a hash in the filename?
Should I remove my Cache-Control headers since Google ignores them?
How can I tell whether Googlebot is crawling an outdated version of my site?
Does hash versioning impact Core Web Vitals?
Other SEO insights in this series were extracted from the same Google Search Central video (duration: 34 min, published 27/05/2020).