Official statement
Google does not simply rely on the initial HTML to decide which version of a page to index. The engine generates hashes of the HTML before and after JavaScript rendering, then compares them. If these hashes differ, the signals from the rendering take precedence in the canonicalization process. In practical terms, your canonical tags injected via JS can therefore override those in the raw HTML.
What you need to understand
What is a content hash and why does Google use it?
A content hash is a compact digital fingerprint computed from the HTML code of a page: identical input always produces the same hash, and any change to the input almost certainly produces a different one. Google calculates this signature on the initial HTML (the one served by the server) and on the rendered HTML (after client-side JavaScript execution). If the two hashes are identical, the engine considers that the rendering does not add anything new.
But as soon as the hashes differ, Google knows that JavaScript has altered the DOM. At this point, the engine uses the signals from the rendered HTML to decide which URL to canonicalize. This statement confirms that rendering is not just a cosmetic step: it acts as a referee in deduplication.
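To make the comparison concrete, here is a minimal sketch of fingerprinting two versions of a page. Google has not published which hash function or normalization it uses, so SHA-256 and the whitespace collapsing below are assumptions chosen purely for illustration, and the function names are hypothetical.

```python
import hashlib
import re


def content_hash(html: str) -> str:
    """Return a SHA-256 hex digest of the HTML after collapsing whitespace.

    Assumption: SHA-256 and whitespace normalization are illustrative only;
    Google's actual fingerprinting method is not public.
    """
    normalized = re.sub(r"\s+", " ", html).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def rendering_changed(initial_html: str, rendered_html: str) -> bool:
    """True when the two fingerprints differ, i.e. rendering altered the markup."""
    return content_hash(initial_html) != content_hash(rendered_html)


# Skeleton HTML as served, then the same page after client-side rendering.
initial = "<html><head></head><body><div id='app'></div></body></html>"
rendered = (
    "<html><head></head><body><div id='app'>"
    "<h1>Blue sneakers</h1></div></body></html>"
)

print(rendering_changed(initial, initial))   # False
print(rendering_changed(initial, rendered))  # True
```

The point of the sketch is only that a single injected element is enough to change the fingerprint, which is the condition under which the rendered signals take over.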
Why does the distinction between initial HTML and rendered HTML change everything?
For years, SEO practitioners have been advised to place canonical tags in the initial HTML to avoid relying on JS execution. This advice remains valid for performance, but this statement seriously nuances the picture. If your framework (React, Vue, Angular) injects or modifies a canonical tag after rendering, Google may very well take it into account.
Let's be honest: this flexibility opens the door to mistakes. A canonical tag present in the initial HTML can be overwritten by JS, and if Google renders the page, it is this second tag that prevails. The result: canonicalized URLs that go against your intentions.
When does rendering really influence canonicalization?
Single Page Applications (SPAs) and headless sites are the most affected. These architectures often serve a skeleton HTML, then build the entire content on the client side. If your textual content, meta tags, or canonicals only exist after rendering, Google has no choice but to wait for JS execution to calculate the final hash.
Multi-variant e-commerce sites are also affected. Imagine a product page with color variants managed in JS: if each color modifies the URL and the visible content, the hashes will diverge. Google will then have to decide which version to index based on the rendering, not on the raw HTML which remains identical for all variants.
- Content hashes allow Google to detect if the JS rendering has substantially altered the DOM.
- When the hashes differ, the signals from the rendered HTML (canonical, hreflang, structured data) take precedence over those from the initial HTML.
- Modern JS frameworks (React, Vue, Next.js) can inject or modify tags after rendering, directly influencing canonicalization.
- SPA and headless sites are particularly exposed since their content often only exists after JS execution.
- A canonical tag in the initial HTML can be overwritten by a tag injected via JS if Google renders the page.
SEO Expert opinion
Is this statement consistent with real-world observations?
Yes and no. It has long been observed that Google indexes content generated by JavaScript, so the idea that it compares initial and rendered HTML is not a revelation. However, the official confirmation of the hashing mechanism brings welcome clarity. Until now, it was assumed that Google could 'see' the rendered output, without knowing exactly how it arbitrated between the two versions.
On the other hand, Martin Splitt remains vague on timing. How long does Google wait before considering rendering as stable? What crawl budget is allocated to rendering versus initial HTML? These questions remain unanswered. [To verify]: it is impossible to know if Google systematically recalculates hashes with every crawl or if it caches certain fingerprints to save resources.
What risks does this hashing logic pose for JS-heavy sites?
The main danger is the inconsistency between intentions and reality. You think you've canonicalized a URL in the initial HTML, but a third-party script (misconfigured tag manager, A/B test running in the background) alters the DOM afterwards. Google calculates a new hash, and boom: your canonical changes without your consent.
Another trap: sites that load different content based on geolocation or device. If the initial HTML is identical but the JS injects device-specific content, Google will detect a hash divergence. It may then canonicalize the mobile version when you intended to prioritize the desktop, or vice versa. This is particularly insidious on AMP sites where rendering can vary greatly.
Should we give up the initial HTML in favor of rendering for canonicalization?
No, and in fact, it’s the opposite. This statement does not say 'trust the JS'; it says 'Google considers rendering when necessary'. The best practice remains to serve all critical signals in the initial HTML: canonical, hreflang, structured data, textual content. This way, there’s no latency related to rendering, no risk of JS failure, and no dependence on the capacity of Google's rendering queue.
But, and here's where it gets tricky, if your architecture does not allow it (headless commerce, pure SPA), this statement at least gives you the assurance that Google can read your post-render signals. It’s a safety net, not an excuse to neglect the initial HTML. [To verify]: no public data confirms that Google systematically renders every page it crawls, especially on large sites with a limited crawl budget.
Practical impact and recommendations
What should you prioritize checking on your own site?
Start by auditing the canonical tags present in the initial HTML versus those present after rendering. Use Chrome DevTools or Screaming Frog in JavaScript mode to compare. If you notice divergences, identify which script is causing them (often a tag manager, a JS framework, or a poorly configured WordPress plugin). Correct at the source: the canonical should be consistent on both sides.
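If you prefer to script this check, the sketch below compares the canonical tags found in two HTML strings using only the Python standard library. It assumes you have already saved the initial HTML (for example from "View source") and the rendered HTML (for example the serialized DOM copied from DevTools); the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, or be valueless (None).
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def extract_canonicals(html):
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals


def compare_canonicals(initial_html, rendered_html):
    """Report the canonicals on each side and whether they match."""
    initial = extract_canonicals(initial_html)
    rendered = extract_canonicals(rendered_html)
    return {"initial": initial, "rendered": rendered, "consistent": initial == rendered}


# Hypothetical example: a script rewrote the canonical after load.
initial_html = '<head><link rel="canonical" href="https://example.com/product"></head>'
rendered_html = '<head><link rel="canonical" href="https://example.com/product?color=blue"></head>'

report = compare_canonicals(initial_html, rendered_html)
print(report["consistent"])  # False
```

A page with several canonical tags on either side also shows up here, since the lists are compared as a whole.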
Next, examine high-traffic or strategic pages (key product sheets, SEO landing pages). Inspect them in Search Console with the 'URL Inspection' tool: Google shows you the version it indexed. If the displayed content differs from the initial HTML, it means rendering has taken precedence. Then check that the signals from the rendering are indeed what you want to convey.
How to prevent JavaScript from sabotaging canonicalization?
Golden rule: never inject or modify a canonical tag in JavaScript unless absolutely necessary. If your site is an SPA and you have no choice, ensure that server-side rendering (SSR) or static pre-rendering generates the tags so that they are already present in the HTML sent to the client. Frameworks such as Next.js and Nuxt.js support this natively, so take advantage of it.
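The principle is the same whatever the stack: compute the canonical on the server and write it into the head before the response leaves. Here is a minimal, framework-agnostic sketch of that idea. The origin `https://www.example.com`, the function names, and the choice to strip every query parameter and fragment are assumptions for illustration; a real site may need to keep some parameters.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

SITE_ORIGIN = "https://www.example.com"  # hypothetical origin


def canonical_url(request_path):
    """Build an absolute canonical URL from the requested path.

    Assumption: query strings and fragments are dropped entirely.
    """
    path = urlsplit(request_path).path or "/"
    origin = urlsplit(SITE_ORIGIN)
    return urlunsplit((origin.scheme, origin.netloc, path, "", ""))


def render_head(request_path, title):
    """Return a <head> that already contains the canonical tag."""
    href = escape(canonical_url(request_path), quote=True)
    return (
        f"<head><title>{escape(title)}</title>"
        f'<link rel="canonical" href="{href}"></head>'
    )


print(canonical_url("/shoes/sneaker?color=blue#reviews"))
# https://www.example.com/shoes/sneaker
```

Because the tag is in the served HTML, client-side code has no reason to touch it, and the initial and rendered canonicals stay identical.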
For WordPress sites or traditional CMSs, disable plugins that manipulate the DOM afterwards to 'optimize' the canonicals. Some automated SEO tools add or modify these tags via JS, thinking they are doing the right thing, while they create a hash divergence. Always prefer a server-side modification (theme files, PHP hooks).
Which tools can you use to monitor divergences between initial and rendered HTML?
Screaming Frog with JavaScript rendering enabled allows you to compare the two states. Run one crawl without JS, then a second one with JS, and export the canonicals from both. Any discrepancy is a red alert. OnCrawl and Botify offer similar features, with visual dashboards that make divergences easier to spot.
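Once you have the two exports, the comparison itself is easy to automate. The sketch below diffs the canonicals of two CSV exports. The column names `Address` and `Canonical Link Element 1` are assumptions about the export format, so adjust them to match your crawler's actual headers; the sample data is invented.

```python
import csv
import io


def load_canonicals(csv_text, url_col="Address", canonical_col="Canonical Link Element 1"):
    """Map each crawled URL to its canonical (column names are assumptions)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[url_col]: row[canonical_col].strip() for row in reader}


def find_discrepancies(no_js, with_js):
    """Return (url, canonical_without_js, canonical_with_js) for every mismatch."""
    issues = []
    for url in sorted(set(no_js) | set(with_js)):
        before = no_js.get(url, "")
        after = with_js.get(url, "")
        if before != after:
            issues.append((url, before, after))
    return issues


# Invented sample exports: one crawl without JS, one with JS rendering.
no_js_export = (
    "Address,Canonical Link Element 1\n"
    "https://example.com/a,https://example.com/a\n"
    "https://example.com/b,https://example.com/b\n"
)
with_js_export = (
    "Address,Canonical Link Element 1\n"
    "https://example.com/a,https://example.com/a\n"
    "https://example.com/b,https://example.com/b?variant=2\n"
)

issues = find_discrepancies(load_canonicals(no_js_export), load_canonicals(with_js_export))
for url, before, after in issues:
    print(f"{url}: '{before}' -> '{after}'")
```

With real files, read each export with `open(path, newline="", encoding="utf-8").read()` and pass the text to `load_canonicals`. URLs present in only one crawl are reported with an empty canonical on the other side.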
For ongoing monitoring, Google Search Console remains your best ally. The 'Coverage' report (since renamed 'Page indexing') and the 'URL Inspection' tool show you what Googlebot actually saw. If strategic pages are excluded or indexed with an unexpected canonical, it’s often an indication that rendering has taken precedence. Cross-reference this data with server logs (crawl frequency, Googlebot user-agent) to get a complete picture.
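For the server-log side of that cross-check, a short script can count how often each URL is requested by a client announcing itself as Googlebot. This sketch assumes logs in the common "combined" format used by Apache and Nginx; the regular expression, function name, and sample lines are illustrative. Keep in mind that the user-agent string can be spoofed, so this is a first filter and not proof of a genuine Googlebot visit.

```python
import re
from collections import Counter

# Assumption: logs follow the "combined" access log format.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per path where the user-agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits


# Invented sample log lines.
sample_log = [
    '66.249.66.1 - - [25/Nov/2020:10:00:00 +0000] "GET /product/sneaker HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [25/Nov/2020:10:05:00 +0000] "GET /product/sneaker HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [25/Nov/2020:10:06:00 +0000] "GET /product/sneaker HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

hits = googlebot_hits(sample_log)
print(hits["/product/sneaker"])  # 2
```

Comparing these counts with the pages flagged in Search Console helps you see whether the affected URLs are crawled often enough for a fix to be picked up quickly.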
- Compare canonical tags in the initial HTML and after rendering with Chrome DevTools or Screaming Frog
- Inspect strategic pages in Search Console ('URL Inspection' tool) to verify the version indexed by Google
- Eliminate third-party scripts (tag managers, plugins) that modify the DOM and can cause hash divergences
- Prioritize server-side rendering (SSR) or static pre-rendering for SPAs to serve critical signals in the initial HTML
- Set up ongoing monitoring (Search Console + regular crawls) to detect canonicalization anomalies
- Document the technical architecture (JS frameworks, rendering method) to anticipate impacts on canonicalization
❓ Frequently Asked Questions
Does Google recalculate the hashes on every crawl, or does it cache certain fingerprints?
Will a canonical tag added in JavaScript after the initial load be taken into account by Google?
If the initial HTML and the rendered HTML have the same hash, does Google ignore the rendering completely?
Are hreflang and structured data injected via JavaScript affected by this hash mechanism?
How can you tell whether Google indexed the initial or the rendered HTML version of your page?
🎥 From the same video 28
Other SEO insights extracted from this same Google Search Central video · duration 46 min · published on 25/11/2020