Why does Google ignore your canonical tags when the raw HTML contradicts the rendered output?

Official statement

Having a different canonical URL in the raw HTML and in the rendered HTML creates mixed signals for Google. This may lead Google to choose a completely different canonical or to alternate between the two versions, making reports in Search Console difficult to interpret.

🎥 Source video

Extracted from a Google Search Central video

💬 EN 📅 26/04/2021 ✂ 26 statements

Official statement from April 26, 2021

⚠ A more recent statement exists on this topic: "Why does Google index rendered HTML instead of source HTML?" (Martin Splitt, July 6, 2022)

TL;DR

When a canonical URL differs between raw HTML (server-side) and rendered HTML (after JavaScript), Google receives mixed signals. As a result, the engine may ignore both versions and choose a completely different canonical, or worse, alternate between the two URLs from one crawl to the next. In concrete terms, your Search Console reports become hard to interpret and your authority consolidation efforts go up in smoke.

What you need to understand

Where does this confusion between raw HTML and rendered HTML come from? 

The raw HTML corresponds to the source code sent directly by the server when a browser (or Googlebot) makes an HTTP request. This is what you see if you view the source of a page via 'View Page Source' in Chrome.

The rendered HTML corresponds to the state of the DOM after the client-side JavaScript has executed. If your framework (React, Vue, Angular, Next.js in partial CSR mode) injects, modifies, or replaces a <link rel="canonical"> tag via JS, Google first sees one version in the raw HTML, then another once the page has been rendered.
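To make the distinction concrete, here is a minimal sketch that pulls the canonical out of two HTML snapshots of the same page. It assumes you have already saved both versions yourself (for example the output of curl and a copy of the DOM from DevTools); the `CanonicalParser` and `extract_canonicals` names and the example.com URLs are illustrative, not part of any tool mentioned here.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens; match case-insensitively.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonicals.append(attr.get("href") or "")


def extract_canonicals(html):
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals


# Hypothetical raw HTML, as sent by the server.
raw_html = (
    '<html><head><link rel="canonical" href="https://example.com/product">'
    "</head><body></body></html>"
)
# Hypothetical DOM snapshot after client-side JavaScript replaced the tag.
rendered_html = (
    '<html><head><link rel="canonical" href="https://example.com/product?variant=blue">'
    "</head><body></body></html>"
)

raw = extract_canonicals(raw_html)
rendered = extract_canonicals(rendered_html)
print("raw:", raw)
print("rendered:", rendered)
print("diverges:", raw != rendered)
```

Returning a list rather than a single value is deliberate: a page that ends up with two canonical tags after rendering is itself a mixed signal worth catching.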

Why does this divergence pose a problem for Google? 

Google first crawls the raw HTML, then queues the page for JavaScript rendering. This process is neither instantaneous nor guaranteed: some pages may wait for days or even weeks before being rendered. In the meantime, Googlebot has already extracted signals from the raw HTML — including the canonical.

When the canonical changes after rendering, Google inherits two contradictory instructions for the same URL. The engine has no way of knowing which one is 'the right one', which breaks the consolidation logic. Martin Splitt states that in this case, Google may choose a third URL as canonical or toggle between the two versions depending on crawl cycles.

What does 'toggle between the two versions' actually mean? 

This means that during a crawl, Google retains the canonical from the raw HTML, then on a subsequent crawl (after rendering), switches to the canonical from the rendered HTML. This instability fragments ranking signals: backlinks, anchors, click history, everything gets diluted across multiple URLs.

In Search Console, you observe inconsistent coverage reports: a URL marked 'Duplicate, Google chose different canonical than user', then reclassified as 'Indexed', then marked duplicate again. It's impossible to manage your indexing properly under these conditions.

Mixed signals = Google can't decide which URL to consolidate as the reference.
Canonical toggling = your GSC metrics become unusable for performance tracking.
Arbitrary choice = Google may select a URL that you've never defined as canonical, diluting your authority.
Crawl budget impact = the bot wastes time crawling and reprocessing unstable variants instead of discovering fresh content.
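One way to put a number on toggling is to log, at regular intervals, which canonical Google reports for a page and count the changes. The sketch below assumes you keep such a log yourself; the dates and URLs are made up for illustration and do not come from any real report.

```python
def count_canonical_switches(observations):
    """Count how often the reported canonical changes between consecutive checks."""
    switches = 0
    for (_, previous), (_, current) in zip(observations, observations[1:]):
        if previous != current:
            switches += 1
    return switches


# Hypothetical log: (check date, canonical reported as selected on that date).
history = [
    ("2021-05-01", "https://example.com/product"),
    ("2021-05-08", "https://example.com/product?variant=blue"),
    ("2021-05-15", "https://example.com/product"),
    ("2021-05-22", "https://example.com/product"),
]

switches = count_canonical_switches(history)
distinct = {url for _, url in history}
print(f"{switches} switches across {len(distinct)} distinct canonicals")
```

A page whose log shows more than one distinct canonical over a few weeks is a good candidate for the raw vs rendered comparison.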

SEO Expert opinion

Is this statement consistent with field observations? 

Yes, and it confirms a phenomenon that many SEO practitioners mistakenly attributed to 'bugs' from Google. In reality, it's a poorly managed front-end architecture issue. Sites migrating to modern JS frameworks (Next.js, Nuxt, SvelteKit) without strict SSR often fall into this trap.

This behavior is particularly observed on e-commerce sites where the canonical is managed via a React component that loads after the initial HTML. As a result, Google first indexes the product page with an empty or generic canonical, then switches to the correct URL after rendering — but in the meantime, backlinks have landed on the wrong variant.

What nuances should be added to this claim? 

Martin Splitt mentions that Google may choose a 'completely different' canonical, but he doesn't specify the criteria for that choice. [To be verified]: presumably Google falls back on other signals (sitemaps, the majority of internal links, external backlinks) to arbitrate, but there is no official confirmation of how they are weighted.

Another unclear point is the frequency of toggling. Does Google switch on each crawl? Only during rendering cycles? Or randomly, depending on the load of the rendering servers? Again, no public data is available. We are flying blind here, relying on empirical observation alone.

Warning: If your site uses JavaScript to dynamically modify canonicals based on user parameters (A/B testing, geolocation, personalization), you are likely in a gray area. Google can interpret these variations as mixed signals even if your intent is legitimate. Always check the final rendering via the URL Inspection tool in GSC.

In what cases does this rule not strictly apply? 

If your site is 100% static (complete SSG, without client-side JS hydration modifying meta tags), you are safe. Raw HTML = rendered HTML, so no divergence. This is the case for well-configured Gatsby, Hugo, or Jekyll sites.

Also, if you are using strict SSR (Server-Side Rendering) where the server sends the final HTML directly with the correct canonical, and the client-side JS never touches this tag, you remain safe. But as soon as a third-party library (tracking, consent, CMP) injects or modifies tags in the <head>, the risk reappears.

Practical impact and recommendations

What should be done concretely to avoid this problem? 

First, audit your raw vs rendered HTML for all your strategic pages. Compare the server source code (curl or 'View Page Source') with the DOM state after full loading (DevTools → Elements). If the canonical tags differ, you are in the red zone.

Next, prioritize server-side rendering for critical tags: canonical, hreflang, meta robots, structured data. Never allow JavaScript to modify these elements after the first paint. If your framework requires it, configure strict SSR or SSG for indexable pages.
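As a rough illustration of what "server-side for critical tags" means, the sketch below builds the canonical, meta robots, and hreflang tags as a plain string on the server, so they are already present in the raw HTML. The `render_head` function and its parameters are hypothetical; a real project would do the equivalent through its framework's SSR or SSG layer.

```python
from html import escape


def render_head(canonical, robots="index, follow", hreflang=None):
    """Build the critical <head> tags on the server so they ship in the raw HTML."""
    lines = [
        f'<link rel="canonical" href="{escape(canonical, quote=True)}">',
        f'<meta name="robots" content="{escape(robots, quote=True)}">',
    ]
    # hreflang is an optional mapping of language code to alternate URL.
    for lang, url in (hreflang or {}).items():
        lines.append(
            f'<link rel="alternate" hreflang="{escape(lang, quote=True)}" '
            f'href="{escape(url, quote=True)}">'
        )
    return "<head>\n  " + "\n  ".join(lines) + "\n</head>"


head = render_head(
    "https://example.com/product?a=1&b=2",
    hreflang={"fr": "https://example.com/fr/product"},
)
print(head)
```

Escaping the attribute values matters here: an unescaped ampersand or quote in a URL can produce a tag that parses differently than intended.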

What errors should be absolutely avoided? 

Never inject a canonical tag via a useEffect in React, a mounted() hook in Vue, or a script that runs after DOMContentLoaded. Google reads the raw HTML first: your JS may take seconds to execute, and by then the signal from the raw HTML has already been extracted.

Another classic trap: headless CMS (Contentful, Strapi, Prismic) that generate canonicals client-side via asynchronous API requests. If the API takes 500 ms to respond, your initial HTML is missing a canonical, which then appears after rendering. Google sees two incompatible states.
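The fix for the headless CMS case is a matter of ordering: wait for the CMS response on the server, then serialize the HTML. The sketch below simulates this with asyncio; `fetch_canonical_from_cms` is a stand-in with artificial latency, not the API of Contentful, Strapi, or Prismic.

```python
import asyncio


async def fetch_canonical_from_cms(slug):
    """Stand-in for a headless CMS API call (hypothetical; no real SDK is used)."""
    await asyncio.sleep(0.05)  # simulated network latency
    return f"https://example.com/{slug}"


async def render_page(slug):
    # Await the CMS response BEFORE serializing the HTML, so the canonical
    # is already present in the document the server sends.
    canonical = await fetch_canonical_from_cms(slug)
    return (
        "<html><head>"
        f'<link rel="canonical" href="{canonical}">'
        "</head><body>...</body></html>"
    )


html = asyncio.run(render_page("blue-widget"))
print(html)
```

The latency is paid once on the server instead of leaving a window in which the initial HTML has no canonical at all.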

How can I check if my site is compliant? 

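Use the URL Inspection tool in Search Console: run a live test, open 'View tested page', and compare the canonical in the 'HTML' tab (which shows the rendered HTML) with the one in your raw source (curl or 'View Page Source'). The screenshot alone will not show you the canonical. If the canonicals diverge, you have a problem. Run this check on 10-15 representative pages (home, categories, products, articles).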

Complement this with a Screaming Frog crawl in JavaScript mode: compare the 'Canonical Link Element 1' (raw HTML) and 'Rendered Canonical' (after JS) columns. Any divergence is a potential mixed signal. Prioritize fixes on pages with high organic traffic and many backlinks.
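Once you have raw and rendered canonicals side by side for a batch of pages, sorting them into categories makes the fix list easier to prioritize. This is a small sketch under the assumption that you have copied the values into (url, raw, rendered) rows; the labels and sample rows are illustrative and do not reflect any crawler's export format.

```python
def classify(raw_canonical, rendered_canonical):
    """Label one page by comparing its raw and rendered canonical values."""
    if not raw_canonical and not rendered_canonical:
        return "no canonical"
    if not raw_canonical:
        return "injected by JavaScript"
    if not rendered_canonical:
        return "removed by JavaScript"
    if raw_canonical != rendered_canonical:
        return "mismatch"
    return "consistent"


# Hypothetical rows: (page URL, canonical in raw HTML, canonical in rendered HTML).
rows = [
    ("https://example.com/", "https://example.com/", "https://example.com/"),
    ("https://example.com/p/1", "", "https://example.com/p/1"),
    ("https://example.com/p/2", "https://example.com/p", "https://example.com/p/2"),
]

for url, raw, rendered in rows:
    status = classify(raw.strip(), rendered.strip())
    if status != "consistent":
        print(f"{url}: {status}")
```

Separating "injected by JavaScript" from "mismatch" is useful because the two usually have different causes: a tag that only exists client-side versus a tag that is rewritten after load.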

Audit raw vs rendered HTML on 15 strategic pages using the GSC inspection tool
Configure strict SSR for all canonical tags, hreflang, and meta robots
Never inject a canonical tag via client-side JavaScript (useEffect, mounted, etc.)
Crawl the site with Screaming Frog in JS rendering mode and compare raw vs rendered canonicals
Verify that headless CMS send canonicals in the initial HTML, not via asynchronous API requests
Monitor GSC coverage reports to spot canonical switches between two crawls
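For the monitoring step, the Search Console URL Inspection API exposes, to the best of my knowledge, the declared and Google-selected canonicals as `userCanonical` and `googleCanonical` under `inspectionResult.indexStatusResult`. The sketch below only parses a response that you have already fetched; the sample payload is hypothetical and trimmed, and the field names should be checked against the current API reference before you rely on them.

```python
def canonical_verdict(response):
    """Compare the declared and Google-selected canonicals in one API response."""
    status = response.get("inspectionResult", {}).get("indexStatusResult", {})
    declared = status.get("userCanonical")
    selected = status.get("googleCanonical")
    # Only flag an override when both values are present and differ.
    overridden = bool(declared and selected and declared != selected)
    return {"declared": declared, "selected": selected, "overridden": overridden}


# Hypothetical, trimmed response body for illustration only.
sample = {
    "inspectionResult": {
        "indexStatusResult": {
            "userCanonical": "https://example.com/product?variant=blue",
            "googleCanonical": "https://example.com/product",
        }
    }
}

print(canonical_verdict(sample))
```

Running this periodically over the same set of strategic URLs and storing the results gives you the history needed to spot a canonical that switches between crawls.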

The divergence between raw and rendered HTML on canonical tags creates chronic instability in indexing. Google no longer knows which URL to consolidate, which fragments your ranking signals and renders your GSC reports unusable. The only viable fix: strict server-side rendering for all critical tags. If your current front-end architecture doesn't allow this natively, a technical overhaul may be necessary. These projects touch infrastructure, code, and SEO at once and are rarely trivial to manage internally; enlisting an SEO agency specialized in JavaScript SEO and SSR can speed up compliance while securing the transition.

❓ Frequently Asked Questions

Can you force Google to ignore the canonical in the rendered HTML and consider only the one in the raw HTML?

No. Google offers no setting to disable JavaScript rendering or to favor the raw HTML exclusively. The engine processes both states and tries to arbitrate. The only solution is to align the two versions.

If Google alternates between two canonicals, do I permanently lose the authority of one of the two URLs?

Not permanently, but the authority gets diluted. Backlinks pointing to the URL that is not retained during a given crawl are not consolidated toward the canonical that is active at that moment. In the long run, this fragments PageRank and weakens ranking potential.

Do modern frameworks such as Next.js 13+ (App Router) or Remix solve this problem natively?

Partially. Next.js App Router with SSR enabled does generate the HTML on the server, but if you use client components (`'use client'`) that modify the `<head>`, the risk remains. Remix enforces strict SSR by default, which limits divergences, but stay vigilant about third-party libraries.

Are hreflang and meta robots tags also affected by this mixed-signal problem?

Yes, absolutely. Any critical tag modified by JavaScript after the initial HTML creates a divergence. Google may ignore hreflang tags injected via JS, or interpret a `noindex` added after rendering as a signal that contradicts the initial indexing.

How can I tell whether Google has chosen a canonical different from the ones I defined?

In Search Console, look at the 'Google-selected canonical' field in the page indexing details of the URL Inspection tool. If it differs from your canonical tag (raw or rendered HTML), Google has arbitrated differently, often based on sitemaps, internal links, or backlinks.
