Is it true that Google really analyzes everything in the initial HTML before rendering?


Official statement

Google analyzes the initial HTML to extract links (to add to the crawl queue), detect HTTP errors, and read meta tags (canonical, description, robots). Canonicalization begins in the initial HTML but continues after rendering.

Source: Google Search Central video (duration 46:02, English, published 25/11/2020). This statement appears at 27:28.

Official statement from November 25, 2020 (5 years ago)

A more recent statement exists on this topic: "Should You Remove Links That Are Only Present in the Initial HTML?" (Martin Splitt, March 24, 2021).

TL;DR

Google extracts links, HTTP errors, and meta tags from the initial HTML, before even executing JavaScript. Canonicalization starts at this stage but is not fixed: it continues after rendering. In practice, what you place in the static HTML counts immediately for crawling and discovery, while canonicalization remains an evolving process over which you have only partial control.

What you need to understand

What’s the difference between initial HTML and HTML after rendering?

The initial HTML refers to the raw code sent by your server, before the browser (or Googlebot) executes any JavaScript. This is what you see in the "View Source" tab of your browser.

The HTML after rendering is the final result once JavaScript has modified the DOM: adding dynamic content, injecting links, changing meta tags. Google first crawls the initial HTML, then queues the page for JavaScript rendering—a process that can take seconds, hours, or even days depending on the crawl budget and the page's priority.

Why does Google analyze the initial HTML first?

Because it's immediate and resource-efficient. Google doesn't have to mobilize Chromium to read a raw HTML file. This step lets it quickly detect HTTP errors (404, 500) and redirects, extract links to feed the crawl queue, and read meta directives (canonical, robots, description).

If Google had to wait for JavaScript rendering to discover every new link, crawling would be catastrophically slow. Thus, analyzing the initial HTML serves as a first-instance filter: fast, effective, but partial. That’s why critical links must be in the static HTML, not injected in JS afterward.
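To make the "first-instance filter" concrete, here is a minimal sketch of link discovery on raw HTML, using only Python's standard library. It illustrates the principle, not Googlebot's actual implementation. The class `StaticLinkExtractor`, the function `extract_static_links`, the sample markup, and the `example.com` URLs are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class StaticLinkExtractor(HTMLParser):
    """Collects <a href> targets from raw HTML; no JavaScript is executed."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative paths against the page URL.
            self.links.append(urljoin(self.base_url, href))


def extract_static_links(html: str, base_url: str) -> list[str]:
    parser = StaticLinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page: one link in the markup, one created only at runtime.
RAW_HTML = """
<html><body>
  <a href="/category/shoes">Shoes</a>
  <nav id="menu"></nav>
  <script>
    const link = document.createElement("a");
    link.href = "/category/bags";
    document.getElementById("menu").appendChild(link);
  </script>
</body></html>
"""

static_links = extract_static_links(RAW_HTML, "https://example.com/")
print(static_links)
```

Only the `/category/shoes` link is found. The `/category/bags` link exists only once the script has run, so a parser that reads the raw response cannot see it.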

Does canonicalization start in the initial HTML—but continue after?

Google reads your <link rel="canonical"> tag from the initial HTML and registers this directive. But it's just one signal among others: redirects, sitemaps, internal links, and even the declared canonical after JavaScript rendering can influence the final decision.

In other words, what you put in the initial HTML matters, but Google reserves the right to reevaluate after rendering. If your JavaScript modifies the canonical or adds client-side redirects, Google will consider that—but with a potential delay and no guarantee that this signal will prevail over others.

Google first crawls the initial HTML to extract links, HTTP errors, and meta tags
Links in static HTML are discovered immediately; those injected in JS may wait for days
Canonicalization starts with the canonical tag in the initial HTML, but is not fixed
Meta robots directives in the initial HTML are read before rendering, so there is no need to wait for rendering to block indexing
JavaScript rendering is a separate and slower process: don’t rely solely on it for critical signals

SEO Expert opinion

Is this statement consistent with field observations?

Yes, and it’s even a welcome confirmation. For years, we've observed that static HTML links are crawled faster than those injected via JavaScript. Tests with React or Vue.js sites consistently show a delay between initial crawl and post-render crawl—a delay that can range from a few hours to several weeks for low-priority pages.

What's more interesting is that Martin Splitt confirms that canonicalization is not binary. Many SEOs still think a canonical tag in initial HTML is definitive. However, Google reevaluates this directive after rendering and can even ignore it if other signals (redirects, internal links, sitemaps) point to a different URL. Still to verify: the exact priority order between initial HTML canonical, post-render canonical, and other signals remains unclear in this statement.

What nuances should be added to this claim?

The fact that Google reads meta tags in initial HTML doesn’t mean it systematically respects them. A noindex meta tag will generally be honored right from the initial HTML, but a meta description might be replaced by a dynamically generated snippet, even if you've set it in stone.

Another point: Google doesn’t say how long it takes for JavaScript rendering to be taken into consideration. If your canonical is in the initial HTML but your main content is injected via JS, you’re in a gray area—Google might index an empty shell while waiting to render the page, or it might wait for the complete render before indexing. Nothing is guaranteed.

In what cases can this rule cause issues?

If your site is a SPA (Single Page Application) and you're relying solely on JavaScript rendering to define your canonicals, you're taking a risk. Google may index an incomplete version of the page with a wrong canonical from the initial template, even before it has rendered the JS that injects the correct canonical.

Similarly, if you have intermittent HTTP errors (server temporarily returning a 500), Google detects them in the initial HTTP response and may decide not to queue the page for rendering, in which case you lose all the JS-injected content. That’s why a stable server is an absolute prerequisite for JavaScript-heavy sites.

If your canonical changes between initial HTML and JavaScript rendering, you’re sending conflicting signals to Google. When in doubt, Google will choose—and it won’t necessarily be your choice.
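A conflict of this kind can be detected mechanically once you have both versions of a page. The sketch below assumes you already hold the initial HTML and the rendered HTML as strings (for example, exported from a headless browser). The names `find_canonical` and `canonical_status`, the status labels, and the sample URLs are hypothetical. The code only flags a mismatch; it says nothing about which URL Google will finally pick.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalParser(HTMLParser):
    """Records the href of every <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.canonicals: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        href = (attr.get("href") or "").strip()
        if "canonical" in rel_tokens and href:
            self.canonicals.append(href)


def find_canonical(html: str) -> Optional[str]:
    parser = CanonicalParser()
    parser.feed(html)
    # Assumption: only the first declared canonical is considered here.
    return parser.canonicals[0] if parser.canonicals else None


def canonical_status(initial_html: str, rendered_html: str) -> str:
    before = find_canonical(initial_html)
    after = find_canonical(rendered_html)
    if before is None and after is None:
        return "missing"
    if before is None:
        return "js-only"  # declared only after JavaScript ran
    if after is None:
        return "removed-after-render"
    return "consistent" if before == after else "conflict"


# Hypothetical SPA: the template ships a generic canonical, JS swaps it.
initial = '<head><link rel="canonical" href="https://example.com/template"></head>'
rendered = '<head><link rel="canonical" href="https://example.com/product/42"></head>'
print(canonical_status(initial, rendered))
```

For this sample the function reports a conflict, which is exactly the SPA scenario described above.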

Practical impact and recommendations

What practical steps should you take to ensure Google reads your initial HTML correctly?

First priority: place your critical links in static HTML. Pagination, main navigation, links to strategic pages—everything that needs to be crawled quickly should not rely on JavaScript. Use a tool like Screaming Frog in "Text Only" mode to check what Google sees without rendering.

Next, ensure your canonical tags, meta robots, and meta descriptions are present from the initial HTML. If you’re using a JavaScript framework (Next.js, Nuxt, Gatsby), set up SSR (Server-Side Rendering) or SSG (Static Site Generation) so these tags are in the HTML returned by the server, not injected client-side.
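Whatever the framework, the goal of SSR or SSG is that these tags are already in the server's response. The following framework-agnostic sketch shows that idea in Python; it is not how Next.js, Nuxt, or Gatsby work internally. The function `render_head` and its parameters are hypothetical.

```python
from html import escape


def render_head(title: str, canonical_url: str, description: str,
                robots: str = "index, follow") -> str:
    """Build the <head> on the server so crawl signals ship in the initial HTML."""
    return "\n".join([
        "<head>",
        f"  <title>{escape(title)}</title>",
        # escape() also encodes quotes, keeping attribute values well-formed.
        f'  <link rel="canonical" href="{escape(canonical_url)}">',
        f'  <meta name="robots" content="{escape(robots)}">',
        f'  <meta name="description" content="{escape(description)}">',
        "</head>",
    ])


head = render_head(
    title="Running shoes",
    canonical_url="https://example.com/shoes",
    description='Shoes & boots for "serious" runners',
)
print(head)
```

Because the string is produced before the response is sent, a raw HTTP request already contains the canonical, robots, and description tags, with no dependency on client-side JavaScript.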

What mistakes should you absolutely avoid?

Don’t rely on JavaScript to block indexing. If you want a noindex, put it in the initial HTML—ideally via an HTTP header X-Robots-Tag: noindex, which is read even earlier than the HTML. A noindex injected in JS may never be seen if Google doesn’t render the page.
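As an illustration, this sketch checks whether a noindex is visible without rendering, looking first at the `X-Robots-Tag` header and then at meta tags in the raw HTML. It makes simplifying assumptions: user-agent-prefixed header values such as `googlebot: noindex` are not handled, and only `name="robots"` and `name="googlebot"` meta tags are read. The names `is_noindex_without_rendering` and `parse_directives` are hypothetical.

```python
from html.parser import HTMLParser


def parse_directives(value: str) -> set[str]:
    """Split a comma-separated directive list into lowercase tokens."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


class RobotsMetaParser(HTMLParser):
    """Collects directives from robots meta tags present in raw HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            self.directives |= parse_directives(attr.get("content") or "")


def is_noindex_without_rendering(headers: dict[str, str], html: str) -> bool:
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    directives = parse_directives(normalized.get("x-robots-tag", ""))
    parser = RobotsMetaParser()
    parser.feed(html)
    directives |= parser.directives
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in directives or "none" in directives


# A noindex added at runtime is invisible to this first pass.
JS_ONLY = """<html><head><script>
  const m = document.createElement("meta");
  m.name = "robots"; m.content = "noindex";
  document.head.appendChild(m);
</script></head></html>"""

print(is_noindex_without_rendering({"X-Robots-Tag": "noindex"}, "<html></html>"))
print(is_noindex_without_rendering({}, JS_ONLY))
```

The header case is detected and the JS-only case is not, which mirrors the risk described above.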

Another classic error: defining a different canonical between initial HTML and JavaScript rendering. You might think that rendering takes precedence, but Google could well retain the first canonical it read or choose a third URL if the signals are too contradictory. Stay consistent.

How can you check if your configuration is correct?

Use the URL Inspection tool in Google Search Console and compare the raw HTML ("More info" tab > "HTML returned") with the rendered HTML ("Rendered page" tab). If your canonical tags, meta robots, or critical links differ between the two versions, you have a problem.

Also test with curl or a tool like Postman: make a raw HTTP request to your page and check that essential tags are present. If they only appear in the browser, it means JavaScript injects them—and Google will see them later, if it ever sees them.
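The same check can be scripted. This sketch uses Python's standard library to make a raw request and list which signals are absent from the unrendered response. The helper names `fetch_initial` and `audit_initial_html`, the User-Agent string, and the sample markup are hypothetical. Note that `urlopen` follows redirects, so the status returned is that of the final response.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser


class InitialHtmlAudit(HTMLParser):
    """Flags which crawl signals are present in unrendered HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.found = {"canonical": False, "meta robots": False,
                      "meta description": False}
        self.link_count = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            self.found["canonical"] = True
        elif tag == "meta":
            name = (attr.get("name") or "").lower()
            if name == "robots":
                self.found["meta robots"] = True
            elif name == "description":
                self.found["meta description"] = True
        elif tag == "a" and attr.get("href"):
            self.link_count += 1


def audit_initial_html(html: str) -> dict:
    parser = InitialHtmlAudit()
    parser.feed(html)
    missing = [name for name, present in parser.found.items() if not present]
    return {"missing": missing, "links": parser.link_count}


def fetch_initial(url: str, timeout: float = 10.0) -> tuple[int, str]:
    """Return (status, body) of the raw response; no JavaScript is executed."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "initial-html-audit/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status, raw = response.status, response.read()
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses raise, but still carry a status code and body.
        status, raw = error.code, error.read()
    # Assumption: the page is UTF-8; undecodable bytes are replaced.
    return status, raw.decode("utf-8", errors="replace")


# Offline example with hypothetical SPA markup. For a live page, call:
#   status, body = fetch_initial("https://example.com/")
SAMPLE = ('<html><head><link rel="canonical" href="https://example.com/">'
          '</head><body><a href="/a">A</a><div id="app"></div></body></html>')
report = audit_initial_html(SAMPLE)
print(report)
```

A missing meta robots tag is not an error in itself, since the absence of a directive leaves the default behavior in place. The report is mainly useful for spotting tags you expected to find in the raw response.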

Place all critical navigation links in initial HTML, not in JavaScript
Define canonical, meta robots, and meta description as early as possible in static HTML (SSR/SSG)
Check with Screaming Frog in "Text Only" mode what Google sees without rendering
Use the URL Inspection tool from Search Console to compare initial HTML and rendered HTML
Avoid conflicting canonicals between initial HTML and JavaScript
Prefer HTTP headers X-Robots-Tag for critical directives (noindex, nofollow)

In summary: what you put in the initial HTML is read immediately by Google, while JavaScript rendering may take days. If a signal is critical (canonical, noindex, links to new pages), it must be in static HTML. JavaScript can complete, but not replace. These technical optimizations—SSR, SSG, managing canonicals, crawl architecture—often require sharp expertise and iterative testing. If you lack the internal resources to finely audit your initial HTML and validate that Google reads what you expect, working with a specialized SEO agency can save you months of trial and error and costly indexing mistakes.

❓ Frequently Asked Questions

Does Google read the canonical tag in the initial HTML or after JavaScript rendering?

Google reads the canonical from the initial HTML, but keeps evaluating after rendering. If the canonical changes between the two versions, Google may choose either one depending on the other signals (redirects, internal links, sitemaps).

Are links injected via JavaScript discovered as quickly as those in static HTML?

No. Links in the initial HTML are added to the crawl queue immediately. Links injected via JavaScript are only discovered after rendering, which can take hours or even days depending on crawl budget.

Is a meta robots noindex added via JavaScript taken into account by Google?

Yes, but with a delay. If Google renders the page, it will see the noindex. But if it doesn't render the page, or indexes it before rendering, the noindex will be ignored. It is better to place it in the initial HTML or in an HTTP header.

How can you tell whether Google rendered your JavaScript page or stopped at the initial HTML?

Use the "URL Inspection" tool in Google Search Console. Compare the raw HTML ("HTML returned" tab) with the rendered HTML ("Rendered page" tab). If the two are identical, your page does not depend on JavaScript, or Google has not rendered it yet.

Can Google detect a 500 error in the initial HTML even if the content then loads via JavaScript?

Yes. If your server returns a 500 status code in the initial HTTP response, Google may treat the page as an error and not queue it for rendering. The JavaScript content will never be seen.
