What does Google say about SEO?

Official statement

Google extracts URLs for crawling from both the initial HTML and the rendered HTML. Links that are only present in the initial HTML but absent from the rendered HTML may work, but consistency is preferred to avoid future issues.
🎥 Source video

Extracted from a Google Search Central video featuring Martin Splitt (EN · 24/03/2021 · 22:39).
TL;DR

Google crawls URLs from both the initial HTML and the rendered HTML after JavaScript execution. Links that are only visible in the raw source HTML but are dynamically removed may technically work, but create a risky inconsistency. Martin Splitt recommends avoiding this practice to prevent future issues, without specifying what those might be.

What you need to understand

What Does This Difference Between Initial HTML and Rendered HTML Really Mean?

The initial HTML refers to the raw source code returned by the server, before any JavaScript executes. This is what you see with a simple curl or in the browser's "view source".

The rendered HTML, on the other hand, corresponds to the final DOM after executing all client-side scripts. A link present in the initial HTML can disappear if a script removes it, hides it via display:none, or modifies the DOM. Google crawls both states, but that doesn't mean they carry the same weight.
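To make the two states concrete, here is a minimal TypeScript sketch that retrieves both versions of a page. It assumes Node 18+ (for the global fetch) and the puppeteer package; the URL is a placeholder.

```typescript
// compare-states.ts: retrieve the initial HTML and the rendered HTML.
// Assumes Node 18+ (global fetch) and `npm install puppeteer`.
import puppeteer from "puppeteer";

const url = "https://example.com/"; // placeholder: the page to inspect

async function main(): Promise<void> {
  // 1. Initial HTML: the raw server response, no JavaScript executed.
  //    This is the equivalent of a plain `curl`.
  const initialHtml = await (await fetch(url)).text();

  // 2. Rendered HTML: the DOM after client-side scripts have run,
  //    roughly what Googlebot sees in its rendering phase.
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle0" });
  const renderedHtml = await page.content();
  await browser.close();

  console.log("initial HTML length:", initialHtml.length);
  console.log("rendered HTML length:", renderedHtml.length);
}

main().catch(console.error);
```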

Why Does Google Extract URLs from Both Versions?

Googlebot performs a two-phase crawl: first the raw HTML (quick, low-cost crawl), then the JavaScript rendering (slower, resource-intensive crawl). Links in the initial HTML are discovered immediately, while those added by JS require additional time.

If a link is present in the initial HTML but disappears after rendering, Googlebot can technically detect it in the first phase. However, this inconsistency creates a semantic ambiguity: should the bot follow this link or not? The webmaster's intent is unclear.
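As a contrived illustration, here is the kind of client-side code that produces this inconsistency; the legacy-link class is hypothetical.

```typescript
// Hypothetical client-side script that creates the inconsistency:
// the server ships <a class="legacy-link" href="..."> in the initial
// HTML, and this code removes those links once the page has loaded.
// Googlebot's first pass sees them; the rendered DOM does not.
document.addEventListener("DOMContentLoaded", () => {
  document.querySelectorAll("a.legacy-link").forEach((link) => link.remove());
});
```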

What Are the Concrete Risks of This Inconsistency?

Martin Splitt mentions "future problems" without elaborating: a typical Google communication tactic. In practice, this could lead to wasted crawl budget on URLs you don't want indexed, or conversely, hinder the quick discovery of strategic pages.

Another risk: if Google detects suspicious patterns (dynamically hidden links, unintentional cloaking), it could trigger a manipulation signal. Nothing is confirmed, but the history of algorithm updates shows that HTML/JS inconsistencies are closely monitored.

  • Initial HTML: raw source code returned by the server before JavaScript execution
  • Rendered HTML: final DOM after executing all client-side scripts
  • Two-phase crawl: Googlebot first extracts links from the raw HTML, then those from the JS rendering with a delay
  • Inconsistency risk: wasted crawl budget, potential manipulation signals, unclear intent
  • Google recommendation: maintain strict consistency between the two versions to avoid unspecified problems

SEO Expert opinion

Is This Statement Consistent with Real-World Observations?

Yes and no. On paper, Google claims to extract URLs from both versions, and this is confirmed by the URL Inspection Tool and server log audits. But stating that "links only present in the initial HTML may work" remains deliberately vague. Work how? Are they crawled with the same priority? Do they pass PageRank? No concrete data.

In practice, links present only in the initial HTML are often crawled but may be ignored in internal-link calculations if Google detects that they disappear after rendering. [To be verified]: no official documentation confirms the relative weight of these links in the ranking algorithm.

What Use Cases Truly Cause Problems?

The classic scenario: a site with lazy-loaded navigation or a hamburger menu managed in JS that dynamically removes footer links present in the raw HTML. Another frequent case: pagination links generated server-side and then hidden by an infinite-scroll script.

These practices do not break indexing (Google eventually discovers the URLs), but they create discovery latency and a risk of misallocated crawl budget. On a site with 100,000 pages, this matters. Splitt recommends consistency without explaining the tolerance threshold: how many inconsistent links before demotion? We do not know.
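A contrived sketch of the second scenario; the nav.pagination selector and the init function are hypothetical.

```typescript
// Hypothetical infinite-scroll setup: the pagination links exist in the
// server-rendered HTML, then disappear from the rendered DOM when the
// script takes over, which is exactly the inconsistency discussed above.
function initInfiniteScroll(): void {
  const pagination = document.querySelector<HTMLElement>("nav.pagination");
  if (pagination) {
    // The links are still in the raw source, but gone after rendering.
    pagination.style.display = "none";
  }
  // ...attach a scroll listener that appends the next page's items here...
}

document.addEventListener("DOMContentLoaded", initInfiniteScroll);
```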

Should You Recode Everything in Server-Side Rendering?

No, that would be an excessive interpretation. Google does not say "don't use JavaScript for links"; it says "be consistent." If your navigation is managed in JS, make sure the final links in the rendered DOM match those in the initial HTML, or that the initial HTML only contains links you genuinely want to see crawled.

The real advice: audit your strategic pages using the Mobile-Friendly Test (which shows the rendered HTML) and compare the result with a curl of the initial HTML. If you see differences in the main navigation links or critical internal linking, correct them.

Caution: modern JS frameworks (React, Vue, Next.js) sometimes generate temporary links in the initial HTML that are replaced after hydration. Ensure the final URL matches; otherwise you create unnecessary crawl duplicates.

Practical impact and recommendations

How Can I Identify Inconsistencies Between Initial and Rendered HTML on My Site?

Start with a manual audit of your critical templates: homepage, categories, product pages. Use curl -A "Googlebot" to retrieve the raw HTML, then compare it with the rendered HTML in Chrome DevTools (Elements tab after full load).
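If you prefer to script that comparison, here is a sketch in the same spirit: it fetches the raw HTML with a Googlebot user-agent, renders the page with puppeteer, and flags links that exist only in the initial HTML. The regex-based extraction is deliberately naive; a real audit should use an HTML parser.

```typescript
// audit-links.ts: flag links present in the initial HTML but absent
// from the rendered DOM. A sketch, not a production crawler.
import puppeteer from "puppeteer";

const url = "https://example.com/"; // placeholder: template URL to audit
const GOOGLEBOT_UA =
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";

// Naive href extraction from raw HTML (good enough for a first pass).
function extractHrefs(html: string): Set<string> {
  const hrefs = new Set<string>();
  for (const match of html.matchAll(/<a\s[^>]*href="([^"]+)"/gi)) {
    hrefs.add(match[1]);
  }
  return hrefs;
}

async function main(): Promise<void> {
  // Initial HTML, fetched the way the article suggests (Googlebot UA).
  const raw = await fetch(url, { headers: { "User-Agent": GOOGLEBOT_UA } });
  const initialLinks = extractHrefs(await raw.text());

  // Rendered DOM: collect the href attributes after JavaScript ran.
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle0" });
  const renderedLinks = new Set(
    await page.$$eval("a[href]", (els) =>
      els.map((el) => el.getAttribute("href") ?? "")
    )
  );
  await browser.close();

  // Links that only exist in the initial HTML are the problem candidates.
  const initialOnly = [...initialLinks].filter((h) => !renderedLinks.has(h));
  console.log("links present in initial HTML only:", initialOnly);
}

main().catch(console.error);
```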

Next, automate with Screaming Frog by activating the "Render JavaScript" mode: the tool will crawl both versions and flag any link differences. Export the reports and look for URLs present only in "HTML" but absent from "Rendered HTML": these are your problem candidates.

What Corrections Should I Prioritize?

First rule: if a link is in the initial HTML, it must remain in the final DOM, unless you have a legitimate reason to hide it (e.g., personalized content, A/B testing). In that case, it's better not to include it in the raw HTML at all and to generate it only client-side.

Second action: harmonize your navigation menus. If your header is built in JS, ensure the same links exist in the initial HTML via a <noscript> block or server-side rendering. Pagination links should follow the same logic: either fully server-side, or fully client-side with prerendering.

What If My JS Architecture Makes Consistency Challenging?

This is where many React or Vue sites without SSR struggle. The technical solution: implement Server-Side Rendering (SSR) or Static Site Generation (SSG) through Next.js, Nuxt, or Gatsby. But this requires heavy refactoring.
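For reference, here is what consistency looks like on the SSR route, as a minimal hypothetical Next.js (pages router) page; the route, types, and data are invented for the example.

```tsx
// pages/category/[slug].tsx: hypothetical Next.js page (pages router).
// With getServerSideProps, the links below are present in the initial
// HTML and in the rendered DOM, so the two states stay consistent.
import type { GetServerSideProps } from "next";

type Props = { slug: string; productUrls: string[] };

export const getServerSideProps: GetServerSideProps<Props> = async (ctx) => {
  const slug = String(ctx.params?.slug ?? "");
  // Invented data source: replace with your own API or database call.
  const productUrls = [`/product/${slug}-1`, `/product/${slug}-2`];
  return { props: { slug, productUrls } };
};

export default function CategoryPage({ slug, productUrls }: Props) {
  return (
    <nav aria-label={`Products in ${slug}`}>
      {productUrls.map((href) => (
        <a key={href} href={href}>
          {href}
        </a>
      ))}
    </nav>
  );
}
```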

Pragmatic alternative: use prerender.io or a similar service to serve a pre-rendered HTML version to Googlebot. Less elegant, but effective if you cannot touch the application code. In any case, test with Search Console's URL Inspection Tool to ensure Googlebot sees the expected links.
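As a rough sketch of what such a setup can look like: an Express middleware that routes bot traffic to a pre-rendering endpoint. The PRERENDER_ENDPOINT variable and the user-agent test are assumptions for the example; real services document their own integration.

```typescript
// prerender-middleware.ts: sketch of user-agent based prerender routing.
// Assumes an Express app, Node 18+ (global fetch), and a pre-rendering
// service reachable at PRERENDER_ENDPOINT (hypothetical env variable).
import express from "express";

const app = express();
const BOT_UA = /googlebot|bingbot/i; // deliberately naive bot detection

app.use(async (req, res, next) => {
  const ua = req.headers["user-agent"] ?? "";
  if (!BOT_UA.test(ua)) return next(); // regular users get the normal SPA

  // Bots receive HTML pre-rendered by the external service instead.
  const target = `${process.env.PRERENDER_ENDPOINT}${req.originalUrl}`;
  const prerendered = await fetch(target);
  res.status(prerendered.status).send(await prerendered.text());
});

app.listen(3000);
```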

  • Audit critical templates with curl + Chrome DevTools to detect initial/rendered differences
  • Use Screaming Frog in "Render JavaScript" mode to automate the detection of inconsistencies
  • Harmonize navigation menus: either fully server-side or fully client-side with prerendering
  • Implement SSR/SSG if the current architecture generates too many differences
  • Test each correction with Search Console's URL Inspection Tool
  • Document intentionally hidden links (A/B tests, personalization) to avoid false positives
In summary: Google's recommendation aims to avoid ambiguities of intent. If a link is in the initial HTML, it must remain visible after rendering; otherwise, do not include it at all. These technical optimizations can quickly become complex, especially on modern JS architectures. If you lack internal resources or find the scale of corrections daunting, consulting an SEO agency specialized in crawling and indexing can save you valuable time and secure your implementation choices.

❓ Frequently Asked Questions

Does Google really follow links that are present only in the initial HTML but absent from the rendered version?
Yes, Googlebot extracts URLs from both versions. But no official data specifies whether these links pass PageRank or are crawled with the same priority as consistent links. The recommendation remains to avoid this inconsistency.
Does a link hidden with display:none after the JS loads cause a problem?
If the link is in the initial HTML and then dynamically hidden, Google technically detects it, but this creates an ambiguity of intent. Better not to include it in the raw HTML at all if you don't want it crawled.
How can I check that Googlebot sees the same links I do after JS rendering?
Use Search Console's URL Inspection Tool and click "View Crawled Page" to see the HTML as rendered by Googlebot. Compare it with a curl of the initial HTML to detect differences.
Do modern JS frameworks (React, Next.js) automatically create this type of inconsistency?
Not systematically. Next.js in SSR or SSG mode generates initial HTML consistent with the final render. By contrast, a classic React SPA without prerendering can create differences if links are generated only client-side.
Should you favor server-side rendering to avoid any risk?
It is not mandatory. What matters is consistency: if the final links in the rendered DOM match those in the initial HTML, the method does not matter. SSR simply makes this consistency easier, especially on complex sites.
