
Official statement

Googlebot follows a series of actions, starting with fetching the raw HTML to detect the necessary CSS, JavaScript, and images. Links in the initial HTML are also extracted for crawling, even if rendering occurs later.
🎥 Source video

Extracted from a Google Search Central video (statement at 9:22)

⏱ 16:39 💬 EN 📅 06/06/2019 ✂ 6 statements
Watch on YouTube (9:22) →
Other statements from this video (5)
  1. 3:14 Does Google really index JavaScript as well as classic HTML?
  2. 4:13 Are SPAs with hash URLs doomed by Google?
  3. 7:16 Do AJAX calls really consume your crawl budget?
  4. 10:55 Does pre-rendering really improve crawling and user experience?
  5. 14:59 Are Lighthouse and PageSpeed Insights really enough to optimize performance for SEO?
📅 Official statement from 06/06/2019 (6 years ago)
TL;DR

Google claims that Googlebot extracts links present in the initial HTML before even rendering JavaScript. In practical terms, hardcoded URLs in your source HTML are discovered instantly, while those injected by JS will have to wait in the rendering queue. For an SEO, this means that a critical link generated solely via JavaScript will always experience a delay in discovery, even if Google eventually renders it.

What you need to understand

What does this crawl/render sequence really mean?

Here, Martin Splitt describes Googlebot's two-step pipeline: first the retrieval of the raw HTML, then deferred JavaScript rendering. The architecture is not new, but Google highlights a crucial detail: links present in the initial HTML are extracted immediately, even before the rendering engine processes a single script.

This means that a hardcoded link in your HTML source (<a href="...">) enters the crawl queue without delay. Conversely, a link generated by React, Vue, or any client-side JS framework will have to wait for Googlebot to allocate rendering resources to that URL — which can take anywhere from a few hours to several days depending on the crawl budget and perceived priority of the page.
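
To make the contrast concrete, here is a minimal React/TSX sketch of the two cases — the component names, URLs, and the /api/nav endpoint are illustrative, not taken from the video:

```tsx
import React, { useEffect, useState } from "react";

// Case 1 — discovered on Googlebot's first pass: the <a href> ships in the
// raw HTML (assuming this component is rendered on the server).
export function StaticNav() {
  return <a href="/category/shoes">Shoes</a>;
}

// Case 2 — discovered only after deferred rendering: the links exist only
// once client-side JavaScript has fetched and injected them.
export function ClientOnlyNav() {
  const [links, setLinks] = useState<string[]>([]);
  useEffect(() => {
    fetch("/api/nav") // hypothetical endpoint returning a JSON array of URLs
      .then((r) => r.json())
      .then(setLinks);
  }, []);
  return (
    <nav>
      {links.map((href) => (
        <a key={href} href={href}>
          {href}
        </a>
      ))}
    </nav>
  );
}
```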

Why does Google separate crawl and rendering?

JavaScript rendering is resource-intensive. Executing thousands of scripts per second for every crawled URL would multiply CPU and memory requirements by a prohibitive factor. Google therefore optimizes its infrastructure by separating fast collection (raw HTML, resource detection) from heavy processing (JS execution, final DOM construction).

This architecture explains why a site built entirely in JS without SSR (Server-Side Rendering) or pre-rendering often suffers from an indexing delay: Googlebot discovers the page, queues it, and has to come back later with a rendering budget to extract the actual content. Meanwhile, competitors serving static HTML are already indexed.

What resources does Googlebot detect in the initial HTML?

Martin Splitt mentions CSS, JavaScript, and images. Googlebot parses the raw HTML to identify <link>, <script>, and <img> tags and their href and src attributes. These resources are added to the download queue for the later full rendering.

But be careful: an image or a script dynamically loaded via JS (document.createElement('img')) will only be detected after rendering. If your LCP (Largest Contentful Paint) depends on a resource absent from the initial HTML, Googlebot will not see it immediately — and your Core Web Vitals will also suffer on the user side.
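
A minimal sketch of the risky pattern (paths are hypothetical), with the crawl-safe alternative noted in comments:

```ts
// Risky pattern: an LCP image injected client-side is absent from the raw
// HTML, so Googlebot only detects it after deferred rendering.
const hero = document.createElement("img");
hero.src = "/images/hero.jpg";
document.querySelector("main")?.appendChild(hero);

// Crawl-safe alternatives: ship the tag in the initial HTML
//   <img src="/images/hero.jpg" alt="...">
// or declare it for early fetching:
//   <link rel="preload" as="image" href="/images/hero.jpg">
```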

  • Raw HTML links are crawled instantly, before any JavaScript rendering.
  • Critical resources (CSS, JS, images) must be present in the initial HTML to be detected quickly.
  • JS rendering is asynchronous: a variable delay separates the discovery of the page from its complete indexing.
  • Crawl budget is consumed twice: once for HTML, once for deferred rendering.
  • Full-JS sites without SSR/pre-rendering consistently suffer from indexing speed disadvantages.

SEO Expert opinion

Is this statement consistent with field observations?

Yes, and it's actually one of the few cases where Google clearly exposes a verifiable technical architecture. Field tests have confirmed for years that static HTML links appear in Search Console long before those generated by JavaScript. Crawls with a Googlebot desktop user-agent show raw HTML without JS execution — exactly what Splitt describes.

On the other hand, Google remains vague about the actual rendering delays. "Later" can mean 2 hours as well as 2 weeks depending on the site. Low-authority sites or those poorly optimized for crawl budget often find that some JS pages are never rendered. [To be verified]: Google provides no official metrics on the percentage of JS pages effectively rendered by Googlebot within a reasonable timeframe.

What nuances should be added in light of this rule?

This statement applies to the initial crawl, but does not cover all scenarios. For example, Googlebot can re-crawl a page that has already been rendered without necessarily re-rendering it — it retrieves the raw HTML, extracts new links, but may ignore changes in JS content if the page has not changed on the server side.

Another point: lazy-loaded links via IntersectionObserver or scroll events. Even with JS rendering, Googlebot does not scroll down the page. A link that appears only after the user scrolls remains invisible to the bot, unless your implementation also loads it in the initial viewport or via a detectable mechanism. Google occasionally acknowledges this, but remains vague about how deeply it simulates interaction.
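
A sketch of the pattern in question, assuming a hypothetical sentinel element, list container, and URLs; the point is that the IntersectionObserver callback depends on scrolling that Googlebot does not perform:

```ts
// Hypothetical helper: appends <a href> links for the next page of results.
function injectNextPageLinks(): void {
  const a = document.createElement("a");
  a.href = "/articles?page=2";
  a.textContent = "Page 2";
  document.querySelector("#list")?.appendChild(a);
}

// Risky pattern: the links appear only once a sentinel element scrolls
// into view. Googlebot does not scroll, so the observer may never fire
// and the links go undiscovered.
const sentinel = document.querySelector("#load-more-sentinel");
if (sentinel) {
  const observer = new IntersectionObserver((entries) => {
    if (entries.some((e) => e.isIntersecting)) {
      injectNextPageLinks();
      observer.disconnect();
    }
  });
  observer.observe(sentinel);
}
```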

Warning: A site that deploys critical content solely via client-side JS without SSR or pre-rendering risks partial or delayed indexing. Testing tools (Search Console URL Inspection, Rendering Test) show the result after rendering — they do not reflect the actual delay or the priority that Googlebot assigns to this rendering in production.

In what cases does this rule not fully apply?

High-authority sites (Wikipedia, Amazon, major media) benefit from massive crawl budgets and almost instantaneous rendering. For them, the crawl/render separation is transparent. In contrast, a new or poorly linked site may see its JS pages queued for days or even ignored if Googlebot determines that rendering is not a priority.

Progressive Web Apps (PWA) with client-side routing also pose a problem: if your navigation relies on history.pushState without real server URLs, Googlebot will never discover these "pages" via the initial HTML. You must then provide an XML sitemap or SSR to expose all routes.
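
One mitigation is generating the sitemap from the app's route table; here is a minimal TypeScript sketch with hypothetical routes and domain:

```ts
// Expose client-side routes to Googlebot through an XML sitemap, since
// pushState "pages" never appear as links in the raw HTML.
const routes = ["/", "/products", "/products/42", "/about"];

function buildSitemap(baseUrl: string, paths: string[]): string {
  const urls = paths
    .map((p) => `  <url><loc>${baseUrl}${p}</loc></url>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    urls,
    "</urlset>",
  ].join("\n");
}

console.log(buildSitemap("https://example.com", routes));
```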

Practical impact and recommendations

What concrete steps should you take to optimize the discovery of your links?

Place your critical links in the initial HTML. If your main navigation, pagination, or strategic internal links are generated by React/Vue/Angular, implement SSR (Next.js, Nuxt.js, Angular Universal) or pre-rendering (Prerender.io, Rendertron). This ensures that Googlebot extracts these links right away during crawling, without waiting for rendering.

For e-commerce or high-volume editorial sites, this architectural choice makes the difference between smooth indexing and a constant bottleneck. Deep pages (categories, product sheets, old articles) must be accessible via static HTML links to avoid relying on the goodwill of deferred rendering.
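
For illustration, a minimal Next.js (pages router) sketch of the idea — the page, data source, and URLs are hypothetical, not a prescribed implementation:

```tsx
import type { GetServerSideProps } from "next";

type Props = { productUrls: string[] };

// Hypothetical data source standing in for your catalog backend.
async function fetchProductUrls(): Promise<string[]> {
  return ["/products/1", "/products/2", "/products/3"];
}

// Runs on the server: the links below are already present in the HTML
// that Googlebot fetches, so they enter the crawl queue without waiting
// for deferred rendering.
export const getServerSideProps: GetServerSideProps<Props> = async () => {
  return { props: { productUrls: await fetchProductUrls() } };
};

export default function CategoryPage({ productUrls }: Props) {
  return (
    <ul>
      {productUrls.map((url) => (
        <li key={url}>
          <a href={url}>{url}</a>
        </li>
      ))}
    </ul>
  );
}
```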

What mistakes should be avoided when implementing JavaScript?

Never rely on JS rendering to expose pagination or category links. A site that displays "Load more" via JS without an alternative <a href> link condemns its deep pages to invisibility. Googlebot does not click buttons.
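
A progressive-enhancement sketch of a crawlable alternative (the component and handler are illustrative): the control stays a real <a href> that JavaScript merely enhances:

```tsx
import type { MouseEvent } from "react";

// The control is a genuine <a href>, so Googlebot extracts the next-page
// URL from the raw HTML; browsers intercept the click and load in place.
export function LoadMore({ nextPageUrl }: { nextPageUrl: string }) {
  const handleClick = (e: MouseEvent<HTMLAnchorElement>) => {
    e.preventDefault();
    // Enhanced path: fetch(nextPageUrl) and append the results in place.
  };
  return (
    <a href={nextPageUrl} onClick={handleClick}>
      Load more
    </a>
  );
}
```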

Also avoid blocking CSS or JavaScript in robots.txt "to save crawl budget". Google needs these resources for rendering: blocking them delays or prevents the indexing of JS content. Allow Googlebot to access every file required for rendering, and optimize server response times and bundle sizes instead.
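
As a rough self-check, here is a sketch that flags Disallow rules likely to block rendering assets — the parsing is deliberately simplified (real robots.txt matching involves wildcards and Allow precedence) and the example rules are hypothetical:

```ts
// Flag Disallow rules that would block script or style assets Googlebot
// needs for rendering.
function findRiskyDisallows(robotsTxt: string): string[] {
  return robotsTxt
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.toLowerCase().startsWith("disallow:"))
    .map((line) => line.slice("disallow:".length).trim())
    .filter(
      (path) =>
        /\.(js|css)\b/i.test(path) || /\/(js|css|assets|static)\b/i.test(path),
    );
}

const example = [
  "User-agent: *",
  "Disallow: /static/js/",
  "Disallow: /private/",
].join("\n");

console.log(findRiskyDisallows(example)); // -> ["/static/js/"]
```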

How can you check if your site complies with these principles?

Use the URL Inspection tool in Search Console and compare the "crawled" version (raw HTML) with the "rendered" version. If critical links appear only in the rendered version, they are subject to the rendering delay. Crawl your site with Screaming Frog in "Text Only" mode (without JS): any link missing from that crawl is invisible on Googlebot's first pass.

Also audit your Core Web Vitals: an LCP that depends on an image loaded in JS penalizes both user experience and Googlebot rendering. Inject critical resources into the initial HTML via <link rel="preload"> or full SSR.
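
To approximate Googlebot's first pass yourself, here is a small Node sketch (Node 18+ for the global fetch; the URL is hypothetical and the regex is a rough stand-in for an HTML parser):

```ts
// Fetch the raw HTML (no JavaScript executed) and list the <a href>
// values Googlebot can extract before rendering.
async function listRawHtmlLinks(url: string): Promise<string[]> {
  const html = await (await fetch(url)).text();
  return [...html.matchAll(/<a\s[^>]*href="([^"]+)"/gi)].map((m) => m[1]);
}

listRawHtmlLinks("https://example.com/category/shoes").then((links) => {
  console.log(`${links.length} links visible before rendering:`);
  links.forEach((l) => console.log(" -", l));
});
```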

  • Implement SSR or pre-rendering to expose navigation and internal links in raw HTML
  • Check with Screaming Frog (JS-off mode) that all critical links are present
  • Compare "crawled" vs "rendered" in Search Console to detect discrepancies
  • Replace "Load more" buttons with traditional HTML pagination links
  • Never block CSS/JS in robots.txt if your content depends on it
  • Preload critical resources (LCP images, fonts, CSS) in the initial HTML
Optimizing JavaScript rendering for Googlebot requires solid technical expertise: choice of SSR framework, pre-rendering configuration, crawlability audit, crawled/rendered comparative tests. If your internal team lacks resources or experience on these topics, hiring a specialized SEO agency can expedite compliance and avoid costly indexing errors. Personalized support allows you to choose the solution best suited to your technical stack and business objectives.

❓ Frequently Asked Questions

Are JavaScript-generated links really crawled by Google?
Yes, but with a delay. Googlebot first extracts the raw HTML, then comes back later to render the JavaScript and extract the additional links. The delay varies with the site's crawl budget and priority.
Can a 100% JavaScript site be indexed correctly?
Technically yes, but in practice it is risky. Full-JS sites without SSR or pre-rendering suffer significant indexing delays and a risk of partial indexing, especially on deep pages or low-authority sites.
Should you block JavaScript files in robots.txt to save crawl budget?
No, that is counterproductive. Googlebot needs access to CSS and JavaScript to render your pages correctly. Blocking these resources prevents or delays the indexing of dynamic content.
How can I tell whether my JavaScript links are being discovered by Googlebot?
Use the URL Inspection tool in Search Console and compare the crawled version (raw HTML) with the rendered version. Also crawl your site with Screaming Frog with JavaScript disabled to see what Googlebot detects on its first pass.
Is SSR (Server-Side Rendering) mandatory for good JavaScript SEO?
Not strictly mandatory, but strongly recommended for SEO-critical sites (e-commerce, media, marketplaces). SSR or pre-rendering guarantees that links and critical content are present in the initial HTML, eliminating the delay and risk of deferred rendering.