Official statement
Other statements from this video
- 2:09 Does Googlebot really use stable Chrome for JavaScript rendering?
- 4:12 Does Googlebot really follow the latest version of Chrome for rendering?
- 4:45 Do you still need to adapt your JavaScript to get crawled by Google?
- 19:15 Should you really drop dynamic rendering in favor of SSR?
- 24:30 Does scroll-triggered lazy loading really block Googlebot from indexing your content?
- 28:24 Does Googlebot really ignore all cookies between its requests?
- 31:12 Googlebot declines API permissions: what does that mean for the crawling of your site?
Google confirms that crawl budget isn't limited to HTML URLs: it includes every resource needed to render the page, in particular the JavaScript files and XHR requests triggered during execution. In concrete terms, a site that loads 50 JS files and 20 API requests per page consumes more than 70 fetches for every page visited, a huge drain for large catalogs. The question remains: does Google allocate a distinct budget for these resources, or is everything mixed in the same pot?
What you need to understand
Does crawl budget really cover every network request of a page?
Yes. Google doesn't just fetch the initial HTML. When Googlebot renders a page, it loads and executes JavaScript scripts, then triggers necessary XHR/Fetch requests to build the final DOM. Each of these operations consumes a fraction of the site's allotted budget.
This is a paradigm shift for many SEOs used to thinking in terms of "number of URLs crawled." With modern architectures (React, Vue, Angular), a single URL can trigger 40-80 network requests—JS files, CSS, polyfills, third-party APIs, fonts, analytics. Every request matters. If your site has 100k pages and each page triggers 60 requests, you're asking Googlebot to handle 6 million resources.
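To get a sense of the scale, a rough back-of-the-envelope calculation helps. The sketch below is purely illustrative: the page count, requests per page, and daily fetch capacity are assumptions, not figures Google has published.

```typescript
// Illustrative only: estimate how long a full crawl could take when every
// sub-resource counts toward the budget. All inputs are assumptions.
function estimateFullCrawlDays(
  pages: number,
  requestsPerPage: number, // HTML + JS + CSS + XHR + fonts, etc.
  dailyFetchBudget: number // hypothetical number of fetches Googlebot grants per day
): number {
  return (pages * requestsPerPage) / dailyFetchBudget;
}

// 100k pages x 60 requests each, with a hypothetical 100k fetches/day:
console.log(estimateFullCrawlDays(100_000, 60, 100_000)); // 60 days
// The same catalog served with ~10 requests per page (SSR, bundled assets):
console.log(estimateFullCrawlDays(100_000, 10, 100_000)); // 10 days
```

The numbers are invented, but the ratio is the point: cutting requests per page shortens a full crawl just as much as cutting the number of pages.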
What exactly does Google mean by 'page rendering'?
Rendering is the phase where Googlebot executes JavaScript to obtain the final, indexable DOM. Unlike a simple HTML crawl (fetching the raw source), rendering relies on a headless Chromium browser that interprets scripts, applies DOM transformations, and triggers network calls.
This step is resource-intensive for CPU and network. Google must maintain server farms with Chromium to render millions of pages. Hence the idea of a budget: Google can't render everything in real-time, especially for sites that generate tons of asynchronous requests. If your site loads 15 external JS files on each visit, Googlebot has to download, parse, and execute them—all of which counts towards the time it's willing to spend on your domain.
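You can approximate what a rendering pass costs by reproducing one yourself: load a URL in headless Chromium and count every sub-request it triggers. A minimal sketch with Puppeteer follows; this is not Googlebot's actual pipeline, just a way to measure the request volume a renderer would have to handle on your pages (the URL is a placeholder).

```typescript
// Minimal sketch: render one URL in headless Chromium and count the network
// requests triggered during rendering (scripts, XHR, stylesheets, fonts...).
import puppeteer from 'puppeteer';

async function countRenderRequests(url: string): Promise<Record<string, number>> {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  const byType: Record<string, number> = {};
  page.on('request', (req) => {
    const type = req.resourceType(); // 'script', 'xhr', 'fetch', 'stylesheet', ...
    byType[type] = (byType[type] ?? 0) + 1;
  });

  // Wait until the network goes quiet, i.e. the final DOM has been built.
  await page.goto(url, { waitUntil: 'networkidle0', timeout: 60_000 });
  await browser.close();
  return byType;
}

countRenderRequests('https://www.example.com/product/123')
  .then((counts) => console.log(counts));
// e.g. { document: 1, script: 38, xhr: 12, stylesheet: 4, font: 3, image: 22 }
```

Running this on a handful of page templates quickly shows which ones drag the most requests behind them.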
Why does this statement change the game for JavaScript-heavy sites?
Because until now, many devs and SEOs thought that only 'real' URLs (those returning distinct HTML) counted towards the budget. However, a modern SPA can have 10 URLs but 500 critical resources for rendering. If Google crawls these 10 pages and needs to load 500 resources in total, you're consuming far more than just 10 slots.
Let's be honest: no one knows if Google allocates a separate budget for resources (JS, XHR) or if everything is counted in a common pool. But the practical impact is clear—sites that load hundreds of requests per page risk exhausting their budget before having all their strategic URLs crawled. And that's where the problem lies.
- Crawl budget is no longer measured in pages but in total network requests (HTML + JS + XHR + critical assets).
- SPA or CSR (Client-Side Rendering) sites consume significantly more budget than SSR (Server-Side Rendering) sites with the same number of URLs.
- Google must allocate CPU time to execute JavaScript—which explains why rendering can be delayed for several hours or days after the initial crawl.
- XHR requests triggered at runtime (for example, a product API loading variants) also count, even if they don't generate a distinct URL.
- Optimizing crawl budget now means reducing the total number of network requests per page, not just the number of pages.
SEO Expert opinion
Is this statement consistent with on-the-ground observations?
Yes, generally. SEOs managing large e-commerce sites or content portals have observed for years that JS-heavy sites get crawled more slowly for the same budget. Googlebot logs show it spends more time per page on a poorly optimized React site than on a classic SSR site. Google's statement confirms what we suspected: every network request counts.
But—and this is where it gets tricky—Google gives no figures. We don't know if an XHR request 'costs' as much as an HTML page, nor if external JS files hosted on CDNs (e.g., analytics, fonts) are counted in the same way. [To be verified]: Does Google allocate a separate pool for third-party resources? Total mystery. Practically, many sites find that blocking analytics scripts or third-party pixels via robots.txt improves crawl speed—which suggests that yes, these requests count too.
What gray areas does Google not clarify?
First point: what is truly 'critical' for rendering? Google says it crawls the resources needed for rendering, but how does it determine what's necessary? If your site loads 50 scripts, but only 10 are blocking for the first paint, do the other 40 still count? The answer is unclear. In theory, Googlebot should ignore non-critical scripts—in practice, it often loads them anyway.
Second point: rendering timing. Google has confirmed that crawling and rendering are two distinct phases. HTML is crawled immediately, rendering may occur several days later. So is your crawl budget consumed at the moment of HTML crawl, or at the time of rendering? If it's at rendering time, that means a site can be crawled quickly but rendered slowly—which could delay effective indexing. [To be verified]: no official data on this.
Third point: network errors and timeouts. If Googlebot tries to load a JS file that times out or returns a 500, does this request still count against the budget? Logically yes, since Googlebot spent time waiting. But again, Google doesn't explicitly say. What's certain is that sites with fragile external dependencies (unstable third-party APIs) risk wasting their budget on failing requests.
In what cases does this rule not really apply?
There are situations where crawl budget is simply not an issue. If you're managing a site of 50 pages, even with 100 requests per page, Google will crawl everything in a few minutes. Budget becomes critical only for large sites (>10k URLs) or sites with high velocity (new content published daily).
Another case: sites with solid SSR. If your HTML already contains all indexable content and JavaScript is just there to enhance UX (progressive enhancement), then JS rendering becomes optional for Google. It can index your content right from the HTML crawl, without going through the costly rendering phase. That’s why SSR remains the safest approach for SEO-critical sites—fewer requests, less risk, faster crawl.
Practical impact and recommendations
What concrete steps should be taken to optimize crawl budget for JS?
First step: audit the total number of requests per page. Use Chrome DevTools (Network tab), run a Lighthouse audit, or analyze your server logs. Count how many JS, CSS, XHR, font, and image requests are loaded per page. If you exceed 40-50 requests per page, you have both a performance and a crawl budget issue. Start by identifying non-critical scripts (analytics, chatbots, marketing pixels) and defer them or load them behind a consent manager.
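If you prefer working from server logs, a small script can tally what Googlebot actually fetches. This is a sketch under stated assumptions: the log follows the common "combined" format, the path is an example, and matching on the user-agent string alone is a rough filter (it can be spoofed; verify with reverse DNS if it matters).

```typescript
// Sketch: tally Googlebot hits from an Nginx/Apache "combined" access log,
// grouped by file extension and HTTP status. Adapt the regex to your layout.
import { createReadStream } from 'node:fs';
import { createInterface } from 'node:readline';

// Request line + status code, with "Googlebot" somewhere in the user agent.
const GOOGLEBOT_LINE = /"(?:GET|POST) (\S+) HTTP\/[\d.]+" (\d{3}) .*Googlebot/;

async function summarizeGooglebotHits(logPath: string): Promise<void> {
  const byExtension: Record<string, number> = {};
  const byStatus: Record<string, number> = {};

  const rl = createInterface({ input: createReadStream(logPath) });
  for await (const line of rl) {
    const match = GOOGLEBOT_LINE.exec(line);
    if (!match) continue;

    const [, path, status] = match;
    const file = path.split('?')[0];
    // Crude heuristic: treat extension-less paths as HTML pages.
    const ext = file.includes('.') ? file.split('.').pop()! : 'html';
    byExtension[ext] = (byExtension[ext] ?? 0) + 1;
    byStatus[status] = (byStatus[status] ?? 0) + 1;
  }

  console.log('Googlebot fetches by extension:', byExtension);
  console.log('Googlebot fetches by status:', byStatus); // watch for 404/500 spikes
}

summarizeGooglebotHits('/var/log/nginx/access.log').catch(console.error);
```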
Second step: bundle and minify assets. Instead of serving 15 distinct JS files, bundle them into 2-3 critical files. Use code-splitting to load only the JS necessary for each page. Modern build tools and frameworks (Webpack, Vite, Next.js) make this easier. Fewer requests = less load on Googlebot = more pages crawled per visit. Simple but effective.
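As an illustration of code-splitting, a dynamic import() keeps a non-critical widget out of the main bundle; bundlers like Webpack and Vite turn each dynamic import into a separate chunk that is only fetched when that code path runs. The module and element ids below are hypothetical.

```typescript
// Hypothetical example: keep a live-chat widget out of the critical bundle.
async function openLiveChat(): Promise<void> {
  // Loaded on demand (when the user clicks "Chat with us"), so it never
  // weighs on the initial render that Googlebot has to process.
  const { mountChat } = await import('./live-chat-widget');
  mountChat(document.getElementById('chat-root')!);
}

document.getElementById('chat-button')?.addEventListener('click', () => {
  void openLiveChat();
});
```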
Third step: switch to SSR or pre-rendering when possible. If your site generates dynamic content on the client side (via XHR), consider whether this content can be generated server-side. An SSR-rendered product catalog will be crawled 10x faster than a catalog loaded via client-side API. Yes, it's more complex to implement—but for a 50k URL e-commerce site, it's the difference between a complete crawl in 1 week or 1 month.
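Here is a minimal Next.js-style sketch of the idea (the fetchProduct helper is a placeholder for your own data layer): the product data is fetched on the server, so the HTML Googlebot receives already contains the indexable content and no client-side XHR is needed for it.

```tsx
// pages/product/[id].tsx - minimal Next.js-style sketch (hypothetical data layer).
// The product is rendered on the server, so the HTML already contains the title,
// description and price: Googlebot needs no client-side XHR to index them.
import type { GetServerSideProps } from 'next';

type Product = { id: string; title: string; description: string; price: number };

// Placeholder for your own data access (database, internal API...).
async function fetchProduct(id: string): Promise<Product> {
  return { id, title: 'Sample product', description: 'Sample description', price: 49 };
}

export const getServerSideProps: GetServerSideProps<{ product: Product }> = async ({ params }) => {
  const product = await fetchProduct(String(params?.id));
  return { props: { product } };
};

export default function ProductPage({ product }: { product: Product }) {
  return (
    <main>
      <h1>{product.title}</h1>
      <p>{product.description}</p>
      <p>{product.price} €</p>
    </main>
  );
}
```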
What mistakes should be absolutely avoided?
Mistake #1: blocking JS files via robots.txt. Many SEOs still think that blocking JavaScript saves budget. This is false—and even counterproductive. If you block critical JS, Googlebot can't render the page properly, so it indexes a broken or empty version. Google has been saying for years: never block resources necessary for rendering. However, blocking non-critical third-party scripts (analytics, ads) can help—but do it judiciously.
Mistake #2: multiplying unnecessary API calls. If each page calls an API to display 'similar products' or 'recommended articles', and this data isn't critical for SEO, load it lazily on the client side after the initial render. Googlebot doesn't need to crawl 20 XHR requests to index a product sheet: it needs the title, description, price, and availability. The rest is UX bonus, not indexable content.
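One way to do that, sketched below with a hypothetical endpoint and element ids, is to fire the 'similar products' request only when the block actually scrolls into view.

```typescript
// Hypothetical sketch: fetch "similar products" only when the block becomes
// visible, instead of firing the XHR on every page load.
function lazyLoadSimilarProducts(): void {
  const container = document.getElementById('similar-products');
  if (!container) return;

  const observer = new IntersectionObserver(async (entries, obs) => {
    if (!entries.some((entry) => entry.isIntersecting)) return;
    obs.disconnect(); // fetch only once

    const response = await fetch('/api/similar-products?sku=ABC-123');
    const products: Array<{ name: string; url: string }> = await response.json();
    container.innerHTML = products
      .map((p) => `<a href="${p.url}">${p.name}</a>`)
      .join('');
  });

  observer.observe(container);
}

lazyLoadSimilarProducts();
```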
Mistake #3: ignoring error requests. If your logs show that Googlebot tries to load JS files that return 404 or 500, fix them immediately. Each failed attempt consumes budget for nothing. Clean up your references to old scripts, ensure all critical assets are accessible, and monitor network errors in Search Console.
How can I check that my site isn't wasting its crawl budget?
Use the 'Crawl Stats' report in Search Console. Look at the number of requests per day, the average download time, and the size of responses. If the average download time increases, it means Googlebot is spending more time per page, often because of heavy JS resources. Compare this figure with the number of URLs crawled: if you have 100k URLs but Google only crawls 500 pages/day, you have a budget issue.
Also analyze the raw server logs. Look for patterns: how many requests does Googlebot make per visit? On which URLs? How much time between visits? If you see Googlebot crawling your product pages once a month while you're adding 100 products a week, it's a signal that your budget is saturated. In this case, prioritize strategic sections via the sitemap, block unnecessary URLs (filters, infinite paginated pages), and optimize response time.
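For the prioritization part, a dedicated sitemap restricted to strategic URLs is a simple lever. A minimal sketch, assuming you can list those URLs yourself (the URLs and output path are examples):

```typescript
// Sketch: generate a sitemap restricted to strategic URLs so Googlebot spends
// its budget there first. URLs and output path are examples.
import { writeFileSync } from 'node:fs';

function buildSitemap(urls: Array<{ loc: string; lastmod: string }>): string {
  const entries = urls
    .map((u) => `  <url><loc>${u.loc}</loc><lastmod>${u.lastmod}</lastmod></url>`)
    .join('\n');
  return `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n${entries}\n</urlset>\n`;
}

// Only the pages you want crawled first: new products, key categories, etc.
const strategicUrls = [
  { loc: 'https://www.example.com/category/new-arrivals', lastmod: '2024-01-15' },
  { loc: 'https://www.example.com/product/123', lastmod: '2024-01-14' },
];

writeFileSync('public/sitemap-priority.xml', buildSitemap(strategicUrls));
```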
- Reduce the number of network requests per page to less than 30-40 (HTML + JS + XHR + critical assets).
- Bundle and minify JavaScript files to limit distinct HTTP requests.
- Implement SSR or pre-rendering for strategic pages (catalogs, product sheets).
- Load non-critical third-party scripts (analytics, chat, pixels) deferred or via consent manager.
- Monitor Googlebot logs for error requests (404, 500, timeout) and correct them.
- Use Search Console to track changes in average download time and adjust accordingly.
❓ Frequently Asked Questions
Do JavaScript files hosted on third-party CDNs also count toward the crawl budget?
Is the crawl budget the same for the HTML crawl and JavaScript rendering?
If I block non-critical XHR requests via robots.txt, do I save budget?
Does an SSR site really consume less budget than a CSR site with the same content?
How can I tell whether my site is saturating its crawl budget because of JavaScript?
🎥 From the same video
Other SEO insights extracted from this same Google Search Central video · duration 38 min · published on 10/05/2019