Does Googlebot really stop at 15 MB per URL?

Official statement

By default, Googlebot retrieves 15 megabytes of raw content per URL, then stops. This limit applies per URL: if your HTML references other resources, each of them has its own 15 MB limit.

🎥 Source video

Extracted from a Google Search Central video

Official statement from March 30, 2026

TL;DR

Googlebot crawls up to 15 megabytes of raw content per URL by default before stopping. Each referenced resource (CSS, JS, images) has its own 15 MB limit. This technical constraint directly impacts the indexation of large or content-rich pages.

What you need to understand

What exactly is this 15 MB limit?

Google caps the amount of data Googlebot downloads from each URL at 15 megabytes. In concrete terms, Googlebot downloads the raw content of a page until it reaches this threshold, then stops.

The limit is counted per URL, not per page. External resources referenced in the HTML (CSS, JavaScript, images, videos) are fetched separately, and each gets its own 15 MB quota. In other words, a 10 MB HTML page that loads a 12 MB script and an 8 MB stylesheet passes without issue, because no single file exceeds 15 MB.
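The per-URL accounting can be illustrated with a minimal sketch. The function name, the file paths, and the reading of "15 MB" as 15 × 1024 × 1024 bytes are assumptions made for illustration; this is not Google's implementation.

```python
# Assumption: "15 MB" is read as 15 * 1024 * 1024 bytes; the statement
# does not spell out the exact byte count.
MB = 1024 * 1024
LIMIT_BYTES = 15 * MB


def exceeds_limit(sizes_by_url: dict[str, int], limit: int = LIMIT_BYTES) -> dict[str, bool]:
    """Check each URL against the limit on its own; sizes are never summed."""
    return {url: size > limit for url, size in sizes_by_url.items()}


# Hypothetical page matching the example: 10 MB HTML, 12 MB script, 8 MB stylesheet.
page = {
    "/catalog.html": 10 * MB,
    "/app.js": 12 * MB,
    "/styles.css": 8 * MB,
}

result = exceeds_limit(page)
total_mb = sum(page.values()) // MB
print(result)
print(f"combined size: {total_mb} MB")
```

The combined weight is 30 MB, yet no entry is flagged, because each file is compared to the limit independently.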

Why does Google impose this constraint?

The reason is simple: protecting crawl infrastructure. Google processes billions of URLs daily. Without safeguards, a poorly configured site could send documents of several hundred megabytes, saturating the crawler's resources.

This limit also prevents abuse, intentional or not. Endlessly generated dynamic pages, oversized JSON feeds, log files exposed by mistake: all of these would consume crawl resources without adding value.

What happens if my page exceeds 15 MB?

Googlebot cuts off the retrieval at 15 MB. Content beyond this threshold is never seen or indexed. If your crucial text appears after the first 15 MB of HTML, it will remain invisible to Google.
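To make the cutoff concrete, here is a small simulation. It assumes the limit is a plain byte-offset truncation at 15 × 1024 × 1024 bytes; the helper name and the sample documents are hypothetical.

```python
LIMIT_BYTES = 15 * 1024 * 1024  # assumed byte value of "15 MB"


def visible_after_cutoff(document: bytes, marker: bytes, limit: int = LIMIT_BYTES) -> bool:
    """Return True if `marker` lies entirely within the first `limit` bytes."""
    return marker in document[:limit]


# Hypothetical oversized document: a little over 16 MB of filler markup.
unit = b"<p>filler</p>"
filler = unit * ((16 * 1024 * 1024) // len(unit) + 1)

late_doc = filler + b"<h2>Key specifications</h2>"   # important text at the end
early_doc = b"<h2>Key specifications</h2>" + filler  # important text at the start

print(visible_after_cutoff(late_doc, b"Key specifications"))
print(visible_after_cutoff(early_doc, b"Key specifications"))
```

The same heading is found when it leads the document and missed when it follows 16 MB of filler, even though both documents have identical total size.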

No alert is sent to Search Console. No notification, no warning. The crawl stops silently, and you discover the problem when your pages don't rank or display truncated snippets.

Limit of 15 MB per URL, applied to raw content (HTML, JSON, XML...)
Each referenced resource (CSS, JS, images) has its own 15 MB limit
Content beyond 15 MB is never crawled or indexed
No warning in Search Console when the limit is exceeded
This rule aims to protect Google's infrastructure and prevent abuse

SEO Expert opinion

Does this 15 MB limit pose a problem in practice?

Let's be honest: the majority of websites don't even come close to this threshold. A typical HTML page weighs between 50 KB and 500 KB. Even a long-form article with embedded rich media rarely exceeds 2-3 MB of pure HTML.

Where does it get tricky? E-commerce sites with oversized product listings loaded on a single page, SaaS platforms that inject massive JSON datasets into the DOM, or news sites that stack dozens of articles on the same URL with infinite scroll. In these cases, 15 MB can disappear quickly.

Is Google transparent about what counts toward these 15 MB?

Martin Splitt mentions "raw content," which includes uncompressed HTML. But what about inline JavaScript? JSON data embedded in <script> tags? SVGs integrated directly into the markup?

[To verify] The statement itself doesn't say whether HTTP compression (gzip, Brotli) is accounted for before or after this limit. Google's Googlebot documentation, however, has stated that the file size limit is applied to the uncompressed data. On that reading, a 3 MB compressed HTML response that expands to 12 MB raw is already approaching the limit.
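The gap between transferred size and raw size is easy to underestimate on repetitive markup. The sketch below uses Python's standard gzip module on a made-up product listing; the markup and the row count are illustrative, and real pages will compress differently.

```python
import gzip

# Hypothetical repetitive markup, similar to a long product listing.
row = b'<li class="product"><a href="/p/12345">Example product</a><span>19.99</span></li>\n'
raw = b"<ul>\n" + row * 50_000 + b"</ul>\n"

compressed = gzip.compress(raw)

raw_size = len(raw)          # size of the uncompressed document
wire_size = len(compressed)  # size transferred over the network with gzip

print(f"raw: {raw_size} bytes, gzip: {wire_size} bytes")
print(f"ratio: {raw_size / wire_size:.1f}x")
```

A page that looks small in a network panel can therefore be many times larger once decompressed, which is the size to watch if the limit is measured on raw content.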

Should you actively monitor this metric?

Yes, especially if you operate in verticals with high content volume. E-commerce, price comparison sites, directories, job boards: these are all sectors where pages easily balloon.

The problem? No Google tool exists to monitor this threshold. You must manually measure the HTML size returned for your main templates. A simple curl -I with Content-Length isn't always enough: some servers don't return this header, and when the response is compressed, Content-Length reports the compressed size rather than the raw size.

Caution: if you use client-side JavaScript rendering to load additional content after the initial HTML, this content is NOT counted toward the initial 15 MB. But it will be crawled during rendering, with its own limits (timeout, JS resources...).

Practical impact and recommendations

How do you verify if your pages exceed the limit?

Start by identifying your at-risk templates: category pages, listings, archives, internal search results pages. These are the ones that accumulate the most content.

Then, measure the actual size of the returned HTML. Use curl -s https://yoursite.com/page | wc -c to get the size in bytes of the raw content. If you exceed 10-12 MB, you're in dangerous territory.
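For batch measurement, the same check can be scripted. This is a minimal sketch using only Python's standard library; the function names are my own, the URL is the placeholder from the command above, and it assumes the server honors a request for an uncompressed body.

```python
import io
import urllib.request
from typing import BinaryIO


def count_bytes(stream: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Read a binary stream to the end and return the number of bytes seen."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    return total


def raw_html_size(url: str, timeout: float = 30.0) -> int:
    """Download `url` and return the body size in bytes.

    Asks for an uncompressed body explicitly. A server that ignores the
    header could still send compressed bytes, which this sketch does not decode.
    """
    request = urllib.request.Request(url, headers={"Accept-Encoding": "identity"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return count_bytes(response)


# Offline check of the counting logic with an in-memory stream.
sample = io.BytesIO(b"x" * 200_000)
print(count_bytes(sample))

# Usage against a real page (placeholder URL):
# size = raw_html_size("https://yoursite.com/page")
# print(f"{size / (1024 * 1024):.2f} MB")
```

Reading in chunks keeps memory use flat even when the document is very large, which is exactly the case being hunted here.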

What optimizations should you implement if you exceed the limit?

First approach: pagination or lazy loading. Instead of loading 500 products on a single page, split into pages of 50 products. Or implement infinite scroll that loads content via AJAX after initial render.
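As a rough illustration of the split described above, the sketch below divides a 500-item listing into pages of 50. The function name and the product identifiers are hypothetical; a real implementation would also need distinct, crawlable URLs for each page.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def paginate(items: Sequence[T], page_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `page_size` long."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(0, len(items), page_size):
        yield items[start:start + page_size]


# Hypothetical catalog of 500 products, split into pages of 50.
products = [f"product-{n}" for n in range(1, 501)]
pages = list(paginate(products, 50))

print(len(pages), len(pages[0]), pages[-1][-1])
```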

Second lever: clean up superfluous HTML. Debug comments, unnecessary whitespace, redundant JSON-LD, oversized data-* attributes... all of this adds weight. Minify and compress aggressively.
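A deliberately naive sketch of this cleanup is shown below. It is regex-based, so it is only an illustration of where the savings come from: it would mishandle comments inside inline scripts, conditional comments, and whitespace-sensitive elements such as pre. A production setup would normally rely on a dedicated minifier. The function name is an assumption.

```python
import re

COMMENT_RE = re.compile(rb"<!--.*?-->", re.DOTALL)
BETWEEN_TAGS_RE = re.compile(rb">\s+<")


def slim_html(html: bytes) -> bytes:
    """Drop HTML comments and collapse whitespace runs between tags to one space."""
    without_comments = COMMENT_RE.sub(b"", html)
    return BETWEEN_TAGS_RE.sub(b"> <", without_comments)


sample = (
    b"<ul>\n"
    b"    <!-- debug: rendered by listing.tpl -->\n"
    b"    <li>One</li>\n"
    b"    <li>Two</li>\n"
    b"</ul>"
)

slimmed = slim_html(sample)
print(slimmed)
print(f"{len(sample)} -> {len(slimmed)} bytes")
```

Whitespace is collapsed to a single space rather than removed, so spacing between inline elements is preserved.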

Finally, if you embed JSON datasets for your React/Vue apps, externalize them. Rather than inlining them in the HTML, load them via a dedicated endpoint. This lightens the main document and respects the per-resource limit.
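Before externalizing anything, it helps to know how much of the document inline scripts actually occupy. The sketch below uses Python's built-in html.parser to estimate that share; the class name, the function name, and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class InlineScriptMeter(HTMLParser):
    """Sum the size of inline <script> bodies (scripts without a src attribute)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_inline_script = False
        self.inline_bytes = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script" and "src" not in dict(attrs):
            self._in_inline_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline_script = False

    def handle_data(self, data):
        if self._in_inline_script:
            self.inline_bytes += len(data.encode("utf-8"))


def inline_script_share(html: str) -> float:
    """Return the fraction of the document's bytes held in inline scripts."""
    meter = InlineScriptMeter()
    meter.feed(html)
    meter.close()
    total = len(html.encode("utf-8"))
    return meter.inline_bytes / total if total else 0.0


# Hypothetical page that inlines its application state as JSON.
payload = '{"items": [' + ", ".join(str(i) for i in range(1000)) + "]}"
html = (
    "<html><body><h1>Catalog</h1>"
    f'<script id="__STATE__" type="application/json">{payload}</script>'
    '<script src="/app.js"></script>'
    "</body></html>"
)

share = inline_script_share(html)
print(f"inline scripts: {share:.0%} of the document")
```

A high share on a heavy template suggests that moving the dataset to its own endpoint would bring the main document well under the limit.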

What if you can't reduce the size?

If your content is legitimate and can't be split — for example, exhaustive technical documentation on a single page — you'll have to accept that Google only crawls part of it. In this case, ensure your crucial content appears within the first 10 MB.

Alternatively, restructure your architecture so each major section is a distinct URL. This also improves your internal linking and the granularity of your indexation.

Audit templates with high content volume (categories, listings, archives)
Measure actual HTML size with curl or monitoring tools
Implement pagination or lazy loading for lengthy content
Clean up HTML: minification, comment removal, data attribute optimization
Externalize large JSON datasets instead of inlining them in HTML
Prioritize important content in the first megabytes of the document
Regularly monitor page size after each deployment

The 15 MB limit per URL remains theoretical for most sites, but becomes critical for content-rich platforms. Well-designed architecture and regular monitoring are usually sufficient to avoid the problem. If your site handles substantial data volumes or if you notice inconsistent indexation despite optimizations, consulting with a specialized SEO agency can help you thoroughly diagnose these technical issues and structure a sustainable solution tailored to your context.

❓ Frequently Asked Questions

Do the 15 MB cover compressed or decompressed content?

Google speaks of "raw content," which suggests the decompressed HTML. The statement gives no confirmation of where in the crawl pipeline the limit applies, although Google's Googlebot documentation has stated that the limit is applied to the uncompressed data.

If my HTML is 14 MB and loads a 20 MB CSS file, what happens?

The HTML passes without issue because it is under the limit. The CSS will be crawled up to 15 MB and then truncated. Each resource has its own independent quota.

Can I see in Search Console whether my pages exceed 15 MB?

No. Google shows no alert or metric for this limit. You have to measure the size of your HTML documents manually.

Does content loaded by JavaScript after the initial render count toward these 15 MB?

No. The 15 MB limit applies to the initial HTML document returned by the server. Content hydrated client-side has its own constraints (rendering timeout, JS budget).

Has this limit always existed, or is it new?

Google has always had crawl limits, but explicit communication about this 15 MB threshold is relatively recent. The limit itself was probably in place for a long time without being publicly documented.
