Official statement

The limit of 50,000 URLs in a sitemap applies only to the main URL tags (loc tag), not to additional attributes like hreflang, images, or videos. There is also a file size limit. You can create multiple sitemaps and group them via a sitemap index.
26:59
🎥 Source video

Extracted from a Google Search Central video

⏱ 57:01 💬 EN 📅 13/05/2020 ✂ 22 statements
Watch on YouTube (26:59) →
Other statements from this video 21
  1. 1:43 Does Google really rewrite your meta descriptions if they contain too many keywords?
  2. 4:20 Why does modifying the Analytics code block Search Console verification?
  3. 5:58 Why is your hreflang markup still not working despite your efforts?
  4. 5:58 Should you prefer language-only or language+country hreflang for your international versions?
  5. 9:09 Hreflang does not influence indexing: why does Google index a single version but display several URLs?
  6. 12:32 Why does your site disappear completely from Google's index, and how do you recover it?
  7. 15:51 Does the URL parameters tool really consolidate all signals as Google claims?
  8. 19:03 Do core updates really not penalize any technical errors?
  9. 23:00 Does the outdated content tool really remove indexing or just the snippet?
  10. 23:56 Why is the site: command useless for diagnosing indexing?
  11. 23:56 Does the URL removal tool really deindex your pages?
  12. 30:10 Does BERT really penalize sites that lose traffic after its rollout?
  13. 32:07 Does Google Images really choose the right image for your pages?
  14. 33:50 Should you really detail your anchor texts with prices, reviews, and ratings?
  15. 35:26 Why does your site remain partially invisible if your internal linking is not bidirectional?
  16. 38:03 Why does Google refuse to index all your pages, and how do you fix it?
  17. 40:12 Is repetitive internal anchor text really a problem for Google?
  18. 42:48 Do UTM parameters really create duplicate content indexed by Google?
  19. 45:27 Does HTTPS/HTTP mixed content really impact Google rankings?
  20. 47:16 Does hreflang in HTML really make your pages heavier, or is it a myth?
  21. 53:53 Why do old URLs stay in the index after a 301 redirect?
📅
Official statement from (5 years ago)
TL;DR

Google clarifies that the limit of 50,000 URLs in a sitemap applies only to the main <loc> tags, not to hreflang attributes, images, or videos. In practice, a sitemap can hold many more than 50,000 references in total if you include language variants and media. This clarification changes the game for multilingual sites or those rich in visual content, allowing them to optimize their crawl without unnecessarily multiplying sitemap files.

What you need to understand

What exactly do we mean by the limit of 50,000 URLs?

When talking about XML sitemaps, most SEOs have this famous limit in mind: a maximum of 50,000 URLs per file. Mueller points out here that this constraint applies only to the <loc> tags, in other words the main URLs of your pages.

Additional attributes like hreflang, images, or videos do not count towards this ceiling. Thus, a sitemap can have a main URL with 10 hreflang variants, 5 images, and 2 videos: you are using 1 out of 50,000 spots, not 18.

Why does this confusion persist among practitioners?

Many CMSs and sitemap generators display alerts as soon as the total number of items approaches the limits. The result: we think we've hit the ceiling when we may only have 15,000 actual URLs with extended attributes.

The second limit — the file size limit (50 MB uncompressed) — often comes into play before the URL limit when attributes are multiplied. This is where it gets tricky: a sitemap with 40,000 URLs but dozens of hreflang per page can exceed the allowed size.

How can we manage multiple sitemaps without losing control?

Google recommends using a sitemap index to group multiple files. This modular approach allows for segmentation by content type (pages, images, videos), by language, or by site section.

The advantage? You isolate high-volume attribute content into dedicated sitemaps while maintaining a structure that is readable for your team and for Googlebot. There’s no penalty for using 10 sitemaps instead of one — it’s even recommended beyond a certain scale.
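
As an illustration of the grouping described above, here is a minimal sketch that builds a sitemap index with Python's standard library. The function name `build_sitemap_index` and the example.com file names are assumptions made for this example; the XML namespace and the `sitemapindex` / `sitemap` / `loc` element names come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_SITEMAPS_PER_INDEX = 50_000  # limit on child sitemaps per index file


def build_sitemap_index(sitemap_urls: list[str]) -> bytes:
    """Return a sitemap index (UTF-8 XML) listing the given sitemap files."""
    if len(sitemap_urls) > MAX_SITEMAPS_PER_INDEX:
        raise ValueError("too many sitemaps for a single index file")
    # Declaring xmlns on the root puts every child in the sitemap namespace.
    root = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical file names, segmented by content type.
    index = build_sitemap_index([
        "https://example.com/sitemap-pages.xml",
        "https://example.com/sitemap-images.xml",
        "https://example.com/sitemap-videos.xml",
    ])
    print(index.decode("utf-8"))
```

Each child file still has to respect the URL and size limits on its own; the index only tells Google where the files are.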

  • The 50,000 limit applies only to <loc> tags, not to hreflang attributes, images, or videos
  • The file size (50 MB uncompressed) may be reached before the URL limit on complex sites
  • A sitemap index allows for grouping multiple files without negative impact on crawling
  • Segmenting by content type or language improves maintainability and error tracking
  • CMSs sometimes generate misleading alerts based on total item count rather than on main URLs

SEO Expert opinion

Does this clarification really change field practices?

Honestly, yes — especially for multilingual sites or platforms with lots of media. Before this clarification, we saw teams multiplying sitemaps out of fear of exceeding the limit when they still had room on the tags.

The problem is that many tools do not clearly distinguish main URLs from attributes in their counters. The result: you optimize for a problem that doesn’t exist, fragmenting your sitemaps unnecessarily and complicating maintenance. [To check]: do some tools respect this distinction in their alerts?

What inconsistencies do we observe in Google's recommendations?

Mueller does not specify whether AMP variants or alternate canonical URLs count towards the limit. In practice, we find that Google treats rel="alternate" annotations differently depending on their context: mobile, language, or format.

Another gray area: the size limit is never precisely documented. We talk about 50 MB uncompressed, but in what encoding? Strict UTF-8 or with tolerance for extended characters? These technical details matter when managing sitemaps of several tens of thousands of URLs with multilingual titles.

When does this rule not suffice?

If your sitemap contains 5,000 URLs, but each has 15 hreflang variants and 8 images, you are likely nearing the file size limit before the URL limit. This is where segmentation becomes mandatory, not optional.

Another problematic case: e-commerce websites with thousands of product variants. Even if you adhere to the 50,000 main URLs, the file's complexity can slow down processing on Google’s side. We have observed perfectly compliant sitemaps being partially crawled due to a structure that is too dense — Google does not officially say this, but the behavior is evident.

Warning: The 50,000 URL limit is a technical constraint, not a recommendation for optimal volume. On high-volume sites, segmenting at around 20,000 to 30,000 URLs per sitemap improves crawl responsiveness and error detection.
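
To make the interaction between the two limits concrete, here is a hedged sketch of a splitter that closes a sitemap file as soon as either the URL count or the byte size would be exceeded. The name `chunk_entries`, the `overhead` allowance for the XML header and footer, and the assumption that each entry is an already serialized `<url>…</url>` block (including its hreflang, image, and video children) are all choices made for this example.

```python
from typing import Iterable, Iterator


def chunk_entries(
    entries: Iterable[bytes],
    max_urls: int = 50_000,
    max_bytes: int = 50 * 1024 * 1024,
    overhead: int = 1024,
) -> Iterator[list[bytes]]:
    """Group serialized <url> blocks into batches that fit both limits.

    `overhead` is an assumed allowance for the XML declaration and the
    opening/closing <urlset> tags of each file.
    """
    batch: list[bytes] = []
    size = overhead
    for entry in entries:
        # Start a new file if adding this entry would break either limit.
        if batch and (len(batch) >= max_urls or size + len(entry) > max_bytes):
            yield batch
            batch, size = [], overhead
        batch.append(entry)
        size += len(entry)
    if batch:
        yield batch


if __name__ == "__main__":
    # Seven dummy entries of 10 bytes each, with tiny limits for the demo.
    demo = [b"x" * 10] * 7
    print([len(b) for b in chunk_entries(demo, max_urls=3, overhead=0)])
    print([len(b) for b in chunk_entries(demo, max_bytes=25, overhead=0)])
```

Passing a lower `max_urls` (for example 25,000) is one way to apply the more conservative segmentation suggested in the warning above.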

Practical impact and recommendations

How to audit your sitemaps to check for real compliance?

First step: count only the <loc> tags, not the total number of XML elements. A basic Python script or a tool like Screaming Frog does this in 30 seconds. If you're under 50,000 <loc> tags, you're compliant on URL count, even if the file contains 200,000 lines in total.
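
The kind of basic script mentioned above might look like the following sketch. It counts only the `<loc>` elements that sit directly under `<url>` in the sitemap namespace, so an `<image:loc>` or an hreflang `xhtml:link` is not mistaken for a main URL. The function name `count_main_urls` and the sample document are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def count_main_urls(xml_bytes: bytes) -> dict:
    """Count main <loc> tags versus all XML elements in a sitemap."""
    root = ET.fromstring(xml_bytes)
    # Only <url>/<loc> in the sitemap namespace counts towards 50,000.
    main_locs = root.findall(f"{{{SITEMAP_NS}}}url/{{{SITEMAP_NS}}}loc")
    all_elements = sum(1 for _ in root.iter())
    return {"main_urls": len(main_locs), "all_elements": all_elements}


# Hypothetical sample: one page with two hreflang links and one image.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/en/</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/"/>
    <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/"/>
    <image:image><image:loc>https://example.com/a.jpg</image:loc></image:image>
  </url>
</urlset>"""

if __name__ == "__main__":
    print(count_main_urls(SAMPLE))
```

On the sample, the script reports a single main URL even though the file contains several more elements, which is exactly the gap that misleading tool counters tend to hide.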

Next, check the uncompressed file size. How much does your sitemap.xml weigh in plain text? If you approach 45-48 MB, it’s time to segment, even if you only have 10,000 URLs. The size limit is often reached before the URL count limit when attributes are multiplied.
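
A size check along these lines can be sketched as follows. It measures the uncompressed size even when the file is served gzipped, and flags files at or above 90% of the limit, which corresponds to the 45 MB mark mentioned above. The names `uncompressed_size` and `size_status`, the 90% warning ratio, and reading 50 MB as 52,428,800 bytes are assumptions for this example.

```python
import gzip

MAX_SITEMAP_BYTES = 50 * 1024 * 1024  # assumed reading of "50 MB": 52,428,800 bytes


def uncompressed_size(data: bytes) -> int:
    """Return the size in bytes after gunzipping, if the data is gzipped."""
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        data = gzip.decompress(data)
    return len(data)


def size_status(data: bytes, limit: int = MAX_SITEMAP_BYTES,
                warn_ratio: float = 0.9) -> str:
    """Classify a sitemap as 'ok', 'near' (time to segment), or 'over'."""
    size = uncompressed_size(data)
    if size > limit:
        return "over"
    if size >= warn_ratio * limit:
        return "near"
    return "ok"


if __name__ == "__main__":
    raw = b"<urlset></urlset>"
    print(uncompressed_size(gzip.compress(raw)), size_status(raw))
```

In practice you would read the bytes from disk or from an HTTP response body; the check itself does not depend on how the file is obtained.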

What errors should absolutely be avoided in restructuring?

Don’t create 50 sitemaps of 1,000 URLs each just because your tool suggests it. Google has no problem with a sitemap of 40,000 clean URLs as long as the size remains reasonable. Over-segmentation complicates maintenance without crawl benefits.

Another common pitfall: grouping URLs with very different update frequencies in the same sitemap. If you mix static content crawled once a month with product pages updated daily, you dilute the signal for Googlebot. Segment by business logic, not by arbitrary technical thresholds.

Should you reorganize your existing sitemaps right now?

Let’s be honest: if your current sitemaps are working and Google Search Console isn’t raising errors, there’s no need to overhaul everything tomorrow morning. This clarification mainly serves to avoid premature optimizations on ongoing projects.

However, if you’re launching a new multilingual site or migrating a complex architecture, integrate this logic from the design phase. Plan your sitemaps by content type (editorial pages/products/media) and by language if you exceed 10,000 total URLs.
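
As a rough sketch of planning sitemaps by content type and language, the snippet below groups page records into one file per (type, language) pair. The record fields `url`, `type`, and `lang`, the function name `plan_sitemaps`, and the file naming pattern are all hypothetical; a real site would derive them from its own CMS or URL structure.

```python
from collections import defaultdict


def plan_sitemaps(pages: list[dict]) -> dict[str, list[str]]:
    """Map a sitemap file name to the URLs it should contain."""
    plan: dict[str, list[str]] = defaultdict(list)
    for page in pages:
        # One file per content type and language, e.g. sitemap-products-fr.xml
        name = f"sitemap-{page['type']}-{page['lang']}.xml"
        plan[name].append(page["url"])
    return dict(plan)


if __name__ == "__main__":
    pages = [
        {"url": "https://example.com/en/blog/a", "type": "editorial", "lang": "en"},
        {"url": "https://example.com/fr/blog/a", "type": "editorial", "lang": "fr"},
        {"url": "https://example.com/en/p/1", "type": "products", "lang": "en"},
        {"url": "https://example.com/en/p/2", "type": "products", "lang": "en"},
    ]
    for name, urls in plan_sitemaps(pages).items():
        print(name, len(urls))
```

The resulting file names can then be listed in a sitemap index, with each file checked against the URL and size limits separately.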

  • Audit your sitemaps to count only the <loc> tags, not the total XML elements
  • Check the file size uncompressed: if you approach 45 MB, segment even under 50,000 URLs
  • Use a sitemap index to group files by content type or language
  • Don’t over-segment: a sitemap of 40,000 clean URLs is better than 40 sitemaps of 1,000 URLs
  • Segment by update frequency and business logic, not by arbitrary technical thresholds
  • Monitor errors in Google Search Console after any changes to sitemap structure
The 50,000 limit applies to main URLs, not to extended attributes. Audit your sitemaps by counting the actual <loc> tags and keep an eye on file size. Segment by business logic when necessary, and group via a sitemap index. These technical optimizations require a fine analysis of your architecture and crawl volumes. If the complexity exceeds your internal resources, consulting a specialized SEO agency will save you time and avoid costly visibility errors.

❓ Frequently Asked Questions

Is a sitemap with 30,000 URLs and 150,000 hreflang tags compliant?
Yes, as long as the uncompressed file size does not exceed 50 MB. Only the 30,000 <loc> tags count towards the 50,000 URL limit.
Should you create a separate sitemap for images and videos?
It is not mandatory, but it is recommended if you have a lot of media. It makes error tracking easier and can improve the crawl speed of visual content.
How many sitemaps can you declare in a sitemap index?
The limit is 50,000 sitemaps per index. In practice, even very large sites rarely exceed a few hundred sitemaps.
Does the 50 MB limit apply to the gzip-compressed file?
No, it applies to the uncompressed file. You can serve a gzipped sitemap that is much smaller in transit, as long as the XML stays under 50 MB once decompressed.
Does Google crawl a 10,000-URL sitemap differently from a 50,000-URL one?
Officially no, but in practice we sometimes observe better responsiveness with segmented sitemaps. Density and update frequency matter more than absolute size.