Official statement

Google combines hreflang annotations from HTML, sitemaps, and HTTP headers. If you have hreflangs in the HTML and others in the sitemap, Google will try to combine and add them together.
9:02
🎥 Source video

Extracted from a Google Search Central video

⏱ 56:22 💬 EN 📅 27/11/2020 ✂ 23 statements
TL;DR

Google automatically merges hreflang annotations from three distinct sources: the HTML of your pages, your XML sitemaps, and HTTP headers. This automatic combination may seem convenient, but it introduces the risk of conflicts and hard-to-diagnose errors. Essentially, if you have hreflangs scattered across multiple sources, Google will attempt to piece them together — with no guarantee of consistency.

What you need to understand

Why does Google combine hreflang instead of prioritizing a single source?

The answer lies in the engine's tolerant design. Google prefers to aggregate partial signals rather than ignore valid tags scattered across different systems. In theory, this lets complex sites use several methods without one overriding another.

In practice, this means that an hreflang in HTML can coexist with another in the sitemap, and Google will try to reconcile them. The engine will not choose one source over the other — it will attempt to merge all annotations it finds.

What are the three sources of hreflang recognized by Google?

The first source is the raw HTML of the page, via the <link rel="alternate" hreflang="x" href="..." /> tags in the <head>. This is the most common and visible method.
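
To make the HTML source concrete, here is a minimal sketch of how an audit script might collect these tags with Python's standard library. The class and function names (HreflangParser, extract_html_hreflang) and the example.com URLs are illustrative assumptions, and this is not a description of how Google parses pages.

```python
from html.parser import HTMLParser


class HreflangParser(HTMLParser):
    """Collects (hreflang, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.annotations = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also reach this method.
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "alternate" in rel and attr.get("hreflang") and attr.get("href"):
            self.annotations.append((attr["hreflang"].lower(), attr["href"]))


def extract_html_hreflang(html):
    """Return the hreflang annotations declared in an HTML document."""
    parser = HreflangParser()
    parser.feed(html)
    return parser.annotations


# Hypothetical page head used only for illustration.
sample = """
<head>
  <link rel="alternate" hreflang="fr" href="https://example.com/fr/" />
  <link rel="alternate" hreflang="de" href="https://example.com/de/" />
  <link rel="stylesheet" href="/style.css" />
</head>
"""
print(extract_html_hreflang(sample))
```

The stylesheet link is skipped because it carries no hreflang attribute and its rel value is not "alternate".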

The second is XML sitemaps, where you can declare the language variants of a URL with <xhtml:link> elements inside each <url> entry. This is useful for large sites that generate their sitemaps automatically.
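
As a rough illustration of the sitemap source, the sketch below reads the alternates declared for each URL with xml.etree.ElementTree. The function name extract_sitemap_hreflang and the sample sitemap are assumptions made for the example; a real sitemap may be split across several files or gzipped, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Namespaces used by sitemaps that carry hreflang annotations.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def extract_sitemap_hreflang(xml_text):
    """Map each <loc> URL to its list of (hreflang, href) alternates."""
    root = ET.fromstring(xml_text)
    result = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        result[loc] = [
            (link.get("hreflang", "").lower(), link.get("href", ""))
            for link in url.findall("xhtml:link", NS)
            if link.get("rel") == "alternate"
        ]
    return result


# Hypothetical sitemap fragment used only for illustration.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/fr/</loc>
    <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/"/>
    <xhtml:link rel="alternate" hreflang="de" href="https://example.com/de/"/>
  </url>
</urlset>"""
alternates = extract_sitemap_hreflang(sample)
print(alternates)
```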

The third, lesser-known source is the HTTP Link: header, used especially for non-HTML files such as PDFs. Google reads this header and interprets it as an hreflang signal just like the other two.
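
For the header source, a simplified sketch of parsing a Link: header value might look like this. The regular expressions cover only the common `<url>; rel="alternate"; hreflang="xx"` shape and are not a complete implementation of the Web Linking syntax; parse_link_header and the PDF URLs are hypothetical.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
PARAM_RE = re.compile(r'([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Return (hreflang, href) pairs found in a Link header value."""
    annotations = []
    for href, params_blob in LINK_RE.findall(value):
        params = {}
        for name, quoted, bare in PARAM_RE.findall(params_blob):
            params[name.lower()] = quoted or bare
        rel = params.get("rel", "").lower().split()
        if "alternate" in rel and "hreflang" in params:
            annotations.append((params["hreflang"].lower(), href))
    return annotations


# Hypothetical header value for a PDF available in two languages.
header = (
    '<https://example.com/guide-fr.pdf>; rel="alternate"; hreflang="fr", '
    '<https://example.com/guide-de.pdf>; rel="alternate"; hreflang="de"'
)
print(parse_link_header(header))
```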

What happens in case of inconsistency between sources?

This is where it gets murky. Mueller does not specify how Google arbitrates when two sources give contradictory indications, for example when your HTML points to a different German version than the one declared in the sitemap.

Google will attempt to merge the annotations intelligently, but there is no guarantee that the result will match your intent. In some cases, the engine may ignore a signal it deems inconsistent. In others, it might produce a broken hreflang configuration that no longer respects the reciprocity rule.

  • Google reads all three sources: HTML, XML sitemap, HTTP headers
  • It merges annotations rather than prioritizing a single source
  • No official hierarchy is documented among these three methods
  • Conflicts between sources can create hard-to-detect errors
  • Reciprocity remains mandatory: each URL must point to the others, regardless of the source used

SEO Expert opinion

Is this statement consistent with real-world observations?

Yes, and that's precisely the problem. We regularly observe sites where hreflangs are split between HTML and sitemaps, without anyone noticing. Google does not always report these duplicates in Search Console, leading to the belief that everything is functioning correctly.

In reality, however, this merging can create hreflang loops or asymmetric relationships. A classic example: an FR page points to DE in the HTML, but the sitemap adds a relationship to IT that the HTML ignores. The result? Google sees an incomplete chain and may reject the whole set.

What nuances should be added to this claim?

Mueller does not say how Google prioritizes in case of direct conflict. If your HTML declares hreflang="de" href="/de-v1" and your sitemap declares hreflang="de" href="/de-v2", which one takes precedence? [To verify] — Google does not document this mechanism.
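
Since the arbitration rule is undocumented, the safest course is to detect such conflicts yourself. The sketch below flags every language code that maps to more than one target URL across sources for a single page. It only reports the conflict and makes no attempt to predict which URL Google would keep. The function name find_conflicts and the input structure are assumptions for illustration.

```python
from collections import defaultdict


def find_conflicts(sources):
    """Flag hreflang codes that point to different URLs across sources.

    sources: {source_name: [(hreflang, href), ...]} for ONE page.
    Returns {hreflang: {href: [sources declaring it]}} for conflicting codes.
    """
    targets = defaultdict(lambda: defaultdict(set))
    for source, annotations in sources.items():
        for lang, href in annotations:
            targets[lang][href].add(source)
    return {
        lang: {href: sorted(srcs) for href, srcs in hrefs.items()}
        for lang, hrefs in targets.items()
        if len(hrefs) > 1  # more than one distinct target for the same code
    }


# The /de-v1 versus /de-v2 case from the text, on a hypothetical domain.
sources = {
    "html": [("fr", "https://example.com/fr/"), ("de", "https://example.com/de-v1")],
    "sitemap": [("fr", "https://example.com/fr/"), ("de", "https://example.com/de-v2")],
}
conflicts = find_conflicts(sources)
print(conflicts)
```

The "fr" code is not reported because both sources agree on its target.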

Moreover, this merging is presented as a tolerance advantage, but it encourages poor practices. Rather than centralizing hreflang management in a single reliable source, it invites dispersion which complicates debugging.

In what cases does this combination become a trap?

Let's take a concrete case: you are migrating your site and decide to move hreflangs from HTML to the sitemap to simplify maintenance. But you forget to remove the old HTML tags. Google will combine both, potentially creating relationships to dead or outdated URLs.

Another scenario: your CMS automatically injects hreflangs into HTML, while your dev team adds hreflangs into the sitemap without checking consistency. Google will merge both sets together, and you may end up with reciprocity errors that even Search Console may not always detect.

⚠️ Warning: Google's automatic merging does not exempt you from maintaining strict consistency across all your sources. Even a single inconsistent hreflang can corrupt the entire language cluster.

Practical impact and recommendations

What practical steps can be taken to avoid conflicts?

The best approach is to choose a single source and stick to it. If you opt for the sitemap, do not place any hreflangs in the HTML. If you use HTML, do not duplicate annotations in the sitemap. This simple rule eliminates 90% of the risk of conflict.

If you absolutely must use multiple sources — for example, HTML for pages and HTTP headers for PDFs — document this division and regularly audit for consistency. An automated verification script can compare the three sources and alert in case of divergence.
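
One possible shape for such a verification script is a divergence report: for a given page, list the annotations that some source declares and another source lacks. This is a sketch under the assumption that the annotations have already been extracted into (hreflang, href) pairs per source; divergence_report is a hypothetical name, and sources that are intentionally unused for a page should simply be left out of the input.

```python
def divergence_report(sources):
    """List, per source, the (hreflang, href) pairs it is missing.

    sources: {source_name: [(hreflang, href), ...]} for ONE page.
    A pair is "missing" from a source when at least one other source declares it.
    """
    sets = {name: set(pairs) for name, pairs in sources.items()}
    union = set().union(*sets.values())
    return {
        name: sorted(union - pairs)
        for name, pairs in sets.items()
        if union - pairs
    }


# Hypothetical FR page: the sitemap declares an IT variant the HTML omits.
sources = {
    "html": [
        ("fr", "https://example.com/fr/"),
        ("de", "https://example.com/de/"),
    ],
    "sitemap": [
        ("fr", "https://example.com/fr/"),
        ("de", "https://example.com/de/"),
        ("it", "https://example.com/it/"),
    ],
}
report = divergence_report(sources)
for source, missing in report.items():
    print(f"ALERT: {source} is missing {missing}")
```

An empty report means every source used for the page declares the same set of annotations.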

How do you check that your hreflang is interpreted correctly by Google?

Search Console remains the reference tool, but it does not always report merging errors. You may see indexed pages without any hreflang errors displayed, while behind the scenes Google has ignored some conflicting signals.

A reliable test involves simulating a localized search and checking which language variant appears. You can also use a crawler like Screaming Frog or OnCrawl to simultaneously extract hreflangs from HTML, sitemaps, and HTTP headers, then compare all three sets in a spreadsheet.

What critical mistakes must absolutely be avoided?

Never leave orphaned hreflangs — that is, annotations pointing to URLs without reciprocity. Google may simply ignore them, even if they come from different sources.
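
A reciprocity check can be sketched as follows, assuming you have already merged the annotations from all sources into one mapping per page. The structure {page_url: {hreflang: target_url}} and the name find_orphans are illustrative choices, not a standard format.

```python
def find_orphans(cluster):
    """Return (page, hreflang, target) triples where target does not link back.

    cluster: {page_url: {hreflang: target_url}} after merging all sources.
    """
    orphans = []
    for page, alternates in cluster.items():
        for lang, target in alternates.items():
            if target == page:
                continue  # self-reference, nothing to reciprocate
            back_links = cluster.get(target, {}).values()
            if page not in back_links:
                orphans.append((page, lang, target))
    return sorted(orphans)


FR = "https://example.com/fr/"
DE = "https://example.com/de/"
IT = "https://example.com/it/"

# Hypothetical cluster: FR declares DE and IT, DE links back, IT declares nothing.
cluster = {
    FR: {"fr": FR, "de": DE, "it": IT},
    DE: {"de": DE, "fr": FR},
}
orphans = find_orphans(cluster)
print(orphans)
```

Here the FR to IT annotation is reported because the IT page never points back to FR.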

Also avoid mixing URL formats: if your HTML uses absolute URLs with HTTPS and your sitemap uses relative URLs or HTTP, Google may not recognize them as the same pages. Strict normalization is essential.
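
Before comparing sources, it helps to bring every URL to one canonical form. The sketch below is one possible normalization for audit purposes, under assumptions you should adapt to your own site: HTTPS everywhere, a lowercase host, and a trailing slash on paths that do not look like files. The base URL and the normalize helper are hypothetical, and this does not claim to mirror Google's own canonicalization.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def normalize(url, base="https://example.com/"):
    """Return an absolute HTTPS URL with a consistent trailing slash."""
    absolute = urljoin(base, url)  # resolves relative URLs against the base
    parts = urlsplit(absolute)
    host = parts.netloc.lower()
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    # Assumption: segments containing a dot are files (e.g. guide.pdf).
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    # Assumption: the site serves everything over HTTPS; fragments are dropped.
    return urlunsplit(("https", host, path, parts.query, ""))


print(normalize("/de"))
print(normalize("http://Example.com/de/"))
print(normalize("https://example.com/guide.pdf"))
```

With this policy, "/de" and "http://Example.com/de/" normalize to the same string, so a comparison between HTML and sitemap annotations no longer reports them as different pages.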

  • Prioritize a single hreflang source (HTML, sitemap, or HTTP headers) for the entire site
  • If multiple sources are necessary, document the division and regularly audit for consistency
  • Verify full reciprocity among all language variants, regardless of the source
  • Use normalized absolute URLs (HTTPS protocol, consistent trailing slash) everywhere
  • Crawl the site to extract hreflang from all sources and compare them in a control table
  • Monitor Search Console, but do not rely on it blindly — manually test variants under real conditions

Google's automatic merging of hreflang is a technical tolerance, not an invitation to negligence. A solid hreflang strategy relies on a single, well-documented source, perfect reciprocity, and regular audits. These optimizations, especially for complex multilingual sites, can quickly become time-consuming and technical. If you manage several dozen language variants or an ecosystem of international subdomains, enlisting a specialized SEO agency can help secure your hreflang architecture and avoid costly visibility errors.

❓ Frequently Asked Questions

Does Google favor one hreflang source over the others in case of conflict?
Google does not document any official hierarchy among HTML, sitemaps, and HTTP headers. In case of conflict, the engine tries to merge the signals, but the result can be unpredictable and produce reciprocity errors.
Can you use the sitemap for some languages and HTML for others?
Technically yes, but it is risky. Google will combine both sources, which can create inconsistencies that are hard to detect. It is better to choose a single method for the whole site.
Are hreflang HTTP headers really taken into account for PDFs?
Yes. Non-HTML files cannot carry <link> tags, so the HTTP header is the usual way to signal their language variants. Google reads the Link: header and treats it like a regular hreflang, provided reciprocity is respected.
Does Search Console show merging errors between multiple hreflang sources?
Not always. Search Console detects classic reciprocity errors, but it does not systematically flag conflicts between HTML, sitemaps, and HTTP headers. A manual audit remains necessary.
What happens if you remove the hreflangs from the HTML but leave them in the sitemap?
Google will keep reading the hreflangs from the sitemap, provided they are consistent and reciprocal. This is a valid migration, but you need to check that the sitemap is actually crawled and that all URLs are listed in it.