Official statement

For identical subdomains in different languages, use hreflang tags to indicate the appropriate regional version to avoid duplicate content.
Source video: extracted from a Google Search Central video (duration 59:22, English, published 09/02/2017, 13 statements). This statement appears at 55:44.
Other statements from this video (12)
  1. 12:12 Do backlinks pointing to an AMP page really benefit the canonical HTML version?
  2. 17:46 Does footer text really hurt your site's rankings?
  3. 18:30 How long does it really take for a metadata change to affect your rankings?
  4. 21:11 Does Googlebot really index lazy-loaded images?
  5. 25:45 Do intrusive pop-ups really destroy your SEO?
  6. 27:25 Do hamburger menus really penalize the SEO value of your internal links?
  7. 29:20 Is the Data Highlighter still worth using compared with JSON-LD?
  8. 42:00 Why does Google rewrite your title tags and meta descriptions without asking?
  9. 46:00 Is hiding content on mobile really risk-free for SEO?
  10. 53:02 Is the 503 status code really SEO's friend when the server is overloaded?
  11. 54:20 Do 410 errors really hurt your site's rankings?
  12. 57:30 Why does splitting or merging domains slow down your SEO visibility?
TL;DR

Google states that identical subdomains in different languages require hreflang tags to avoid duplicate content. This position may seem contradictory: if the content is strictly identical, hreflang does not resolve anything. The real question is whether Google considers multilingual content as duplicate content or merely as regional variations that need to be correctly flagged.

What you need to understand

Why does Google mention duplicate content for language versions?

Mueller's statement creates a conceptual confusion. Normally, identical content across multiple URLs constitutes classic duplicate content. Google must then decide which version to index and serve.

However, when discussing differing language versions, the content is actually not identical: a text in French and its translation in English are two distinct pieces of content. The term "identical subdomains in different languages" is ambiguous. Mueller likely refers to cases where the structure and template are identical, but the textual content differs by language.

How does hreflang fit into this issue?

The hreflang tags serve to indicate to Google that a page exists in several linguistic or regional variants. They do not strictly resolve a duplicate content issue. They convey: "This page in French and this page in English are regional equivalents, not redundant duplicates."

Without hreflang, Google may indeed treat two nearly identical pages (same structure, same images, only the text changes) as duplicate content. It will then canonicalize one at the expense of the other, undermining your multilingual strategy. Hreflang prevents this unwanted canonicalization by explicitly documenting the relationship between versions.
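
As an illustration of what "documenting the relationship" looks like in markup, here is a minimal Python sketch that builds the link tags for a set of language variants. The function name and the example.com URLs are hypothetical and chosen only for this example.

```python
from html import escape


def hreflang_links(variants: dict[str, str]) -> list[str]:
    """Build one <link rel="alternate"> tag per language variant.

    `variants` maps an hreflang code (e.g. "fr", "en-gb") to an absolute URL.
    The same full list is meant to go in the <head> of every variant,
    so each page references all alternates, itself included.
    """
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in sorted(variants.items())
    ]


# Hypothetical subdomains, used only for illustration.
variants = {
    "fr": "https://fr.example.com/produit",
    "en": "https://en.example.com/product",
}
tags = hreflang_links(variants)
print("\n".join(tags))
```

Both the French and the English page would carry the same two tags, which is what makes the annotation reciprocal.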

What happens if the content is truly identical in multiple languages?

If you publish the exact same text on fr.example.com and en.example.com, hreflang will not save you. Google will detect the duplicate and choose a canonical version. Hreflang only works if the content actually differs by language.

Some multilingual sites fall into this trap: they duplicate English content across multiple subdomains (UK, US, AU) without any real adaptation. In this case, hreflang is not the solution: content must be differentiated or explicitly canonicalized to a single version.
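
A rough way to spot this trap in your own crawl data is to fingerprint the extracted text of each regional page and group exact matches. The sketch below is only a crude proxy and makes no claim about how Google detects duplicates; the function names and sample pages are hypothetical.

```python
import hashlib
import re


def content_fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_identical(pages: dict[str, str]) -> list[list[str]]:
    """Return groups of URLs whose normalized text is exactly the same."""
    groups: dict[str, list[str]] = {}
    for url, text in pages.items():
        groups.setdefault(content_fingerprint(text), []).append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical page texts: UK and US copies are identical, AU is adapted.
pages = {
    "https://uk.example.com/pricing": "Our plans start at 10 per month.",
    "https://us.example.com/pricing": "Our plans  start at 10 per month. ",
    "https://au.example.com/pricing": "Plans for Australian customers start at 15 per month.",
}
duplicates = group_identical(pages)
print(duplicates)
```

Any group returned here is a candidate for either real localization or canonicalization to a single version.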

  • Hreflang is not an anti-duplicate shield: it documents legitimate linguistic variants.
  • Structurally identical pages that are translated are not considered duplicate content.
  • If the content is truly identical in multiple languages, explicitly canonicalize to a reference version.
  • Google can arbitrarily canonicalize if hreflang is missing or poorly implemented.
  • The confusion arises from the term "identical": Mueller speaks of identical structure, not identical text.

SEO Expert opinion

Is this statement consistent with real-world observations?

Yes, but it lacks precision. In practice, Google appears to handle multilingual sites differently depending on the quality of their hreflang implementation. A site with well-configured hreflang generally sees all its language versions indexed and served correctly. Without hreflang, Google often picks a single version, usually the English one.

The issue is that Mueller uses the term "duplicate content" somewhat loosely. Technically, translated content is not duplicate. What hreflang resolves is the ambiguity for Google: without a clear signal, the search engine does not know if two similar pages are redundant duplicates or intentional regional variants.

When is hreflang not enough?

Hreflang does not compensate for a poorly designed multilingual architecture. If your URLs do not follow conventions (subdomains, subdirectories, or consistent parameters), hreflang will be ignored or misinterpreted. [To be verified] Google has never published statistics on hreflang error rates, but Search Console reports indicate that it is one of the most frequently broken annotations.

Another edge case: sites that serve nearly identical content across various regional versions (en-US, en-GB, en-AU) with only a few words modified. Hreflang will declare them as variants, but Google may still treat them as thin or duplicate content if differentiation is insufficient. Here, the real solution is to produce truly differentiated content for each market or to canonicalize to a single version.

What common mistakes invalidate hreflang?

The most frequent issue: non-reciprocal hreflang links. If the French page points to the English one in hreflang, but the English page does not point back to the French one, Google ignores the annotation. Another classic mistake: declaring URLs that return 404, 301, or do not themselves contain hreflang.
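
The reciprocity rule can be checked mechanically once you have, for each page, the set of URLs it declares in hreflang. The following is a minimal sketch under that assumption; the function name and the sample URLs are hypothetical.

```python
def missing_return_links(annotations: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (source, target) pairs where target does not link back to source.

    `annotations` maps each page URL to the set of URLs it declares in hreflang.
    A target that is absent from `annotations` counts as having no return link.
    """
    problems = []
    for source, targets in annotations.items():
        for target in targets:
            if target == source:
                continue  # a self-reference needs no return link
            if source not in annotations.get(target, set()):
                problems.append((source, target))
    return sorted(problems)


fr = "https://fr.example.com/"
en = "https://en.example.com/"
annotations = {
    fr: {fr, en},  # French page declares itself and the English page
    en: {en},      # English page forgets to point back to the French one
}
broken = missing_return_links(annotations)
print(broken)
```

Each pair in the result is an annotation that risks being ignored until the return link is added.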

We also see sites mixing methods (HTTP headers, XML sitemaps, HTML tags) without consistency. Google prefers a single, clean implementation. Finally, an incorrect language or region code ("en-uk" instead of "en-gb", since the ISO 3166-1 code for the United Kingdom is GB) invalidates the annotation. These errors turn hreflang into noise, and Google then reverts to its default behavior: arbitrary canonicalization.
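
Code errors such as "en-uk" are easy to catch before deployment. The sketch below validates a code against two deliberately tiny sample lists; a real check would load the complete ISO 639-1 and ISO 3166-1 alpha-2 lists, and this version does not handle script subtags such as "zh-Hant". The names are hypothetical.

```python
# Deliberately tiny samples for illustration; load the full ISO lists in practice.
SAMPLE_LANGUAGES = {"de", "en", "es", "fr"}
SAMPLE_REGIONS = {"au", "ca", "de", "es", "fr", "gb", "us"}


def is_valid_hreflang(code: str) -> bool:
    """Check a `language` or `language-region` code against the sample lists."""
    code = code.lower()  # assumption of this sketch: compare case-insensitively
    if code == "x-default":
        return True  # fallback value for users matching no listed language
    parts = code.split("-")
    if len(parts) == 1:
        return parts[0] in SAMPLE_LANGUAGES
    if len(parts) == 2:
        return parts[0] in SAMPLE_LANGUAGES and parts[1] in SAMPLE_REGIONS
    return False  # script subtags and longer forms are out of scope here


for candidate in ("en-gb", "en-uk", "fr", "x-default"):
    print(candidate, is_valid_hreflang(candidate))
```

A regular expression alone would not be enough here, because "uk" has the right shape but is not the assigned country code.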

Practical impact and recommendations

What practical steps should you take for a multilingual site?

Start by auditing your architecture: subdomains (fr.site.com), subdirectories (site.com/fr/), or distinct domains (site.fr)? Each option has implications for hreflang. Subdomains and subdirectories are the easiest to manage.

Then, implement hreflang consistently on all equivalent pages. Each page must point to all its linguistic variants, including itself (self-reference). Preferably use HTML tags in the <head>, or a dedicated XML sitemap if you have thousands of pages. Ensure that each URL declared in hreflang is accessible, indexable, and carries hreflang annotations itself.
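
For the sitemap route, each URL entry lists every variant, itself included, as an xhtml:link element. Here is a minimal sketch using Python's standard library; the function name and the example.com URLs are hypothetical, and a production sitemap would also need an XML declaration and to be written to a file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(variants: dict[str, str]) -> str:
    """Return a <urlset> where every variant URL lists all alternates."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in variants.values():
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        # Every entry repeats the full set, so the self-reference is included.
        for code, alternate in variants.items():
            ET.SubElement(
                entry,
                f"{{{XHTML_NS}}}link",
                rel="alternate",
                hreflang=code,
                href=alternate,
            )
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical subdomains, used only for illustration.
sitemap_xml = build_hreflang_sitemap({
    "fr": "https://fr.example.com/",
    "en": "https://en.example.com/",
})
print(sitemap_xml)
```

With two variants the output contains two url entries and four xhtml:link elements, which shows why a sitemap is easier to generate than to maintain by hand on large sites.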

How can you check if hreflang is functioning correctly?

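Google Search Console used to list hreflang errors in its International Targeting report, which Google has since retired, so do not expect to find them in the current indexing reports. Use the URL Inspection tool to see which canonical Google selected for a page, and rely on crawlers such as Screaming Frog or Sitebulb to track missing return links, orphan URLs, and invalid language codes and to validate the reciprocity of annotations.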
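
If you prefer to script a spot check rather than run a full crawler, the annotations can be read straight from a page's HTML. This is a minimal sketch with Python's built-in parser; the class name and sample markup are hypothetical, and fetching the page is left out.

```python
from html.parser import HTMLParser


class HreflangExtractor(HTMLParser):
    """Collect hreflang alternates and the canonical URL from <link> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = {}   # hreflang code -> href
        self.canonical = None  # href of rel="canonical", if any

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if "alternate" in rel and attributes.get("hreflang"):
            self.alternates[attributes["hreflang"].lower()] = attributes.get("href")
        elif "canonical" in rel:
            self.canonical = attributes.get("href")


# Hypothetical markup standing in for a fetched page.
sample_html = """<html><head>
<link rel="canonical" href="https://fr.example.com/" />
<link rel="alternate" hreflang="fr" href="https://fr.example.com/" />
<link rel="alternate" hreflang="en" href="https://en.example.com/" />
</head><body></body></html>"""

extractor = HreflangExtractor()
extractor.feed(sample_html)
print(extractor.alternates, extractor.canonical)
```

Running this over each variant gives the raw data needed to verify self-references and return links.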

Also, test manually: search for a specific term on Google.fr and Google.com. You should see the appropriate language version based on location and language of the engine. If you see the wrong version, it means hreflang is not being considered or Google has selected a different canonical.

What mistakes should you absolutely avoid?

Never point hreflang at pages whose canonical tag designates a different URL. If your French page has a canonical tag pointing to the English one, hreflang will be ignored because the two signals contradict each other. Canonical and hreflang are complementary, but canonical always takes precedence.
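
This contradiction is simple to detect once you know each page's canonical. The sketch below assumes you already have a mapping from URL to declared canonical; the function name and URLs are hypothetical, and a page missing from the mapping is treated as self-canonical.

```python
def targets_with_foreign_canonical(
    canonicals: dict[str, str], hreflang_targets: set[str]
) -> list[str]:
    """Return hreflang target URLs whose canonical points to another URL."""
    return sorted(
        url for url in hreflang_targets
        if canonicals.get(url, url) != url  # missing entry: assume self-canonical
    )


fr = "https://fr.example.com/"
en = "https://en.example.com/"
canonicals = {
    fr: en,  # the French page canonicalizes to the English one: a conflict
    en: en,
}
conflicts = targets_with_foreign_canonical(canonicals, {fr, en})
print(conflicts)
```

Every URL in the result should either get a self-referencing canonical or be dropped from the hreflang set.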

Avoid duplicating identical content across multiple language variants hoping that hreflang will mask the issue. Google will detect the duplicate and will canonicalize anyway. Hreflang is not a permission to duplicate. Finally, do not mix levels: if you are using subdomains for languages, don’t switch some languages to subdirectories. Architectural consistency is essential.

  • Audit the multilingual architecture (subdomains, subdirectories, distinct domains).
  • Implement hreflang on all equivalent pages with strict reciprocity.
  • Validate language codes (ISO 639-1 for language, ISO 3166-1 Alpha 2 for country).
  • Check Search Console for unintended canonicalizations and a crawler report for hreflang errors.
  • Test manually on different local versions of Google.
  • Never mix canonical and hreflang in contradictory ways.

Hreflang is essential for a multilingual site, but the effort its implementation requires is often underestimated. Incorrect language codes, missing reciprocity, and conflicts with canonical tags are the technical errors that undermine international SEO. If your site targets multiple markets and you notice unwanted canonicalization or a poor distribution of traffic by language, a thorough hreflang audit is necessary. These multilingual configurations can quickly become complex, especially at scale or with a poorly configured CMS. Engaging an agency specialized in international SEO can help structure the architecture correctly, avoid technical pitfalls, and maximize visibility in each target market.

❓ Frequently Asked Questions

Does hreflang really prevent duplicate content?
No, hreflang does not block duplicate content. It tells Google that similar pages are intentional language variants, which avoids arbitrary canonicalization. If the content is truly identical, Google may still treat it as duplicate.
Can hreflang be used only in the XML sitemap?
Yes, it is a valid method and often simpler for large sites. But Google prefers HTML tags in the <head> when possible. Do not mix the two methods: pick one and stick to it.
What happens if hreflang and canonical contradict each other?
Canonical always wins. If a French page has a canonical pointing to the English one, Google will ignore the hreflang pointing to the French page. The two tags must be consistent.
Is an hreflang annotation needed for every regional variant of the same language?
Yes, if the content really differs between en-US, en-GB, and en-AU. If the content is strictly identical, it is better to canonicalize to a single version and not create artificial variants.
How does Google choose which version to show if hreflang is missing?
Without hreflang, Google relies on indirect signals: server geolocation, ccTLD, the language declared in the HTML, and user signals. But these criteria are less reliable and often lead to unwanted canonicalization.