Official statement
Google asserts that if a Japanese page is assigned a Portuguese canonical, the issue lies in your server configuration (misconfigured content negotiation) or in inconsistent hreflang/canonical tags. The engine does not confuse two distinct language versions on its own. In practice: check that your server does not return different language variants of the same URL based on the Accept-Language header, and that your canonical tags point to the correct version.
What you need to understand
What does "accept-language based content negotiation" really mean?
Some servers analyze the HTTP Accept-Language header sent by the browser or Googlebot to decide which language version to serve. If this mechanism is poorly configured, Googlebot receives either the Japanese version or the Portuguese version on the same URL. The crawler then registers conflicting signals.
The problem becomes critical when the canonical tag of the Japanese page points to the Portuguese URL (or vice versa) because the server dynamically serves different content based on the context. Google indexes what it sees, and if what it sees changes from one crawl to the next, the selected canonical can flip between language versions.
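To make the mechanism concrete, here is a minimal sketch contrasting the two behaviours. Everything in it is hypothetical (the `PAGES` store, the function names, the default language); it only illustrates why a header-driven response is unstable while a URL-driven one is not, and it deliberately ignores q-values in the header.

```python
# Hypothetical content store: (path, language) -> body. Illustrative only.
PAGES = {
    ("/page", "ja"): "[ja] body",
    ("/page", "pt"): "[pt] body",
    ("/ja/page", "ja"): "[ja] body",
    ("/pt/page", "pt"): "[pt] body",
}


def first_language(accept_language, supported=("ja", "pt"), default="pt"):
    """Return the first supported primary language listed in the header.

    Simplification: q-values are ignored, only the listing order is used.
    """
    for item in accept_language.split(","):
        tag = item.split(";")[0].strip().lower()
        primary = tag.split("-")[0]
        if primary in supported:
            return primary
    return default


def serve_negotiated(path, accept_language=""):
    """Anti-pattern: one URL whose body depends on the request header."""
    return PAGES[(path, first_language(accept_language))]


def serve_by_url(path):
    """Stable alternative: the language is read from the URL prefix only."""
    lang = path.strip("/").split("/")[0]
    return PAGES[(path, lang)]
```

With `serve_negotiated`, the single URL `/page` has two possible bodies, so whatever canonical the page declares is attached to content that is not fixed. With `serve_by_url`, each URL has exactly one body regardless of the request headers.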
Why shouldn't Google confuse two distinct languages?
Google treats translated content as fundamentally different. Two pages in two languages are not duplicates in the classical sense: they target distinct audiences and queries. In theory, the engine should never consolidate a Japanese page and a Portuguese page under a single canonical.
If this occurs nonetheless, it is because the technical signals sent to the crawler are inconsistent. Either the server returns the same URL with variable content, or the hreflang/canonical tags are poorly implemented, or both. Google does not guess: it follows what you declare explicitly.
What configuration errors lead to this bug?
The most common cases include Apache or Nginx servers configured to serve dynamic content based on Accept-Language without a 302 redirect, or CMS platforms that generate canonical tags pointing to a "default language" regardless of the displayed version.
Another classic pitfall: poorly defined reciprocal hreflang tags. If the Japanese page declares an hreflang to the Portuguese page but the Portuguese page does not link back, or if the canonical does not match the page's own URL, Google receives conflicting instructions and chooses arbitrarily.
- Poorly configured content negotiation: the server returns language variants on the same URL based on the HTTP Accept-Language header, without clear redirection.
- Inconsistent canonical tags: a Japanese page points its canonical to a Portuguese URL, or vice versa.
- Asymmetrical hreflang: hreflang annotations are not bidirectional, or point to URLs that do not mutually recognize each other.
- URLs without clear language markers: identical URL structures across versions (/page vs /page), making distinction impossible without content inspection.
- Non-transparent conditional redirects: 302 redirects based on Accept-Language that hide the true structure from the crawler.
SEO Expert opinion
Is this statement consistent with field observations?
Yes, but with a nuance: Google does not confuse two languages when technical signals are clean. However, on poorly configured sites, canonicals that flip between language versions are regularly observed. Mueller points to the server and the tags, and in the large majority of cases that diagnosis is correct.
The problem is that many CMS platforms generate these errors by default. Multilingual WordPress with WPML, Drupal with i18n, or poorly thought-out custom setups create asymmetrical hreflang or canonicals that consistently point to the "main" language. The SEO practitioner must audit these tags manually.
What nuances should be added to this rule?
Mueller does not mention a frequent edge case: almost identical content between regional variants. A page in Brazilian Portuguese and a page in European Portuguese with 95% common text may be treated as near-duplicates if the hreflang tags are not impeccable. Google then chooses a "dominant" canonical based on other signals (links, engagement, etc.).
Another point: the statement assumes that the content is actually distinct. If you serve poorly machine-translated Japanese with an HTML structure identical to the Portuguese version, Google might decide that one is a copy of the other, regardless of the displayed language. [To verify]: no public data specifies the similarity threshold at which Google switches from a "language variant" logic to a "duplicate" logic.
In which cases does this rule not apply?
If your site uses subdomains or distinct domains by language (e.g., jp.example.com vs pt.example.com), the content negotiation problem disappears. Each subdomain serves a single language, and accidental cross-language canonicals become far less likely, although a canonical tag can still point to another host if it is misconfigured. This is the most robust architecture to avoid this bug.
Conversely, if you use a single domain with URL parameters to switch the language (e.g., example.com/page?lang=ja), you are on treacherous ground. Google explicitly recommends avoiding this approach, as it makes hreflang fragile and canonicals ambiguous. In this case, Mueller's statement applies doubly.
Practical impact and recommendations
How can you check that your server configuration is not causing this bug?
Test manually with curl by modifying the Accept-Language header. If curl -H "Accept-Language: ja" https://example.com/page returns Japanese and curl -H "Accept-Language: pt" https://example.com/page returns Portuguese on the same URL, you have a problem: the content of that URL depends on the request, so Google may see different content from one crawl to the next.
Use Google Search Console to inspect the URL: check that the crawled version corresponds to the expected language. If the tool shows either Japanese or Portuguese for the same URL intermittently, your server is negotiating content in an opaque manner.
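The curl comparison can also be scripted. The following is a rough sketch, not a complete audit tool: the function names and the default language list are assumptions, and the comparison is a plain hash of the response body, so pages that embed per-request values (timestamps, CSRF tokens) will be reported as varying even when the language is stable. Because `urllib` follows redirects, a language-based redirect would also show up as a difference.

```python
import hashlib
import urllib.request


def fetch_body(url, accept_language):
    """Fetch url with the given Accept-Language header and return the body bytes."""
    request = urllib.request.Request(
        url, headers={"Accept-Language": accept_language}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def varies_by_language(url, languages=("ja", "pt"), fetch=fetch_body):
    """Return True if the same URL yields different bodies per Accept-Language.

    `fetch` is injectable so the check can be exercised without network access.
    """
    digests = {hashlib.sha256(fetch(url, lang)).hexdigest() for lang in languages}
    return len(digests) > 1
```

Calling `varies_by_language("https://example.com/page")` performs one request per language; a `True` result is a hint to look at the server configuration, not proof that Google has picked the wrong canonical.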
What errors should be avoided in hreflang and canonical tags?
Each page must point its canonical to itself (self-referencing canonical) and declare bidirectional hreflang. If /ja/page points to /pt/page in canonical, it's a fatal error. If /ja/page declares hreflang="pt" to /pt/page, but /pt/page does not declare hreflang="ja" to /ja/page, Google ignores the annotations.
Avoid hreflang with URL parameters or URLs that change based on context. Prefer stable URL structures (/fr/, /en/, /ja/) or distinct subdomains. Canonicals should point to absolute URLs, never relative, to avoid ambiguity.
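One way to avoid both mistakes is to generate the tags for every language version from a single mapping, so that the canonical is always the page's own URL and every version lists the same set of alternates. This is a sketch under that assumption; `head_tags` and the `versions` mapping are hypothetical names, not part of any CMS.

```python
from html import escape


def head_tags(current_lang, versions):
    """Build canonical and hreflang <link> tags for one language version.

    versions maps hreflang codes to absolute URLs, for example
    {"ja": "https://example.com/ja/page", "pt": "https://example.com/pt/page"}.
    """
    # Self-referencing canonical: always the URL of the version being rendered.
    tags = [
        '<link rel="canonical" href="%s">'
        % escape(versions[current_lang], quote=True)
    ]
    # Every version emits the full set of alternates, itself included,
    # which makes the annotations reciprocal by construction.
    for lang, url in sorted(versions.items()):
        tags.append(
            '<link rel="alternate" hreflang="%s" href="%s">'
            % (escape(lang, quote=True), escape(url, quote=True))
        )
    return "\n".join(tags)
```

Because the alternates are derived from one shared mapping, the Japanese and Portuguese pages differ only in their canonical line.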
What concrete steps should be taken to correct this problem?
Disable accept-language based content negotiation if it is in place. Instead, redirect users according to their language via client-side JavaScript, or always serve the same language on a given URL and let the user switch manually.
Audit all your canonical tags with a crawler (Screaming Frog, OnCrawl): each language version must point to its own URL. Check that hreflang annotations are symmetrical: each page cited in an hreflang must refer back to all other versions, including itself.
- Test URLs with curl and different Accept-Language headers to detect variable content on the same URL.
- Ensure that each page has a self-referencing canonical pointing to its own absolute URL.
- Audit hreflang tags to ensure they are bidirectional and complete (all language versions cited mutually).
- Disable server content negotiation if it generates dynamic content based on Accept-Language without explicit redirection.
- Inspect URLs in Google Search Console to confirm that the crawled version matches the expected language.
- Favor a clear URL architecture (/fr/, /en/, /ja/) or distinct subdomains by language.
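The canonical and hreflang checks in the list above can be approximated with a small script when a full crawler is not at hand. This is a minimal sketch using only the Python standard library; the class and function names are hypothetical, it expects the HTML of each page to be supplied already (URL to markup), and it compares URLs as exact strings, so trailing slashes or relative URLs would need normalising first.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the canonical URL and hreflang alternates from <link> tags."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "canonical" in rel:
            self.canonical = href
        elif "alternate" in rel and attr.get("hreflang"):
            self.alternates[attr["hreflang"].lower()] = href


def audit(pages):
    """pages maps absolute URL -> HTML. Return a sorted list of problems."""
    parsed = {}
    for url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        parsed[url] = collector

    problems = []
    for url, info in parsed.items():
        # Check 1: the canonical must be the page's own URL.
        if info.canonical != url:
            problems.append(
                "%s: canonical is %s, not self-referencing" % (url, info.canonical)
            )
        # Check 2: every hreflang target must link back to this page.
        for lang, target in info.alternates.items():
            if target == url or target not in parsed:
                continue  # self-reference, or a page outside the audited set
            if url not in parsed[target].alternates.values():
                problems.append(
                    "%s: hreflang %s -> %s has no return link" % (url, lang, target)
                )
    return sorted(problems)
```

An empty list means the audited pages are self-canonical and mutually linked; it says nothing about pages that were not passed in.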
❓ Frequently Asked Questions
Can Google really confuse two pages in completely different languages?
What is Accept-Language based content negotiation?
Are hreflang tags enough to avoid this problem?
How can I tell whether my site suffers from this bug?
Which URL architecture completely avoids this risk?
🎥 From the same video 41
Other SEO insights extracted from this same Google Search Central video · duration 59 min · published on 11/08/2020