Does Googlebot really ignore your multilingual site's accept-language header?


Official statement

Googlebot almost never crawls with a defined accept-language header, or sometimes uses 'en' (English). If a site serves different content based on the user's accept-language header, Google will only see the English version (or the default version without a language). It is better to display a banner offering a language change rather than automatically switching content.

Source: extracted from a Google Search Central video (duration 59:11, in English, published 11/08/2020, 42 statements). This statement appears at 54:21.

Official statement from August 11, 2020

⚠ A more recent statement exists on this topic: "Should you configure the Content-Language header for PDFs and non-HTML files?" (John Mueller, April 25, 2024).

TL;DR

Googlebot almost always crawls without a defined accept-language header, or occasionally uses 'en' as the default. If your site automatically switches content based on this header, Google will only see the English version or the default version—never your other language variants. The official recommendation: manual language selection banner rather than automatic redirection.

What you need to understand

Why does Googlebot crawl without an accept-language header?

Googlebot does not behave like a standard browser. Unlike a user whose browser systematically sends an accept-language header reflecting their language preferences, Google’s bot mostly crawls without this HTTP header.

When the header is sent at all, its value is 'en' (English). The practical effect is that Googlebot sees a stable and predictable version of your pages, provided the server does not vary content by language preference.

What is the risk for a site that detects language via accept-language?

If your server detects the accept-language header and automatically serves different content based on language, you create a major indexing issue. Google will only see the version served by default (often English, sometimes the default language of your server configuration).

The other language versions remain invisible to crawling. Your French, Spanish, or Japanese site may technically exist, but Googlebot will never access it if access depends on a header it doesn’t send. This creates a complete blind spot in indexing.

How does Google actually distinguish language versions?

Google relies on explicit signals: distinct URLs by language (subdomains, subdirectories, URL parameters), hreflang tags declared in HTML or the sitemap, and visible content on the page.

The accept-language header never enters the equation. It is a request header set by the client, and since Googlebot rarely sends it, Google does not rely on it. It prioritizes structural signals that the webmaster can control.

Googlebot almost never sends an accept-language header: it crawls “neutrally”
Switching content via this header makes your language versions invisible to indexing
Distinct URLs + hreflang are the only reliable signals for multilingualism
A manual language selection banner ensures that all versions remain accessible to crawl
The content served to Googlebot must be identical to what is served to a user without language preferences

SEO Expert opinion

Is this statement consistent with observed practices in the field?

Yes, and it serves as a brutal reminder for those who persist in believing that Google “understands everything.” In audits of multilingual sites, we regularly observe cases where only the English version (or the default server version) appears in the index. The rest? Never crawled, never indexed.

The confusion often arises from a misunderstanding of how HTTP works. Many frameworks (especially those focused on “user experience”) automatically detect accept-language and redirect the user. Convenient for UX, catastrophic for SEO if no distinct URL structure exists in parallel.

What nuances should be added to this official guideline?

Mueller talks about “almost never.” This “almost” leaves some margin for uncertainty—Googlebot can send 'en' in certain contexts. In practical terms? If your site switches from French to English as soon as it detects 'en', you lose the French version even in this minority scenario.

Another nuance: this rule applies to initial crawling and indexing. For JavaScript rendering or certain specific tests (Search Console, Mobile-Friendly Test), Google can simulate different contexts, but this is not the standard behavior of Googlebot in its classic indexing pipeline. [To be verified]: no official data on the exact frequency of sending the 'en' header.

In what cases could this rule still pose a problem?

If you use a CDN or reverse proxy that detects accept-language upstream of your server and serves differentiated cached versions, you are exposed. Even if you have distinct URLs, if the CDN overrides language detection, Googlebot may be stuck on a single variant.

Classic case: Cloudflare Workers or Lambda@Edge with poorly configured language routing rules. The bot arrives at /fr/, but the worker detects the absence of the accept-language header (or the value 'en') and serves the English version on the French URL. Result: duplicate content, chaotic indexing, loss of local relevance.
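The safer routing rule can be sketched in a few lines. This is a minimal Python illustration of the logic, not actual Cloudflare Workers or Lambda@Edge code. The function name `language_for_request`, the set of language directories, and the default language are all assumptions made for the example. The point it shows: the language comes from the URL path alone, and the request headers are never consulted.

```python
from typing import Mapping
from urllib.parse import urlsplit

SUPPORTED = {"en", "fr", "es"}  # hypothetical language directories
DEFAULT = "en"                   # hypothetical language for unprefixed URLs


def language_for_request(url: str, headers: Mapping[str, str]) -> str:
    """Resolve the content language from the first path segment only.

    `headers` is part of the signature to mirror a real request handler,
    but it is deliberately ignored: Accept-Language must not change
    which language a given URL returns.
    """
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if segments and segments[0].lower() in SUPPORTED:
        return segments[0].lower()
    return DEFAULT


# The French URL stays French whether the header is missing or set to 'en'.
print(language_for_request("https://example.com/fr/product", {}))
print(language_for_request("https://example.com/fr/product", {"Accept-Language": "en"}))
```

With this rule, a crawler that sends no header and a visitor whose browser sends 'en' both receive French content on /fr/, which also keeps any cache keyed on the URL consistent.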

Warning: Headless content management systems (Contentful, Strapi, etc.) coupled with JavaScript frameworks (Next.js, Nuxt) can enable language detection via accept-language by default, depending on the i18n configuration. Make sure that server-side rendering (SSR) or static generation (SSG) does not depend on this header to serve the final content.

Practical impact and recommendations

What should be done concretely for a multilingual site?

First step: distinct URLs by language. Subdomains (fr.example.com), subdirectories (example.com/fr/), or URL parameters (example.com?lang=fr)—the structure doesn’t matter, as long as it is stable and crawlable without a specific HTTP header.

Second step: correctly implement hreflang tags. Each page must declare its linguistic and regional variants, including self-references. Googlebot relies on these tags to understand the relationships between versions, independently of any HTTP header.
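As an illustration of the self-referencing requirement, here is a minimal Python sketch that builds the hreflang link tags for one page. The helper name `hreflang_links` and the example URLs are hypothetical. The optional `x_default` argument emits an `x-default` entry, which is a standard hreflang value but is not mentioned in the statement, so treat it as an optional extra.

```python
from html import escape
from typing import Dict, List, Optional


def hreflang_links(variants: Dict[str, str], x_default: Optional[str] = None) -> List[str]:
    """Return <link rel="alternate"> tags for every variant of one page.

    The same full list goes into the <head> of each variant, so every
    page references its siblings and itself.
    """
    tags = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in variants.items()
    ]
    if x_default is not None:
        tags.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(x_default)}" />'
        )
    return tags


# Hypothetical variants of a single product page.
product_variants = {
    "en": "https://example.com/en/product",
    "fr": "https://example.com/fr/product",
}
for tag in hreflang_links(product_variants):
    print(tag)
```

Generating the list once and reusing it on every variant avoids the common mistake of omitting the tag that points back to the current page.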

What mistakes should be absolutely avoided?

Never serve different content on the same URL based on accept-language. This is the classic trap of “smart” frameworks that detect the visitor's language. For Googlebot, the URL example.com/product must always return exactly the same content, regardless of the header.

Avoid 302 redirects based on accept-language as well. Googlebot will follow the redirect but will index the target (often the English version), leaving your other language versions orphaned. If you must redirect users, do it using client-side JavaScript after the initial HTML load—this way, Googlebot always sees the canonical version.
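The banner approach from the official statement can be sketched as follows. This is a hedged Python illustration, not a reference implementation: the names `preferred_language` and `banner_suggestion` and the language table are assumptions. The page language is fixed by the URL, and the header is used only to decide whether to suggest another version, never to swap the content or redirect.

```python
from typing import Optional

# Hypothetical languages the site publishes, with their display names.
SUPPORTED = {"en": "English", "fr": "Français", "es": "Español"}


def preferred_language(accept_language: Optional[str]) -> Optional[str]:
    """Return the highest-weighted supported language in the header, if any."""
    if not accept_language:
        return None  # no header at all, which is Googlebot's usual case
    weighted = []
    for part in accept_language.split(","):
        tag, _, params = part.strip().partition(";")
        params = params.strip()
        try:
            weight = float(params[2:]) if params.startswith("q=") else 1.0
        except ValueError:
            weight = 0.0
        # Keep only the primary subtag: "fr-CA" becomes "fr".
        weighted.append((weight, tag.strip().lower().split("-")[0]))
    # sorted() is stable, so equal weights keep the order of the header.
    for weight, lang in sorted(weighted, key=lambda w: w[0], reverse=True):
        if weight > 0 and lang in SUPPORTED:
            return lang
    return None


def banner_suggestion(page_lang: str, accept_language: Optional[str]) -> Optional[str]:
    """Return a language code to offer in a banner, or None for no banner."""
    preferred = preferred_language(accept_language)
    if preferred is None or preferred == page_lang:
        return None
    return preferred
```

For a request to a French page, `banner_suggestion("fr", None)` returns `None`, so a crawler sending no header gets the plain French page, while `banner_suggestion("fr", "en")` returns `"en"`, meaning the French page is still served with an added offer to switch to English.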

How can I check that my site complies with this logic?

Test your URLs with curl by explicitly removing the accept-language header: curl -H "Accept-Language:" https://example.com/fr/. The returned content should be identical to what is visible in a browser configured in French.
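The same check can be automated across several header values. The following is a minimal Python sketch using only the standard library. The function names, the tested header values, and the example URL are assumptions. It compares a hash of the response body with no header, with 'en', and with 'fr'.

```python
import hashlib
import urllib.request
from typing import Callable, Dict, Optional, Sequence

Fetcher = Callable[[str, Optional[str]], bytes]


def fetch_body(url: str, accept_language: Optional[str]) -> bytes:
    """Fetch a URL, sending Accept-Language only when a value is given."""
    request = urllib.request.Request(url)
    if accept_language is not None:
        request.add_header("Accept-Language", accept_language)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def digests_by_header(
    url: str,
    values: Sequence[Optional[str]] = (None, "en", "fr"),
    fetch: Fetcher = fetch_body,
) -> Dict[Optional[str], str]:
    """Map each tested header value to a SHA-256 digest of the body."""
    return {value: hashlib.sha256(fetch(url, value)).hexdigest() for value in values}


def is_header_independent(
    url: str,
    values: Sequence[Optional[str]] = (None, "en", "fr"),
    fetch: Fetcher = fetch_body,
) -> bool:
    """True when every tested header value yields byte-identical content."""
    return len(set(digests_by_header(url, values, fetch).values())) == 1
```

Calling `is_header_independent("https://example.com/fr/")` performs three live requests. A byte-for-byte comparison is strict: pages that embed per-request values such as timestamps or CSRF tokens will differ even when the language is stable, so in that case compare the extracted text or the `lang` attribute instead.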

Also use the Page indexing report in Google Search Console (formerly the Coverage report) to check that all your language versions appear as indexed. If only the English version shows up, you likely have a server-side detection problem based on accept-language. Inspect the URL with the URL Inspection tool: the returned HTML should match the expected language, without depending on a header that Googlebot does not send.

Structure the site with distinct URLs by language (subdomain, subdirectory, or parameter)
Implement hreflang tags on all pages, including self-references
Never serve different content on the same URL based on accept-language
Offer a manual language selection banner instead of automatic redirection
Test server rendering with curl without the accept-language header to validate content stability
Check in Search Console that all language versions are crawled and indexed

Managing a multilingual site in accordance with Google’s expectations relies on an explicit URL architecture and clear technical signals (hreflang). Ignoring the accept-language header in your content delivery logic is the only guarantee that Googlebot will access all your variants. These configurations, especially on complex infrastructures (CDN, SSR, headless CMS), can quickly become technical. If you lack internal resources or want to secure a multilingual migration without the risk of partial indexing, consulting a specialized SEO agency will help you avoid costly mistakes and accelerate compliance.

❓ Frequently Asked Questions

Does Googlebot ever send an accept-language header with a value other than 'en'?

No. According to Mueller, Googlebot almost never sends this header, and when it does, the value is 'en' (English). No other language is used in this context.

My site redirects automatically based on accept-language. Will Google still see all my languages?

No. If the redirect depends on the accept-language header, Googlebot will only see the default or English version. The other language versions will remain invisible to crawling.

Can you detect the user's language with client-side JavaScript without affecting indexing?

Yes, as long as the initial HTML returned by the server remains stable and does not depend on the accept-language header. The redirect or the banner display can happen after loading, on the client side.

Are hreflang tags enough to compensate for server-side detection based on accept-language?

No. Hreflang describes the relationships between versions, but if Googlebot cannot crawl a version because the server hides it, hreflang is useless. The URLs must be accessible regardless of the header.

Does this rule also apply to single-page applications (SPAs) with JavaScript routing?

Yes, if the server returns different initial HTML depending on accept-language. For an SPA, make sure the base HTML (the shell) is identical for all visitors regardless of their language, and that the language-specific content then loads via JavaScript or stable SSR.
