Official statement
Google states that URLs in non-Latin scripts, even when percent-encoded and visually lengthy, do not negatively impact SEO as long as they remain understandable. This statement frees multilingual sites from the need to transliterate or artificially shorten their URLs. The real question is what exactly Google means by 'understandable' and whether this neutrality also applies to excessively long URLs in all situations.
What you need to understand
What does an 'encoded' URL actually mean?
When a URL contains non-ASCII characters (Cyrillic, Chinese, Arabic, Japanese, accented letters), browsers automatically percent-encode them: each UTF-8 byte is written as a '%' followed by two hexadecimal digits. A simple 'é' becomes %C3%A9, and a Russian word like 'страница' (8 letters, 2 bytes each) turns into 16 encoded bytes, or 48 characters.
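This technical transformation is largely invisible to the user typing the URL, since most modern browsers show the decoded characters in the address bar. The encoded form appears mainly when the URL is copied into another tool, shared, or written to a log. Google claims that this apparent length has no algorithmic impact.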
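As a rough illustration of the encoding described above, here is a minimal Python sketch using the standard library's `urllib.parse.quote`. The helper name `percent_encode` is hypothetical, and the sketch only covers a single path segment encoded as UTF-8.

```python
from urllib.parse import quote, unquote


def percent_encode(segment: str) -> str:
    """Percent-encode one path segment as UTF-8 bytes (hypothetical helper)."""
    # safe="" means every reserved or non-ASCII character gets encoded.
    return quote(segment, safe="")


# "\u00e9" is the precomposed letter 'é'.
encoded_e = percent_encode("\u00e9")
encoded_word = percent_encode("страница")

print(encoded_e)
print(encoded_word)
print(len("страница"), "letters become", len(encoded_word), "characters")

# Decoding restores the original word, so no information is lost.
print(unquote(encoded_word) == "страница")
```

Each Cyrillic letter occupies two bytes in UTF-8 and each byte is written as three characters, which is why the encoded form is six times longer than the word a reader sees.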
Why is this clarification happening now?
For years, the SEO doctrine has advocated for short and readable URLs in Latin characters. This recommendation was relevant when search engines had technical indexing limits. Multilingual sites systematically transliterated their URLs to avoid encoding.
However, this practice creates a user experience problem. A Japanese user seeing '/seihin/denki' instead of '/製品/電気' loses immediate context. Google now recognizes that native readability takes precedence over technical length.
What does Google mean by 'understandable'?
This is the weak point of the statement, because Google gives no precise definition. A working interpretation: a URL is understandable if a human can identify the subject of the page just by reading it, without guessing or relying on the rest of the site for context.
Acceptable example: example.com/категория/обувь (category/shoes in Russian). Problematic example: example.com/cat/prod/item12345/var789. The second is short but opaque, while the first is long when encoded but semantically clear.
- URLs in native languages do not harm SEO, even if encoding makes them visually lengthy
- Systematic transliteration is no longer an SEO requirement and can even degrade UX
- The criterion of 'understandability' remains subjective and depends on the linguistic context of the audience
- Google treats percent encoding as a transparent transformation with no algorithmic impact
- Multilingual sites can now prioritize native readability over technical length
SEO Expert opinion
Is this statement consistent with field observations?
Yes and no. Field tests confirm that URLs in non-Latin characters index correctly and perform well in local SERPs. A Russian site with Cyrillic URLs does not suffer visible penalties on Yandex or Google.ru.
But the reality is more nuanced. Very long URLs (beyond 120-150 real characters, not encoded) pose practical problems: truncation in social shares, difficult-to-analyze server logs, risks of 414 errors on some legacy servers. While Google may not penalize, the technical ecosystem isn’t always compatible.
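To make the difference between real and encoded length concrete, the following Python sketch reports both for a URL path. The function name and the `DECODED_WARN_CHARS` constant are assumptions for illustration; 150 is simply the upper end of the range quoted above, not a limit published by Google or by any server vendor.

```python
from urllib.parse import quote

# Assumption: upper end of the 120-150 "real character" range cited in the text.
DECODED_WARN_CHARS = 150


def path_length_report(path: str) -> dict:
    """Compare the length a reader sees with the length sent over the wire."""
    encoded = quote(path, safe="/")  # keep '/' so segments stay separated
    return {
        "decoded_chars": len(path),
        "encoded_chars": len(encoded),
        "over_decoded_threshold": len(path) > DECODED_WARN_CHARS,
    }


report = path_length_report("/категория/обувь")
print(report)
```

The encoded figure is the one that matters for server limits and 414 errors, while the decoded figure is closer to what a user reads, so it can be useful to track both in an audit.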
What are the unspoken limits of this statement?
Google remains vague about the exact definition of 'understandable'. Is a 12-character compound word in German understandable? A Japanese URL mixing kanji and hiragana? The vagueness leaves room for algorithmic interpretation.
Another point: this statement concerns URLs 'in different languages', not URLs packed with parameters. A 300-character URL with 15 query parameters remains problematic, even if it is technically not 'penalized'. The distinction between semantic length and technical length is not clarified. [To verify]: Does Google apply the same treatment to long URLs in Latin characters as it does to encoded ones?
In what cases does this recommendation not apply?
If your audience is multilingual and geographically dispersed, a URL in native characters might create memorization and cross-cultural sharing challenges. A Cyrillic link shared on an English-speaking forum becomes opaque.
E-commerce sites that share one catalog across several countries may prefer to keep URLs in Latin characters, because a single URL scheme is easier to maintain and to track in analytics. Google's statement does not consider these operational trade-offs.
Practical impact and recommendations
What should you do concretely on a multilingual site?
Prioritize native language URLs if your audience is linguistically homogeneous. A Japanese site for Japanese users benefits from using URLs in kanji/hiragana. Immediate readability takes precedence over encoded length.
If you manage a site with multiple language versions, structure your URLs by subdomain or subdirectory with hreflang, allowing each version to use its natural alphabet. Example: fr.site.com/chaussures, ru.site.com/обувь, ja.site.com/靴. There’s no need to standardize everything in Latin.
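As a sketch of how the three example versions could reference each other, the Python snippet below builds hreflang `<link>` tags. The `https` scheme, the `VERSIONS` mapping, and the `hreflang_tags` helper are assumptions for illustration; the hosts and paths are the ones from the example above. The paths are percent-encoded here, which is one reasonable choice rather than a stated requirement.

```python
from urllib.parse import quote, unquote

# Language code -> (host, native-script path), taken from the example above.
VERSIONS = {
    "fr": ("fr.site.com", "/chaussures"),
    "ru": ("ru.site.com", "/обувь"),
    "ja": ("ja.site.com", "/靴"),
}


def hreflang_tags(versions: dict) -> list:
    """Build one <link rel="alternate"> tag per language version."""
    tags = []
    for lang, (host, path) in versions.items():
        # Assumption: the site is served over https.
        href = f"https://{host}{quote(path, safe='/')}"
        tags.append(f'<link rel="alternate" hreflang="{lang}" href="{href}" />')
    return tags


tags = hreflang_tags(VERSIONS)
for tag in tags:
    print(tag)
```

The same full set of tags would normally appear on every language version, including a tag that points to the page itself.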
What mistakes should you avoid in implementation?
Do not confuse readability with brevity. A short but cryptic URL ('/p/12345') is worse than a long but explicit URL ('/category/sub-category/detailed-product'). Google values semantics, not blind conciseness.
Avoid chaotic mixed URLs. If you use native characters, remain consistent throughout the structure. A mix of '/category/товары/product' can confuse users and complicate internal linking. Choose a reference language for the overall architecture.
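One way to spot mixed paths like the one above is to look at which writing systems appear in a URL. The Python sketch below is a rough heuristic, not an established tool: it takes the first word of each letter's Unicode name (for example LATIN or CYRILLIC) as its script, and the function names are hypothetical.

```python
import unicodedata


def path_scripts(path: str) -> set:
    """Return the set of scripts used by the letters in a decoded URL path."""
    scripts = set()
    for ch in path:
        if not ch.isalpha():
            continue  # skip '/', '-', digits and other separators
        name = unicodedata.name(ch, "")
        # Heuristic: the first word of the Unicode name identifies the script.
        scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    return scripts


def is_mixed(path: str) -> bool:
    """True when a path combines letters from more than one script."""
    return len(path_scripts(path)) > 1


for candidate in ["/category/товары/product", "/категория/обувь"]:
    print(candidate, sorted(path_scripts(candidate)), is_mixed(candidate))
```

This heuristic would also flag a Japanese path that combines kanji and hiragana, which is normal Japanese writing, so a real audit would need to treat such combinations as a single group.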
How can you check that your URLs are correctly indexed?
Use Search Console to audit encoded URLs. Ensure they appear correctly in the coverage report and that none are blocked by a 414 error or a normalization issue. Google displays URLs in their decoded form in the console.
Test social sharing: copy an encoded URL and paste it into different tools (Slack, email, Twitter). Some clients truncate or break the encoding. If your audience shares your content widely, this test is critical.
- Audit your current URLs: are they artificially transliterated when your audience speaks a non-Latin language?
- Test encoding on a pilot page before migrating the entire architecture
- Check compatibility with your CMS: some older systems poorly handle non-ASCII characters
- Configure hreflang correctly if you use URLs in different languages
- Monitor server logs for potential 414 errors (URI too long)
- Document your URL convention for multilingual editorial teams
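The log monitoring item in the checklist above could be sketched as follows in Python. This assumes access logs in the common log format; the regular expression, the `find_414` helper, and the two sample lines are invented for illustration and would need adapting to a real server's log layout.

```python
import re
from urllib.parse import quote, unquote

# Assumption: common log format, e.g. ... "GET /path HTTP/1.1" 414 0
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) [^"]*" (?P<status>\d{3}) ')


def find_414(lines) -> list:
    """Return the decoded request targets that were answered with HTTP 414."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "414":
            hits.append(unquote(match.group("target")))
    return hits


# Invented sample lines; the path is encoded the way a browser would send it.
encoded_path = quote("/обувь", safe="/")
sample = [
    f'203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET {encoded_path} HTTP/1.1" 200 512',
    f'203.0.113.5 - - [01/Jan/2024:10:00:01 +0000] "GET {encoded_path} HTTP/1.1" 414 0',
]

hits = find_414(sample)
print(hits)
```

Decoding the request target before reporting makes the affected pages recognizable to editorial teams who know the URLs in their native script.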
❓ Frequently Asked Questions
Does a URL in Cyrillic or Japanese characters penalize my site?
Should I transliterate my multilingual URLs into Latin characters?
What is the maximum acceptable length for a URL?
Does percent-encoding (%E2%80%93) affect crawling?
Do short URLs remain an SEO advantage?
Source: Google Search Central video · duration 59 min · published on 23/08/2017