Official statement
Google treats '/' and '/index.html' as two distinct URLs if no redirect or canonical is configured. One will be chosen as the canonical version, but without your control, you risk signal cannibalization and authority dilution. Let's be honest: it's a classic trap that mostly affects static sites and poorly configured old CMSs.
What you need to understand
Does Google really consider / and /index.html as two different pages?
Yes, unequivocally. When Google crawls your domain and finds both https://example.com/ and https://example.com/index.html accessible without redirection or a canonical tag, it indexes them as two distinct entities. This might seem absurd — after all, they serve the same content — but for Googlebot, these are two different URLs.
The engine will then choose one as the canonical version based on its criteria: internal links, backlinks, crawl history. And that’s where the problem lies. If you’ve never specified your preference, Google decides for you. Sometimes it prefers /, sometimes /index.html — and this decision may vary over time if the signals change.
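To make the distinction concrete, here is a minimal Python sketch of why the two addresses count as different URLs until something collapses them. The helper name `collapse_index` and the list of default-document names are assumptions made for illustration; this is not how Google canonicalizes internally.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed list of default-document names; adjust to your own server setup.
INDEX_NAMES = ("index.html", "index.htm")


def collapse_index(url: str) -> str:
    """Return the URL with a trailing default-document name removed."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last.lower() in INDEX_NAMES:
        path = head + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


a = "https://example.com/"
b = "https://example.com/index.html"
print(a == b)                                   # compared as raw strings
print(collapse_index(a) == collapse_index(b))   # compared after collapsing
```

A crawler has no reason to apply such a rule on its own, because a server is free to return different content for each path. That is why the preference has to be declared with a redirect or a canonical tag.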
Why does this distinction pose a problem in SEO?
Because each URL accumulates its own ranking signals: backlinks, anchors, page authority. If half of your links point to / and the other half to /index.html, you dilute your PageRank. What does that mean in practice? Your homepage loses power.
On top of that, there’s the risk of duplicate content. Yes, Google is supposed to handle this through automatic canonicalization, but why leave that to chance when you can make the decision yourself? Cases where Google misidentifies the canonical do exist — they’re rare but real. And it's always your homepage that suffers.
In what cases do we encounter this problem?
Mainly on hand-built static sites, old CMSs, and servers left on default settings (for example Apache serving index.html through its default DirectoryIndex). Modern platforms (WordPress, Shopify, etc.) handle this natively with 301 redirects or canonical tags. But if you're working on a custom or legacy site, make sure to check.
Some developers leave /index.html accessible "just in case," without understanding the SEO impact. Others forget to configure the server after a migration. The result: two URLs in production, two versions indexed, guaranteed confusion.
- Google treats / and /index.html as two distinct URLs if no directive is in place.
- One will be automatically chosen as canonical, but there’s no guarantee of consistency over time.
- This situation dilutes your ranking signals (backlinks, authority) and can create duplicate content.
- The problem mainly affects static sites, old CMSs, or poorly configured server setups.
- Modern platforms handle this by default — but always check during migrations or technical audits.
SEO Expert opinion
Is this statement consistent with observed practices in the field?
Absolutely. This behavior has been observed for years, and Kanaya Takayuki merely confirms what every technical SEO already knows — or should know. In audits, we frequently encounter homepages indexed under both forms, with backlinks spread randomly. The crawl budget suffers too, as Googlebot explores two URLs instead of one.
What’s interesting is that Google never forces consolidation. It chooses a canonical, sure, but does not block the other URL. As a result, it remains crawlable, sometimes indexed as a duplicate. This creates inconsistencies in Search Console (impressions, clicks spread across two lines) and complicates performance analysis.
What nuances should be added to this rule?
First nuance: the severity depends on the volume of backlinks. If your homepage receives 10,000 links — with 5,000 pointing to / and 5,000 to /index.html — the impact is substantial. If you have 50 in total, it's marginal. Prioritize according to your link profile.
Second nuance: Google can partially consolidate signals even without an explicit canonical. Engineers have previously confirmed that certain signals (like brand mentions) are intelligently aggregated. But “partially” doesn’t mean “totally”. [To verify]: the exact extent of this automatic consolidation remains unclear — Google does not publish metrics on this.
In what cases does this rule not apply?
If your server systematically redirects one of the two URLs to the other (301 permanent), there is only one URL in play. End of story. The same goes if a canonical tag points from /index.html to / (or vice versa): Google generally respects this directive, barring massive contradictory signals.
Also, some modern CMSs completely block access to /index.html by returning a 404 or through transparent rewriting. In this case, the problem simply doesn’t exist. But — and this is where many get it wrong — ensure that this blocking is effective in production, not just locally or in staging.
Check that /index.html actually returns a 301 or 404, not a 200 with the same content as /. Surprises are frequent after migrations or server updates.
Practical impact and recommendations
What should you do concretely to avoid this trap?
First step: audit the current state. Request https://yourdomain.com/index.html and look at the returned HTTP status code, either in the browser's network panel or with curl -I, because a browser follows redirects silently. If it’s a 200 with the same content as /, you have a problem. Then crawl the site with a tool (Screaming Frog, Sitebulb) to see whether internal links expose both URLs to crawlers such as Googlebot.
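The audit step can be scripted. The sketch below is one possible way to do it with Python's standard library; the function names `classify` and `fetch_status` and the wording of the verdicts are assumptions, and the script looks only at the status code and Location header, not at whether the two pages share the same content.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urljoin, urlsplit


def classify(status: int, location: Optional[str], canonical: str) -> str:
    """Turn the response to /index.html into a short audit verdict."""
    if status in (301, 308):
        if location == canonical:
            return "ok: permanent redirect to canonical"
        return "check: permanent redirect to another URL"
    if status in (302, 303, 307):
        return "check: temporary redirect"
    if status in (404, 410):
        return "ok: URL not served"
    if status == 200:
        return "problem: URL served directly"
    return "check: unexpected status"


def fetch_status(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send a HEAD request; http.client does not follow redirects."""
    parts = urlsplit(url)
    secure = parts.scheme == "https"
    conn_cls = http.client.HTTPSConnection if secure else http.client.HTTPConnection
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        location = resp.getheader("Location")
        # Resolve a relative Location against the requested URL.
        return resp.status, urljoin(url, location) if location else None
    finally:
        conn.close()


if __name__ == "__main__":
    # Placeholder domain taken from the article; replace with your own.
    status, location = fetch_status("https://example.com/index.html")
    print(status, location, classify(status, location, "https://example.com/"))
```

Keeping `classify` separate from the network call makes it easy to reuse the same verdicts on status codes exported from a crawler.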
Next, consolidate. The cleanest method: set up a permanent 301 redirect from /index.html to / at the server level (Apache .htaccess, Nginx nginx.conf, or CDN). This redirect should be global and systematic, not conditional. Test in incognito mode and via curl to be sure.
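In practice this rule usually lives in the Apache, Nginx, or CDN configuration. For sites served by a Python application, the same logic can be sketched as a small WSGI middleware. The names `redirect_index` and `site` are hypothetical, and the sketch sends a relative Location header, whereas many servers emit the absolute URL.

```python
def redirect_index(app):
    """WSGI middleware: answer any .../index.html with a 301 to its directory URL."""
    def wrapper(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path.endswith("/index.html"):
            # Strip the file name, keep the trailing slash and any mount prefix.
            target = environ.get("SCRIPT_NAME", "") + path[: -len("index.html")]
            query = environ.get("QUERY_STRING", "")
            if query:
                target += "?" + query
            headers = [("Location", target), ("Content-Length", "0")]
            start_response("301 Moved Permanently", headers)
            return [b""]
        return app(environ, start_response)
    return wrapper


def site(environ, start_response):
    """Stand-in application that serves the same page for every path."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<h1>Home</h1>"]


application = redirect_index(site)
```

The redirect is unconditional, which matches the "global and systematic" requirement: it does not depend on user agent, cookies, or referrer.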
What mistakes should be avoided during consolidation?
Don’t rely solely on the canonical tag. Yes, adding <link rel="canonical" href="https://example.com/" /> in /index.html helps, but it is a hint rather than a binding directive: Google can ignore it if other signals (massive backlinks to /index.html) point elsewhere. The 301 redirect is non-negotiable.
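If you do keep a canonical tag alongside the redirect, it is worth confirming what each page actually declares. The following is a minimal sketch using Python's built-in HTML parser; the class and function names are assumptions, and it reads only the HTML source, not any canonical sent through HTTP headers.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html: str) -> list:
    """Return all canonical URLs declared in the given HTML source."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = '<head><link rel="canonical" href="https://example.com/" /></head>'
print(find_canonicals(page))
```

An empty list means no canonical is declared, and more than one entry means the page sends conflicting hints. Both cases deserve a closer look.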
Avoid JavaScript or meta refresh redirects too. Google follows them, but with a delay, and they do not fully transfer PageRank. Some third-party bots (social networks, SEO tools) don’t follow them at all. Stick to server-side HTTP 301 redirects.
How can you verify that your site complies after correction?
Three checks post-deployment. One: curl command line to see the raw code (curl -I https://example.com/index.html should show 301 Moved Permanently with Location: https://example.com/). Two: complete crawl with Screaming Frog to ensure no internal link points to /index.html. Three: Search Console, Coverage section — monitor that /index.html disappears from the index within 2-4 weeks following the redirect.
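The second check can be approximated for a single page without a full crawler. This sketch, using only the standard library, lists internal links that still point to an index.html URL; the names `LinkCollector` and `links_to_index` are hypothetical, and a real audit would run it over every crawled page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_to_index(page_url: str, html: str) -> list:
    """Return same-host links on the page whose path ends with /index.html."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlsplit(page_url).netloc
    flagged = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlsplit(absolute)
        if parts.netloc == host and parts.path.endswith("/index.html"):
            flagged.append(absolute)
    return flagged


sample = '<a href="/index.html">Home</a> <a href="/contact/">Contact</a>'
print(links_to_index("https://example.com/about/", sample))
```

Every URL it reports is an internal link to update so that it points straight at the directory URL instead of going through the redirect.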
If you still detect backlinks to /index.html in tools like Ahrefs or Majestic, don’t panic: the 301 will transfer the juice. But ideally, contact the referring sites and ask them to link to / directly, since each redirect hop costs a bit of PageRank.
- Manually test /index.html in a browser and check the HTTP code (it should be 301, not 200).
- Set up a permanent 301 redirect at the server level (Apache, Nginx, CDN) from /index.html to /.
- Don’t settle for a canonical tag: server redirection is essential.
- Crawl the site after correction to ensure no internal link points to /index.html.
- Monitor Search Console (Coverage section) to ensure that /index.html is removed from the index within 2-4 weeks.
- Ideally, contact sites with backlinks to /index.html to update to / to avoid redirect jumps.
❓ Frequently Asked Questions
Does WordPress automatically handle the distinction between / and /index.php?
Can a canonical tag be used instead of a 301 redirect?
How long does it take Google to consolidate the two URLs after a 301?
Does this problem also affect subdirectories (e.g. /contact/ vs /contact/index.html)?
What is the real ranking impact if this duplication is not fixed?
Other SEO insights extracted from this same Google Search Central video · duration 59 min · published on 02/07/2020