Should you really point your canonicals to the main page?

Official statement

A canonical tag should not point to a main page if the pages are not exact copies. Each page version should be self-canonical.

Source: Google Search Central video (duration 1h04, in English, published 24/02/2017). The statement appears at 16:53.

Official statement from February 24, 2017

A more recent statement exists on this topic: “Should you really combine hreflang and self-referencing canonicals?” (John Mueller, June 11, 2019)

TL;DR

Google emphasizes that a canonical tag should only point to a main page if the pages are nearly identical copies. Ideally, each URL should be self-canonical, except for strict duplication. Using cross-page canonicals to group slightly different variants can dilute your relevance signals and confuse the algorithm about which version to index.

What you need to understand

What does “exact copies” really mean for Google?

When Mueller talks about exact copies, he means strict duplication: different URL parameters serving the same content (UTM tags, session IDs, filters that change nothing on the page). He does not mean pages that share 80% of their text but differ on key sections.
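A minimal sketch of what "same content behind different parameters" can look like in practice. The parameter list is a hypothetical assumption, not something Google publishes; adapt it to the parameters your own site uses for tracking and sessions.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list: parameters assumed NOT to change what the page displays.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "sessionid",
}

def strip_tracking(url: str) -> str:
    """Drop tracking/session parameters and the fragment, keep everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def is_parameter_duplicate(url_a: str, url_b: str) -> bool:
    """True when two URLs differ only by parameters assumed to be content-neutral."""
    return strip_tracking(url_a) == strip_tracking(url_b)
```

Two URLs that differ by a parameter outside the list (for example a `color` parameter that changes the displayed product) are deliberately not treated as duplicates.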

The classic trap? Canonicalizing a product page available in multiple colors to the “blue” version. If each page has specific reviews, unique images, and a different H1 title, Google sees them as distinct entities. Forcing a canonical is like saying “ignore this unique content,” which sabotages your ability to rank for long-tail queries specific to each variant.

Why is the self-canonical setup the recommended default?

A self-canonical page (canonical pointing to itself) tells Google: “This is the reference version of this specific URL.” It is the safest baseline when there is no proven duplication.
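As an illustration, here is a small sketch that reads the canonical from a page's HTML and checks whether it points back to the page itself. It uses only Python's standard library; the function names are hypothetical, and the comparison is a plain string match that assumes both URLs are already written in the same form.

```python
from html.parser import HTMLParser

class CanonicalParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> found."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")

def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical

def is_self_canonical(page_url: str, html: str) -> bool:
    """True when the page declares itself as the canonical (exact string match)."""
    return find_canonical(html) == page_url
```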

The common mistake is believing that a cross-page canonical “reinforces” a main page by consolidating link equity. However, Google treats a canonical as a strong hint for deduplication, not as a cosmetic PageRank transfer. If the contents diverge, you create a signaling conflict: page A says “I am B” while it barely resembles B. The result? Unpredictable indexing and lost positions on keywords specific to A.

In what cases is a cross-page canonical still legitimate?

It is justified for purely technical duplication: HTTP vs HTTPS versions of the same page, trailing-slash vs non-trailing-slash URLs, mobile/desktop variants served on separate URLs (an outdated architecture), and redundant paginated versions.

A typical example: a blog that fully republishes an article on an adjacent category. If the content is word-for-word identical, pointing to the original via canonical makes sense. Once a variant adds an intro paragraph, changes the title, or includes different calls-to-action, it ceases to be an exact copy and should live its own life.

Cross-page canonical = reserved for strict duplications (parameters, protocol, redundant paths)
Self-canonical = default setup for any page with distinct content, even partially
Common mistake: canonicalizing slightly different product variants thinking it “consolidates” the main one
Consequence: loss of visibility on long-tail queries related to the excluded variants
Practical rule: if you're unsure whether to use canonical or noindex, ask yourself whether the page deserves to rank for specific keywords

SEO Expert opinion

Is this statement consistent with observed practices on the ground?

Yes and no. The theory is solid, but Google itself creates ambiguity by sometimes treating canonicals as mere “suggestions.” Documented cases show Google ignoring a cross-page canonical to index the variant instead of the main one. Why? Because on-page signals (content, links, engagement) contradicted the directive.

In concrete terms, if you canonicalize page B to A while B attracts backlinks and generates specific organic traffic, Google might still decide to index B. This is precisely the issue raised by Mueller: you're misleading the algorithm, creating a conflict between your technical directive and the real signals. [To be verified]: Google does not publish any metrics on the rate of canonical overrides, making it impossible to quantify the frequency of these cases.

What nuances should be added to this strict rule?

The definition of “exact copy” remains vague in the gray area. Let's take e-commerce product sheets: a page with available stock vs out of stock, but identical content, is it a “copy”? Technically yes, but some SEOs canonicalize to the in-stock version to avoid cannibalization. Google likely tolerates this case, although Mueller does not specify it.

Another nuance: AMP versions (even though declining). An AMP page typically canonicalizes to the classic HTML version, although structurally they differ. Google accepts this because it's the standard it imposed itself. So the “strict rule” has exceptions, but they are never exhaustively documented, leaving frustrating room for interpretation.

When does this statement mask deeper structural problems?

If you find yourself massively canonicalizing “almost identical” pages to a main one, the real issue is not the tag, it is your architecture. This often reveals an inability to sufficiently differentiate content or a proliferation of redundant URLs that should never have existed.

Take, for example, a site that generates a unique URL for each combination of product filters (color + size + material = 200 URLs) while none of them has distinct content. The canonical becomes a technical band-aid. The real solution? Noindex these variants or serve them via AJAX without a dedicated URL. Mueller doesn’t say it outright, but his insistence on “self-canonical” implies: stop creating useless pages and then canonicalizing them to cover up the problem.
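To see how quickly filter combinations multiply, here is a small sketch. The counts (5 colors, 8 sizes, 5 materials) and the URL pattern are assumptions chosen only so that the total matches the 200 URLs mentioned above.

```python
from itertools import product

def filter_urls(base_url, colors, sizes, materials):
    """Build one URL per combination of filter values (hypothetical URL pattern)."""
    return [
        f"{base_url}?color={color}&size={size}&material={material}"
        for color, size, material in product(colors, sizes, materials)
    ]

# Assumed filter values, for illustration only.
colors = [f"color-{i}" for i in range(5)]
sizes = [f"size-{i}" for i in range(8)]
materials = [f"material-{i}" for i in range(5)]

urls = filter_urls("https://example.com/shoes", colors, sizes, materials)
print(len(urls))  # 5 * 8 * 5 = 200
```

Adding a single fourth filter with four values would already push the total to 800 URLs, all showing essentially the same content.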

Warning: if Search Console reports canonical conflicts (“Duplicate, Google chose different canonical than user”), that is a direct symptom of this bad practice. Google ignores your canonical because it doesn’t hold up against the real signals.

Practical impact and recommendations

How to audit your existing canonicals for errors?

First step: extract all cross-page canonicals via a crawl (Screaming Frog, Oncrawl, Botify). Filter the URLs where canonical ≠ crawled URL. For each pair, compare the content using a textual diff or similarity score (Copyscape, Siteliner). If similarity falls below 95%, you likely have a problem.
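The audit step above can be sketched as follows. This is an illustration under assumptions, not the method used by the tools named above: it takes crawl data already loaded into a dictionary (a hypothetical shape), and it scores similarity with the standard library's `difflib` at word level, which will not produce the same numbers as Copyscape or Siteliner.

```python
from difflib import SequenceMatcher

def similarity(text_a: str, text_b: str) -> float:
    """Word-level similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, text_a.split(), text_b.split(), autojunk=False).ratio()

def flag_suspect_canonicals(pages: dict, threshold: float = 0.95) -> list:
    """List (url, canonical, score) for cross-page canonicals below the threshold.

    `pages` maps each crawled URL to {"canonical": str, "text": str}.
    Canonical targets missing from the crawl are skipped, since there is
    nothing to compare them against.
    """
    suspects = []
    for url, page in pages.items():
        target = page["canonical"]
        if target == url or target not in pages:
            continue  # self-canonical, or target not crawled
        score = similarity(page["text"], pages[target]["text"])
        if score < threshold:
            suspects.append((url, target, round(score, 3)))
    return suspects
```

The 0.95 default mirrors the 95% figure above; treat it as a starting point to tune against pages you have reviewed by hand.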

Next, cross-reference this data with Search Console: check whether Google is actually indexing the canonical or the variant. If Search Console shows impressions or clicks on the variant URL while that URL canonicalizes to another, Google is overriding your canonical. This strongly suggests that the pages are not similar enough and that your canonical creates confusion rather than clarity.
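A minimal sketch of that cross-check, assuming you have already exported two things into plain dictionaries: the canonical declared on each URL (from your crawl) and the impressions per URL (from a Search Console performance export). Both data shapes and the function name are hypothetical.

```python
def find_overrides(declared_canonicals: dict, impressions: dict, min_impressions: int = 1) -> list:
    """Return variant URLs that still receive impressions despite pointing elsewhere.

    declared_canonicals: url -> canonical URL declared on that page
    impressions:         url -> impressions reported for that URL
    """
    overrides = []
    for url, target in declared_canonicals.items():
        is_cross_page = url != target
        if is_cross_page and impressions.get(url, 0) >= min_impressions:
            overrides.append(url)
    return sorted(overrides)
```

Raising `min_impressions` filters out variants that only surface occasionally, for instance during the period right after a canonical was changed.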

What corrections should be made based on the identified cases?

For true technical duplications (UTM parameters, trailing slash, HTTP/HTTPS), keep the cross-page canonicals but ensure they consistently point to the HTTPS version, without trailing slash and without parameters: a single, strict standard.
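One way to enforce that single standard is to compute the canonical target with one function and use it everywhere. The sketch below is an assumption-laden illustration: it drops every query parameter, so it only suits pages where no parameter changes the content, and it also lowercases the host, which the text above does not require.

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_target(url: str) -> str:
    """Normalize a URL to HTTPS, no trailing slash, no parameters, no fragment."""
    parts = urlsplit(url)
    # Keep "/" for the site root so the path is never empty.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(("https", parts.netloc.lower(), path, "", ""))
```

Because the function is idempotent, running it on an already normalized URL returns the same URL, which makes it safe to call in templates and in audits alike.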

For variants with distinct content (product sheets by color, slightly rewritten blog articles), remove cross-page canonicals and switch to self-canonical. If you really don’t want to index certain variants, use noindex instead of canonical. It’s more honest: you tell Google “don’t show this page” instead of “this page is identical to another” when it’s not true.
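The choice described above can be written down as a small decision helper. This is a sketch of the article's rule of thumb, not an official decision tree; the function name and its two boolean inputs are hypothetical and would come from your own audit.

```python
from html import escape

def head_tag(page_url: str, main_url: str, has_distinct_content: bool, should_be_indexed: bool) -> str:
    """Return the tag to place in <head> for a variant page.

    - no distinct content        -> cross-page canonical to the main URL
    - distinct content, indexed  -> self-canonical
    - distinct content, hidden   -> noindex instead of a misleading canonical
    """
    if not has_distinct_content:
        return f'<link rel="canonical" href="{escape(main_url, quote=True)}">'
    if should_be_indexed:
        return f'<link rel="canonical" href="{escape(page_url, quote=True)}">'
    return '<meta name="robots" content="noindex">'
```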

What strategy should be deployed to avoid future errors?

Define a clear policy in your technical documentation: “Cross-page canonical only if text similarity > 98% AND no structural difference (H1, meta, main images).” Integrate this rule into your content creation workflows and CMS templates.
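Such a policy can be turned into an automated check. The sketch below is one possible reading of the rule: it assumes "meta" means the meta description, represents the main images as a set of image URLs, and measures text similarity at word level with the standard library's `difflib`. All names are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

@dataclass(frozen=True)
class PageSnapshot:
    text: str
    h1: str
    meta_description: str
    main_images: frozenset  # URLs of the main images

def cross_canonical_allowed(variant: PageSnapshot, main: PageSnapshot, min_similarity: float = 0.98) -> bool:
    """Apply the policy: similarity above the threshold AND no structural difference."""
    score = SequenceMatcher(
        None, variant.text.split(), main.text.split(), autojunk=False
    ).ratio()
    same_structure = (
        variant.h1 == main.h1
        and variant.meta_description == main.meta_description
        and variant.main_images == main.main_images
    )
    return score > min_similarity and same_structure
```

A page that fails this check would fall back to a self-canonical, or to noindex if it should stay out of the index.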

For high-volume sites (e-commerce, classifieds), automate the detection of inconsistencies with a script that compares each canonical against the page content and raises an alert when they diverge. Better yet, rethink the architecture to limit the creation of redundant URLs upfront: use non-crawlable parameters, JavaScript for filters, or URL fragments (#) instead of separate paths.

These technical adjustments can quickly become complex at scale, especially when they intersect with architecture, legacy CMS issues, and high volume. Engaging a specialized SEO agency ensures a thorough audit, tailored strategy, and technical support to deploy these changes without risk of regression.

Crawl the site and extract all URLs with cross-page canonicals
Compare the content of pairs (canonical vs variant) using a textual similarity tool
Check in Search Console if Google indexes the canonical or the variant (override)
Remove cross-page canonicals on variants with distinct content (> 5% difference)
Replace with noindex if the variant does not deserve to be indexed
Document a strict similarity policy (> 98%) to allow a cross-page canonical

The canonical tag is not a tool for consolidating PageRank or a means of “choosing” which page to rank. It is a deduplication signal reserved for nearly identical copies. Any other use creates signaling conflicts, confuses the algorithm, and costs you positions on keywords specific to the excluded variants. Audit, correct, and redefine your technical standards to avoid reproducing these structural errors.

❓ Frequently Asked Questions

Does a cross-page canonical transfer PageRank to the target page?

No, or at least that is not its primary function. Google treats the canonical as a deduplication signal: it consolidates signals (links, content) onto a single reference URL, but if the pages differ too much, it may ignore the canonical and dilute the signals instead of concentrating them.

Can you canonicalize a category page to a flagship product page to boost the latter?

No, this is exactly the mistake Mueller warns against. A category page and a product page have different content and serve different search intents. Forcing a canonical here would probably deindex the category without strengthening the product, while confusing the algorithm.

If Google ignores my canonical and indexes the variant, is that a problem?

Yes, it is the symptom of an inconsistency. It means your canonical contradicts the real signals (content, links, engagement). Google is telling you that the pages are not similar enough. You must either correct the canonical or accept that the two URLs live independently.

Should every page carry a self-referencing canonical, even without duplication?

Yes, it is a good defensive practice. A self-referencing canonical makes clear to Google which URL is the reference version, even when there is no obvious duplication. It prevents problems if parameters or alternative paths appear later.

Canonical or noindex for product variants you don't want indexed?

Noindex is more honest if the content differs. The canonical says “this page is identical to another,” while noindex says “don't show this page.” If your variants have distinct content but you want to avoid cannibalization, noindex is the right tool, not canonical.
