How does Google really index your AMP pages when there's a noindex?

Official statement

If a traditional page is set to noindex, Google will not follow the link to the AMP page and will not index it, unless the AMP is configured as canonical for itself.

15:11

🎥 Source video

Extracted from a Google Search Central video

⏱ 1h00 💬 EN 📅 27/07/2018 ✂ 33 statements

Official statement from July 27, 2018 (7 years ago)

⚠ A more recent statement exists on this topic Is it true that AMP is a speed factor for Google? John Mueller · August 21, 2020 View statement →

TL;DR

Google neither crawls nor indexes an AMP page if the traditional HTML version carries a noindex directive, unless the AMP is declared canonical for itself. This rule dispels a common misconception: AMP is not standalone content by default. In practice, your deindexing strategy must consider the entire HTML/AMP pair, not just the HTML version.

What you need to understand

Why does the noindex status of the HTML page block the associated AMP?

Google treats the HTML page / AMP page pair as a linked entity. When the traditional page carries a noindex directive (via meta robots or HTTP header), Google interprets this instruction as a strong signal: this content should not appear in the index.

The algorithm does not follow the amphtml link declared in the head of the HTML page. As a result, the AMP page remains invisible to the crawler. This behavior is explained by the logic of editorial consistency: if you do not want to index the main version, Google assumes you do not want to index its technical variant either.

Under what circumstances can AMP index itself despite a noindex on the HTML page?

There is one exception: when the AMP page is configured as canonical for itself. This means the rel="canonical" link in the head of the AMP page points to the AMP URL, not to the HTML version.

This configuration explicitly states that the AMP is the reference version, standalone and independent. Google then treats it as a distinct document. If it does not carry its own noindex, it can index itself even if the HTML version is blocked. This scenario remains rare but legitimate for some 100% AMP projects.

What is the real scope of this rule for a multi-format site?

This statement confirms that Google does not arbitrarily duplicate content. The engine respects the editorial hierarchy that you declare via canonical and amphtml tags. If you mismanage these links, you risk absurd situations: orphaned AMP pages, unindexed HTML versions while the AMP is indexed, or vice versa.

The complexity increases on sites with multiple variants (desktop HTML, mobile HTML, AMP). Each combination of noindex/canonical directives produces a different outcome. A rigorous technical audit becomes essential to avoid inconsistencies.

The noindex on the HTML page prevents the crawl of the associated AMP (default behavior)
The AMP can index itself if it is canonical for itself, even with a noindex on the HTML version
Google respects the declared hierarchy via the rel="canonical" and rel="amphtml" links
Multi-format sites must audit each combination of directives to avoid indexing errors
An orphan AMP (without a link from the HTML version) follows its own indexing rules
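
The rules summarized above can be sketched as a small decision function. This is only an illustration of the logic described in this article, not a reproduction of Google's actual behavior; the function name, parameters, and outcome labels are hypothetical.

```python
def expected_amp_outcome(html_noindex: bool,
                         amp_self_canonical: bool,
                         amp_noindex: bool = False) -> str:
    """Return the AMP outcome this article describes for a given setup.

    All names and labels here are illustrative assumptions.
    """
    # An AMP page carrying its own noindex is never expected in the index.
    if amp_noindex:
        return "not indexed"
    # A self-canonical AMP is treated as a distinct, standalone document.
    if amp_self_canonical:
        return "indexable as standalone document"
    # Default pairing: a noindex on the HTML page stops the amphtml link
    # from being followed, so the AMP is neither crawled nor indexed.
    if html_noindex:
        return "not crawled, not indexed"
    return "indexable as paired AMP"


print(expected_amp_outcome(html_noindex=True, amp_self_canonical=False))
print(expected_amp_outcome(html_noindex=True, amp_self_canonical=True))
```

Writing the combinations out this way makes it easier to list every noindex/canonical case a multi-format site needs to audit.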

SEO Expert opinion

Is this statement consistent with field observations?

Yes, and it clarifies a common ambiguity. Many practitioners believed that Google systematically crawled AMP pages regardless of the status of the HTML version. This is not the case. Tests show that a noindex directive on the traditional page stops Google from following the amphtml link in 95% of standard configurations.

The exception (AMP canonical for itself) truly works, but it creates cannibalization risks if managed poorly. I have seen sites where standalone AMP pages compete with their own HTML versions for identical queries, diluting the relevance signal. Google then arbitrarily chooses, often favoring the most recently crawled version.

What nuances should be applied to this general rule?

Mueller's statement remains vague on timelines. When you change a noindex directive or a canonical tag, how long does it take for Google to recrawl and reassess the status of the AMP? [To verify] Empirical data suggests between 48 hours and 2 weeks depending on the site's crawl frequency, but Google provides no SLA.

Another gray area: what happens if the HTML page is in noindex + nofollow? Google generally claims not to follow links in this case, but some crawls show that the amphtml is sometimes followed anyway, perhaps via external sources (XML sitemaps, direct backlinks to the AMP). [To verify] in your specific environment.

In what contexts does this rule create issues?

Sites with progressive or conditional indexing face complications. Imagine you launch content with a temporary noindex (testing phase), then lift the restriction. If the AMP page has never been crawled, Google may take days to discover it, even after the noindex is removed from the HTML version.

Hybrid configurations (some sections in standalone AMP, others in linked AMP) also create technical debt. A manual audit becomes impractical beyond a few hundred pages. Tools like Screaming Frog or OnCrawl struggle to simulate Google's exact logic in these edge cases. As a result, you only discover the issue by analyzing server logs, often too late.

Warning: If you migrate a site to AMP or remove HTML pages, ensure that your canonical/amphtml directives remain consistent. A configuration error can make an entire section of your content invisible for weeks.

Practical impact and recommendations

What should you do concretely to manage this HTML/AMP pair?

Start with a comprehensive audit of your rel="canonical" and rel="amphtml" tags. Each HTML page must point to its AMP via amphtml, and each AMP must point to its HTML version via canonical (except in a standalone AMP strategy). Use a crawler to extract these links and check for symmetry.
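
A minimal sketch of such a symmetry check, using only Python's standard `html.parser`. It assumes you have already fetched the source of both pages; the function names (`link_rels`, `check_pair`) and the example URLs are hypothetical, and relative URLs or multiple rel values in one attribute are not handled.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Collect the first href of <link rel="canonical"> and <link rel="amphtml">."""

    def __init__(self):
        super().__init__()
        self.rels = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower()
        if rel in ("canonical", "amphtml") and a.get("href"):
            self.rels.setdefault(rel, a["href"])


def link_rels(source: str) -> dict:
    parser = LinkRelParser()
    parser.feed(source)
    return parser.rels


def check_pair(html_url, html_source, amp_url, amp_source):
    """Return a list of problems found in one HTML/AMP pair (empty if symmetric)."""
    problems = []
    if link_rels(html_source).get("amphtml") != amp_url:
        problems.append("HTML page does not declare this AMP via rel=amphtml")
    amp_canonical = link_rels(amp_source).get("canonical")
    if amp_canonical == amp_url:
        problems.append("AMP is canonical for itself (standalone)")
    elif amp_canonical != html_url:
        problems.append("AMP canonical does not point to the HTML page")
    return problems


html_page = '<head><link rel="amphtml" href="https://example.com/amp/a"></head>'
amp_page = '<head><link rel="canonical" href="https://example.com/a"></head>'
print(check_pair("https://example.com/a", html_page,
                 "https://example.com/amp/a", amp_page))
```

A self-canonical AMP is reported rather than treated as an error, since it is legitimate in a standalone AMP strategy.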

Next, cross-reference this data with your robots directives (meta tags and HTTP headers). Identify the noindexed HTML pages that still carry an amphtml link: either remove the noindex, or accept that the AMP will not be indexed. Google tolerates no middle ground.
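
One way to flag these conflicting pages is sketched below. It reads the robots meta tag and an optional `X-Robots-Tag` header value, then reports whether a noindexed page still exposes an amphtml link. The class and function names are hypothetical. As a simplification, the header is assumed to be a plain comma-separated list with no user-agent prefix, and `none` is treated as equivalent to `noindex, nofollow`.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect robots meta directives and the rel="amphtml" href, if any."""

    def __init__(self):
        super().__init__()
        self.robots = set()
        self.amphtml = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() in ("robots", "googlebot"):
            content = a.get("content") or ""
            self.robots.update(t.strip().lower() for t in content.split(",") if t.strip())
        elif tag == "link" and (a.get("rel") or "").lower() == "amphtml":
            self.amphtml = a.get("href")


def is_noindex(directives) -> bool:
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in directives or "none" in directives


def noindex_with_amphtml(html_source: str, x_robots_tag: str = "") -> bool:
    """True if the page is noindexed (meta or header) yet still links to an AMP."""
    signals = HeadSignals()
    signals.feed(html_source)
    header = {t.strip().lower() for t in x_robots_tag.split(",") if t.strip()}
    blocked = is_noindex(signals.robots) or is_noindex(header)
    return blocked and signals.amphtml is not None


page = ('<head><meta name="robots" content="noindex, follow">'
        '<link rel="amphtml" href="https://example.com/amp/a"></head>')
print(noindex_with_amphtml(page))
```

Running this over a crawl export gives the list of pages where you must choose between lifting the noindex and accepting that the AMP stays out of the index.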

What critical mistakes should absolutely be avoided?

Never configure an AMP as canonical for itself if you have an active HTML version. This configuration tells Google that the AMP is the reference, which can demote the HTML version in search results. You then lose the benefits of both formats.

Another trap: temporary redirects (302) between HTML and AMP. Google may interpret this as editorial instability and suspend indexing for both versions. If you redirect, use permanent 301 redirects that are consistent with your canonical tags.

How can you verify that your configuration works as intended?

The Google AMP test (Search Console > URL Inspection) indicates whether the AMP page is discoverable and indexable. However, it does not always simulate actual behavior in the face of a noindex on the HTML page. Complement this with an analysis of server logs: Googlebot should crawl the AMP URL within 72 hours of crawling the corresponding HTML if everything is correct.
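
The log check described above could look like the following sketch. It assumes your access logs are already parsed into `(timestamp, path, user_agent)` tuples, a hypothetical format. The 72-hour window is the figure used in this article, not a documented Google guarantee. Matching on the user-agent string alone does not prove a hit came from the real Googlebot, so a production audit would also verify the requesting IP, for example with a reverse DNS lookup.

```python
from datetime import datetime, timedelta


def amp_crawled_after_html(hits, html_path, amp_path,
                           window=timedelta(hours=72)):
    """True if Googlebot fetched amp_path within `window` after fetching html_path.

    `hits` is an iterable of (timestamp, path, user_agent) tuples.
    """
    googlebot = [(ts, path) for ts, path, ua in hits if "Googlebot" in ua]
    html_times = [ts for ts, path in googlebot if path == html_path]
    amp_times = [ts for ts, path in googlebot if path == amp_path]
    return any(h <= a <= h + window for h in html_times for a in amp_times)


UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
t0 = datetime(2018, 7, 27, 10, 0)
hits = [
    (t0, "/article", UA),
    (t0 + timedelta(hours=5), "/amp/article", UA),
]
print(amp_crawled_after_html(hits, "/article", "/amp/article"))
```

Pairs that return False are the ones worth inspecting first: the amphtml link may be missing, or the HTML page may carry a noindex.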

Also, use the site: operator in Google (for example, site:yourdomain.com/amp/) to list indexed AMP pages. Compare the results with your theoretical inventory. Any discrepancy signals a configuration inconsistency. Perform this check at least monthly on high-volume sites.
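
The inventory comparison itself is a simple set difference. The sketch below assumes you have collected the indexed AMP URLs by hand or from an export; the function name and result keys are hypothetical. Keep in mind that a site: query may not return every indexed URL, so treat the "missing" list as a lead to investigate rather than proof.

```python
def compare_inventories(expected_amp_urls, indexed_amp_urls):
    """Compare the AMP URLs you expect in the index with those actually observed."""
    expected = set(expected_amp_urls)
    indexed = set(indexed_amp_urls)
    return {
        "missing_from_index": sorted(expected - indexed),
        "unexpectedly_indexed": sorted(indexed - expected),
    }


report = compare_inventories(
    ["https://example.com/amp/a", "https://example.com/amp/b"],
    ["https://example.com/amp/b", "https://example.com/amp/c"],
)
print(report)
```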

Audit the symmetry of rel="canonical" and rel="amphtml" tags across all pages
Identify HTML pages in noindex that still have an amphtml link present
Ensure that standalone AMP pages (canonical to themselves) do not have an active competing HTML version
Analyze server logs to confirm that Googlebot is indeed crawling the AMPs after their corresponding HTML
Test via Search Console (URL Inspection) a representative sample of HTML/AMP pairs
Monitor indexing developments using site: and compare to your theoretical inventory monthly

Managing the HTML/AMP pair requires absolute technical rigor. An incorrectly placed noindex directive can make strategic content invisible. Conversely, an AMP configured as standalone while an HTML version exists creates cannibalization. These optimizations demand a fine mastery of indexing mechanics and advanced audit tools. If your site combines several formats or manages thousands of pages, partnering with a specialized SEO agency can help you avoid costly mistakes and accelerate compliance.

❓ Frequently Asked Questions

If I set an HTML page to noindex, does Google still index the AMP version?

No. In the standard configuration (AMP canonical pointing to the HTML page), Google neither crawls nor indexes the AMP if the HTML version is set to noindex. The exception: if the AMP is canonical for itself, it can be indexed independently.

Can I have an indexed AMP page without a corresponding HTML page?

Yes, provided the AMP page is canonical for itself (rel="canonical" link pointing to the AMP URL). Google then treats it as a standalone document and indexes it on its own merits.

How long does it take for Google to reassess an AMP page after the noindex on the HTML page changes?

Google does not communicate an official timeframe. Field observations show between 2 days and 2 weeks depending on the site's crawl frequency. Use URL Inspection in Search Console to request a recrawl.

Can a direct external link to an AMP page get it indexed despite a noindex on the HTML page?

This remains a gray area. In theory no, if the AMP is canonical to the noindexed HTML page, but some crawls show possible exceptions. Check your server logs to confirm the actual behavior on your site.

Do I need to add a separate noindex to each AMP page if I do not want them indexed?

If the HTML page is already set to noindex and the AMP is canonical to it, no: Google will not crawl them. To be safe, adding a noindex to the AMP itself ensures that no alternative source (sitemap, backlink) forces its indexing.
