Can you really index a noindex page through a sitemap?

Official statement

If you submit a page via a sitemap but it contains a noindex directive, you will receive an error. All these cases would prevent the page from appearing in search results.

2:07

🎥 Source video

Extracted from a Google Search Central video

⏱ 9:28 💬 EN 📅 06/10/2020 ✂ 24 statements

Official statement from October 6, 2020 (5 years ago)

⚠ A more recent statement exists on this topic: ‘Is Noindex Enough, or Should You Use Noindex+Nofollow to Block SEO Signals?’ (John Mueller, October 7, 2021).

TL;DR

Google confirms that a page submitted via sitemap with a noindex directive generates an error and will not appear in search results as long as the directive is in place. Google treats this inconsistency as a contradictory signal and resolves it in favor of the noindex, never in favor of indexing. For SEOs, this makes a consistency audit between the sitemap and indexing directives essential to avoid wasting crawl budget on URLs intended to remain invisible.

What you need to understand

Why is this error considered blocking?

When you include a URL in your XML sitemap, you are telling Google: ‘this page deserves to be crawled and indexed.’ Google treats a sitemap entry as a hint rather than a command, but it is a clear statement of intent.

At the same time, if this same page contains a noindex directive (via meta robots or X-Robots-Tag HTTP), you are saying the exact opposite: ‘never show this page in search results.’ Google faces a paradoxical instruction that it can only resolve in favor of the noindex, which is an explicit and priority exclusion directive.
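To make the two delivery channels concrete, here is a minimal Python sketch that checks a page for a noindex directive in either place. The function name `has_noindex` and the idea of passing the HTML and headers in as plain values are assumptions made for illustration. The token matching is deliberately simplified: it also flags `none`, which implies noindex, and it does not tell user-agent-specific rules apart.

```python
import re
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects the content of <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            self.values.append(attr.get("content", ""))


def _tokens(value):
    # Split "noindex, nofollow" or "googlebot: noindex" into individual tokens.
    return set(re.split(r"[,\s:]+", value.strip().lower()))


def has_noindex(html, headers):
    """Return True if the HTML or the HTTP headers carry a noindex directive."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    values = list(parser.values)
    # Header names are case-insensitive, so compare them in lowercase.
    values += [v for k, v in headers.items() if k.lower() == "x-robots-tag"]
    return any(_tokens(v) & {"noindex", "none"} for v in values)
```

Called with the body and headers of a fetched page, it returns True as soon as either channel excludes the page, which is the condition that conflicts with a sitemap entry.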

What forms does this error take in Search Console?

This inconsistency appears in the Coverage report in Google Search Console. URLs submitted through a sitemap are reported under ‘Error’ with the status ‘Submitted URL marked 'noindex'’, whereas noindex pages that Google discovered without a sitemap submission are listed under ‘Excluded’ as ‘Excluded by 'noindex' tag.’

As long as the URL remains in the sitemap, Google will keep crawling it periodically to check whether the directive has changed. You end up spending crawl budget on pages that will never contribute to your organic visibility.

What cases most often generate this conflict?

Complex architectures are the most exposed. Typical cases include noindex pagination pages mistakenly left in the sitemap, poorly declared canonicalized URLs, accidentally indexable staging environments, and e-commerce facets that carry a noindex in the <head> but are still referenced by a poorly filtered dynamic sitemap.

On sites with thousands of pages, this error can affect an estimated 5 to 15% of the sitemap without anyone noticing until a technical audit reveals it. This is particularly common during CMS migrations, when the sitemap generation rules are not realigned with the new robots directives.

A page submitted in the sitemap with a noindex will never be indexed, regardless of its quality or authority.
Google considers noindex as a priority and non-negotiable directive — even when a sitemap is present.
This error unnecessarily consumes crawl budget and pollutes your coverage reports.
Automated sitemap generators and CMSs are the main source of this conflict, especially after a migration or redesign.
A consistency audit between sitemap/robots should be performed at least every quarter on dynamic sites.

SEO Expert opinion

Is this rule really applied without exception by Google?

Yes, and it’s one of the few cases where Google leaves no grey area of interpretation. Unlike canonical directives, which can be ignored if Google deems another signal more relevant, the noindex is absolute. No backlink, no popularity signal, no quality content can offset a noindex directive.

I have seen sites with high SEO potential — DR 70+, hundreds of backlinks — completely invisible for months because a noindex was in place while the sitemap listed them. Google will never make an exception, even if the initial intent seems obvious. It’s mechanical.

Why do so many sites accumulate this error without realizing it?

Because sitemap generation tools and CMSs do not automatically cross-check their inclusion rules with robots directives. A WordPress plugin may generate a sitemap based on the types of published posts, while another plugin or an .htaccess rule adds a noindex to certain taxonomies.

As a result, no one sees the conflict until a manual technical audit is launched. Large e-commerce platforms (Magento, PrestaShop, Shopify) are particularly vulnerable, with facets, filter pages, and parameterized URLs ending up in the default sitemap when they should be excluded. [To verify]: Google does not provide public statistics on the frequency of this error, but on-the-ground audits show it affects 60 to 70% of e-commerce sites with over 10,000 items.

What are the real risks for the overall site SEO?

Beyond the simple non-indexation of the pages in question, this conflict sends a signal of poor technical governance. If Google detects hundreds of noindex URLs in your sitemap, it may decrease the crawl frequency, considering your prioritization signals to be unreliable.

Concretely, this can delay the indexing of new strategic pages or slow down the recognition of content updates. It’s not an algorithmic penalty, but a resource allocation: Google will crawl less frequently a site that wastes its time. This is particularly critical during product launches or redesigns.

Warning: according to field observations, a sitemap polluted with noindex pages can reduce crawl frequency by 20 to 40% on some sites. Google will not warn you; it will simply adjust its priorities.

Practical impact and recommendations

How can I quickly identify these conflicts on my site?

First step: export all URLs from your XML sitemap (or your multiple sitemaps if you have several). Then, crawl these URLs with a tool like Screaming Frog, Oncrawl, or Botify in ‘URL list’ mode to check for the presence of noindex directives (meta robots or X-Robots-Tag HTTP).
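If you prefer to script the export, the following sketch pulls every `<loc>` value out of a sitemap with the Python standard library. The function name `parse_sitemap` is hypothetical. It takes the XML as text, so fetching the file is left to whatever HTTP client you already use. It also reports whether the file is a sitemap index, in which case the returned entries are child sitemaps to parse in turn rather than page URLs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap(xml_text):
    """Return (is_index, locs) for a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    is_index = root.tag == SITEMAP_NS + "sitemapindex"
    # <loc> appears under <url> in a sitemap and under <sitemap> in an index.
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    return is_index, locs
```

The resulting list can be saved to a text file and loaded into your crawler's ‘URL list’ mode.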

You can also cross-reference data from Search Console: in the Coverage report, look at the URLs reported as ‘Submitted URL marked 'noindex'’ or ‘Excluded by 'noindex' tag’ and check if they appear in your sitemap. If so, you have an active conflict. For medium-sized sites (5,000 to 50,000 pages), expect to spend 2 to 4 hours on a complete audit, time that will save you months of wasted crawl budget.
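The cross-check itself is a set intersection. The sketch below assumes you have exported the noindex pages from Search Console as a CSV with a column named `URL`. That column name is an assumption, so adjust `url_column` to match your actual export. The function name `urls_in_conflict` is illustrative as well.

```python
import csv
import io


def urls_in_conflict(sitemap_urls, exported_csv_text, url_column="URL"):
    """Return the sorted URLs present in both the sitemap and the noindex export."""
    reader = csv.DictReader(io.StringIO(exported_csv_text))
    excluded = {row[url_column].strip() for row in reader if row.get(url_column)}
    submitted = {u.strip() for u in sitemap_urls}
    # Any URL in both sets is submitted for indexing and excluded by noindex.
    return sorted(excluded & submitted)
```

An empty result means the export and the sitemap do not overlap. Any URL returned is an active conflict to resolve.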

What corrective actions can be taken immediately?

Two options: either you remove the noindex URLs from your sitemap (quickest solution), or you remove the noindex directive if these pages should indeed be indexed. In 90% of cases, the first option applies: pagination pages, e-commerce filters, tags, or internal search results have no place in a sitemap.
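For a static sitemap file, the first option can be scripted. This is a sketch under the assumption that you already have the list of conflicting URLs. The function name `remove_urls` is hypothetical. It handles plain `<urlset>` sitemaps only, not sitemap indexes or extension namespaces such as images or hreflang.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Keep the sitemap namespace as the default one when serializing.
ET.register_namespace("", NS)


def remove_urls(sitemap_xml, urls_to_drop):
    """Return the sitemap XML without the <url> entries listed in urls_to_drop."""
    root = ET.fromstring(sitemap_xml)
    drop = {u.strip() for u in urls_to_drop}
    # Iterate over a copy because entries are removed from root in the loop.
    for url_el in list(root.findall(f"{{{NS}}}url")):
        loc = url_el.find(f"{{{NS}}}loc")
        if loc is not None and loc.text and loc.text.strip() in drop:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")
```

On a dynamically generated sitemap, the durable fix is in the generation rules themselves. A post-processing step like this one only masks the conflict until the next regeneration.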

Once corrected, resubmit your sitemap via Search Console and monitor the coverage report for 2 to 3 weeks. Google should gradually reduce the number of pages with errors. If the problem persists, check that no cache or CDN rule is serving an outdated version of your sitemap.

How can I prevent this type of error in the future?

Automate validation. If you generate your sitemap dynamically (via a CMS, plugin, or script), add a verification step that crawls each URL before inclusion and checks for the absence of exclusion directives. Some tools like Sitebulb or Botify allow you to automate this verification in pre-production.
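One possible shape for such a verification step, sketched in Python: the generator receives candidate URLs and a check function, and only writes the URLs that pass. The names `build_sitemap` and `is_indexable` are hypothetical. In practice `is_indexable` would fetch the page and look for a noindex in the meta robots tag and the X-Robots-Tag header. Here it is left pluggable so the logic can be tested without network access.

```python
from xml.sax.saxutils import escape


def build_sitemap(candidates, is_indexable):
    """Build sitemap XML from the candidates that pass is_indexable.

    Returns (xml_text, rejected) so rejected URLs can be logged or reviewed.
    """
    included, rejected = [], []
    for url in candidates:
        (included if is_indexable(url) else rejected).append(url)

    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    # escape() turns & into &amp;, which the sitemap protocol requires.
    lines += [f"  <url><loc>{escape(u)}</loc></url>" for u in included]
    lines.append("</urlset>")
    return "\n".join(lines), rejected
```

Returning the rejected list rather than silently dropping URLs is a deliberate choice: a sudden jump in rejections after a deployment is a useful early warning that the robots directives have changed.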

Then, institutionalize a quarterly audit of sitemap/robots consistency, especially after a migration, redesign, or the addition of new features. Document inclusion/exclusion rules in your technical documentation to prevent developers or external agencies from breaking the existing logic. If you’re managing a complex site with multiple teams, these optimizations can become hard to orchestrate alone: hiring a specialized SEO agency can help you structure a robust validation process and provide personalized support for the technical governance of your sitemaps.

Export all URLs from the sitemap and crawl them to detect noindex directives.
Cross-reference Search Console data (pages excluded by noindex) with the sitemap content.
Remove noindex URLs from the sitemap or lift the directive as applicable.
Resubmit the sitemap via Search Console and monitor progress.
Automate pre-inclusion validation in sitemap generation processes.
Plan a quarterly audit of sitemap/robots consistency, especially post-migration.

Ultimately, this error is easy to fix but destructive if ignored. A sitemap polluted with noindex pages unnecessarily consumes crawl budget, slows the indexing of new strategic pages, and sends a signal of poor technical governance to Google. A consistency audit every 3 months and automated pre-inclusion validation are enough to eliminate this risk permanently.

❓ Frequently Asked Questions

Can a noindex page still be crawled by Google?

Yes. A noindex page can be crawled and can even show up in server logs, but it will never be indexed or visible in search results. Google may keep visiting it to check whether the directive has changed.

If I remove the noindex but leave the URL in the sitemap, how long until it is indexed?

It depends on your site's crawl frequency. On an active site, expect 2 to 7 days. On a low-authority or rarely updated site, it can take several weeks.

Is the X-Robots-Tag HTTP header treated the same way as the meta robots tag?

Yes, Google treats the two as equivalent directives. An X-Robots-Tag: noindex in the HTTP header has exactly the same effect as a meta robots noindex tag in the HTML.

Can a noindex, nofollow page sit in the sitemap without risk?

No, the problem is the same. Whether it carries noindex alone or noindex, nofollow, a page listed in the sitemap creates a conflict and consumes crawl budget needlessly.

Should canonicalized pages be listed in the sitemap?

No, only the canonical URL should be listed in the sitemap. Including a non-canonical URL causes a similar error: Google will crawl it but not index it, wasting crawl budget.
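The same audit logic extends to canonicals. As a rough sketch, the Python snippet below reads the first `<link rel="canonical">` of a page and reports whether the page points to itself. The function name `is_self_canonical` is hypothetical. The comparison is an exact string match, with no normalization of trailing slashes, protocol, or relative URLs, which a real audit would need to handle.

```python
from html.parser import HTMLParser


class _CanonicalParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> encountered."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = {k: (v or "") for k, v in attrs}
        if "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href", "").strip() or None


def is_self_canonical(url, html):
    """Return True if the page declares no canonical or declares itself."""
    parser = _CanonicalParser()
    parser.feed(html)
    return parser.canonical is None or parser.canonical == url
```

Sitemap URLs for which this returns False are candidates for removal, since they point Google to a different canonical URL.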
