
Official statement

When many URLs fall under the 'discovered, currently not indexed' category, it means that Google has crawled the site and seen these URLs, but is not convinced that indexing them will provide value to users. The emphasis should be on quality over quantity, not on technical aspects. Client-side rendering is generally not the issue.
🎥 Source video (timestamp 37:28): extracted from a Google Search Central video
⏱ 1h02 💬 EN 📅 04/12/2020 ✂ 15 statements
Watch on YouTube (37:28) →
Other statements from this video (14)
  1. 2:04 Can anti-ad-block scripts sabotage your canonicalization?
  2. 3:37 Trailing slashes in URLs: should you really worry about them for SEO?
  3. 6:26 Are Core Updates really isolated from Google's other algorithmic changes?
  4. 13:13 How does Google really analyze the anchor text of your backlinks?
  5. 14:08 Why does my site swing between the top 3 and page 4 without stabilizing?
  6. 20:09 Do keyword TLDs (.seo, .shop, .paris) really boost your rankings?
  7. 22:05 Do third-party reviews displayed on your site really improve your organic rankings?
  8. 23:08 Does passage ranking really change the game for long-form content?
  9. 36:40 Does social traffic really have zero impact on Google rankings?
  10. 38:02 Is partial indexing of your site really normal?
  11. 39:52 Should you use the change-of-address tool to move from m. to www.?
  12. 41:08 Should you really ignore Schema.org properties not documented by Google?
  13. 42:28 Does mobile-friendliness really have objective, measurable criteria?
  14. 55:36 How does Google group your pages to measure Core Web Vitals?
📅 Official statement from 04/12/2020 (5 years ago)
TL;DR

Google crawls many URLs without ever indexing them. When a page lands in the 'discovered, currently not indexed' status, it signals insufficient perceived quality, not a technical issue. Focus on the user value of each piece of content rather than multiplying pages or chasing a hypothetical server-side indexing problem.

What you need to understand

What does 'discovered, currently not indexed' really mean?

This status appears in Search Console when Googlebot has explored a URL and extracted its content, but deliberately decides not to include it in the index. This is not a bug; it is an active decision by the engine.

Unlike 404 errors or robots.txt blocks, these URLs are technically accessible. Google has read and analyzed them, but deemed them low-priority or insufficiently distinctive to earn a spot in the global index.

Why is this a quality signal rather than a technical one?

Mueller emphasizes a crucial point: the problem is not your infrastructure. Your server responds, your HTML is clean, your canonical tags work. The engine is simply telling you that these pages bring nothing new or useful to users.

This amounts to a kind of algorithmic quality rating. Google has a limited crawl budget and an index that cannot accommodate every URL on the web: it prioritizes those that stand a genuine chance of satisfying a query.

Client-side rendering, often blamed, is usually not the culprit. If Google crawls the URL and sees it, it has rendered the JavaScript. The refusal to index occurs after this step, during the automated editorial evaluation.

What types of content end up in this category?

Typically: auto-generated pages with little unique text, paginated archives left without a noindex, permanently out-of-stock product pages, or near-duplicate single-keyword landing pages.

You will also find e-commerce filter URLs (color + size + brand + price) creating near-infinite combinations, or syndicated content that already exists elsewhere on the web with more authority. Google has already indexed the same information through a source it deems more reliable.
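
To get a feel for how fast faceted navigation explodes, here is a small illustrative Python sketch; the facet values are hypothetical, but the arithmetic is the point:

```python
from itertools import product

# Hypothetical facets on a single e-commerce category page
colors = ["red", "blue", "black", "white", "green"]   # 5 values
sizes = ["xs", "s", "m", "l", "xl", "xxl"]            # 6 values
brands = [f"brand-{i}" for i in range(20)]            # 20 values
prices = ["0-25", "25-50", "50-100", "100-plus"]      # 4 price bands

# Every combination becomes a distinct crawlable URL
urls = [
    f"/shoes?color={c}&size={s}&brand={b}&price={p}"
    for c, s, b, p in product(colors, sizes, brands, prices)
]
print(len(urls))  # 5 * 6 * 20 * 4 = 2,400 URLs for one category
```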

  • Editorial signal: Google deems the user value insufficient to justify a presence in the index
  • Not a technical bug: the crawl went fine, and so did JavaScript rendering where applicable
  • Index prioritization: crawl budget and index budget are two distinct things; one does not guarantee the other
  • Quality > quantity: multiplying URLs does not improve your SEO if they are redundant or low-value
  • Content typically affected: auto-generated pages, combinatorial filters, duplicate or thin content, archives with no added value

SEO Expert opinion

Is this statement consistent with observed practices?

In practice, yes — partially. Sites that trim their weak URLs often see an improvement in their overall visibility. Removing 10,000 indexed pages that have no traffic can indeed boost the 1,000 truly strategic ones.

But there's a blind spot: Google says nothing about the reconsideration delay. Can a URL classified as 'discovered, not indexed' rise into the index if the content improves, or does it require forcing a re-crawl via Submit URL? [To check]; Mueller remains vague on this point.

What nuances should be added?

Stating that client-side rendering is 'generally not the issue' is cautious, but not absolute. On some JavaScript-heavy sites with aggressive lazy-loading, Googlebot can crawl the URL without extracting every content block. The Search Console status will say 'crawled', but how much of the page was actually read remains unclear.
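
One way to sanity-check this on your own pages is to compare the raw HTML with the DOM after JavaScript execution. A minimal sketch assuming Playwright is installed (`pip install playwright`, then `playwright install chromium`); the URL is a placeholder:

```python
import urllib.request
from playwright.sync_api import sync_playwright

URL = "https://example.com/some-page"  # placeholder

# 1. Raw HTML, as a crawler first fetches it
raw_html = urllib.request.urlopen(URL).read().decode("utf-8", errors="replace")

# 2. DOM after JavaScript and lazy-loading have run
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto(URL, wait_until="networkidle")
    rendered_html = page.content()
    browser.close()

# A large gap hints at content that only exists after rendering
print(f"raw: {len(raw_html)} chars / rendered: {len(rendered_html)} chars")
```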

Additionally, the quality perceived by Google is not always objective. Niche expert content with few monthly searches may be deemed 'irrelevant' by the algorithm even though it would perfectly satisfy the few users it concerns. Quantitative filters sometimes overshadow qualitative relevance.

Finally, Mueller draws too binary a contrast between quality and technical issues. A site with catastrophic internal linking will see certain URLs discovered but not indexed simply because no internal links point to them, so zero PageRank is distributed to them. This is both a perceived-quality issue (low authority) and an architecture issue (structural invisibility).

In what cases does this rule not apply?

If you're launching a new site with 50 well-crafted foundational pages and all land as 'discovered not indexed', this is probably not a quality issue — it's the lack of external trust signals (backlinks, mentions, domain history).

Similarly, on a news site publishing 20 articles a day, some URLs may remain temporarily unindexed simply because the daily crawl budget is saturated. Here, it is indeed a technical capacity problem, not an editorial judgment.

Warning: do not confuse 'discovered, currently not indexed' with 'crawled, currently not indexed'. The latter status covers URLs that were crawled but then excluded by a noindex, a canonical pointing elsewhere, or a soft 404; there, the cause is technical.

Practical impact and recommendations

What should you do concretely in response to this status?

First, audit the list of affected URLs in Search Console. Export it and group by page type (category, product, blog, filter...). Identify patterns: is an entire page type being rejected, or only isolated pages?
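
A minimal sketch of that grouping step, assuming you exported the affected URLs from Search Console as a CSV with a "URL" column (the file name is a placeholder):

```python
import csv
from collections import Counter
from urllib.parse import urlparse

# Placeholder export: one URL per row under a "URL" column
with open("discovered_not_indexed.csv", newline="") as f:
    urls = [row["URL"] for row in csv.DictReader(f)]

# Group by first path segment as a rough proxy for page type
# (/product/..., /blog/..., /category/...)
patterns = Counter(
    urlparse(u).path.strip("/").split("/")[0] or "(root)"
    for u in urls
)

for segment, count in patterns.most_common():
    print(f"/{segment}/: {count} URLs")
```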

If these are strategic pages (key product pages, pillar articles), enhance their unique content and internal linking. Add FAQ sections, usage guides, comparisons — in short, anything that differentiates this page from a simple description copied from the supplier.

If, on the contrary, these are low-value URLs (old monthly archives, exotic combinatorial filters), block them properly: robots.txt, noindex, or pure deletion if they add nothing. Free up crawl budget for the pages that matter.
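
Once the blocking rules are deployed, you can verify them with Python's standard library; the domain and the sample URLs below are illustrative:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")  # placeholder domain
rp.read()

# Hypothetical low-value URLs you intended to block
candidates = [
    "https://example.com/shop?color=red&size=m&brand=x&price=0-25",
    "https://example.com/archive/2014/03/",
]

for url in candidates:
    blocked = not rp.can_fetch("Googlebot", url)
    print(f"{'BLOCKED' if blocked else 'ALLOWED'}  {url}")
```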

What mistakes should absolutely be avoided?

Do not multiply pages in hopes of seeing a few indexed by chance. Google does not operate on raw volume — 10,000 mediocre URLs are not worth 100 solid URLs. You saturate your crawl budget and dilute your authority.

Also avoid technically over-optimizing pages that have no editorial value. Adding an XML sitemap, boosting load speed, or refining structured data will change nothing if the content is empty. Google is clearly telling you that the problem lies elsewhere.

Finally, don't count on time to resolve the issue. A URL in 'discovered not indexed' for 6 months will not magically flip into the index without any changes on your part. This status is stable by default.

How can you verify that your actions are bearing fruit?

Monitor the status evolution in Search Console over 4 to 8 weeks after modifications. If you have enriched 50 key product pages, check how many move to 'indexed' in the coverage reports.

Use the URL Inspection tool to request a manual reindexing of modified strategic pages. This accelerates processing, even though Google is not obliged to follow your request.
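
To monitor this at scale rather than page by page, the Search Console URL Inspection API exposes the same coverage state. A minimal sketch with google-api-python-client, assuming OAuth credentials already authorized for your property; the token file, property, and page list are placeholders:

```python
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

# Assumes a previously authorized token with the Search Console scope
creds = Credentials.from_authorized_user_file(
    "token.json",
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
service = build("searchconsole", "v1", credentials=creds)

SITE = "https://example.com/"                 # placeholder property
pages = ["https://example.com/key-product/"]  # pages you enriched

for url in pages:
    body = {"inspectionUrl": url, "siteUrl": SITE}
    result = service.urlInspection().index().inspect(body=body).execute()
    state = result["inspectionResult"]["indexStatusResult"]["coverageState"]
    print(f"{state}: {url}")
```

Note that this API is read-only: it reports the coverage state but cannot request indexing, which remains a manual action in the Search Console interface.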

  • Export the list of 'discovered not indexed' URLs from Search Console
  • Identify the types of pages involved and prioritize those with high business potential
  • Enhance the unique content, add differentiating blocks (FAQ, reviews, guides)
  • Strengthen internal linking to these pages from already indexed, popular content (see the sketch below)
  • Properly block (noindex, robots.txt) or delete URLs with no strategic value
  • Request reindexing via Search Console for modified pages
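
For the internal-linking item above, a rough orphan-page check can be done with the standard library alone: compare the URLs your sitemap declares against the links actually present on a few popular hub pages. Domain, sitemap path, and hub list are placeholders:

```python
import re
import urllib.request
from urllib.parse import urljoin
from xml.etree import ElementTree

SITE = "https://example.com"           # placeholder domain
HUBS = [f"{SITE}/", f"{SITE}/blog/"]   # popular, already indexed pages

# 1. URLs the site declares in its sitemap
xml = urllib.request.urlopen(f"{SITE}/sitemap.xml").read()
LOC = "{http://www.sitemaps.org/schemas/sitemap/0.9}loc"
declared = {el.text.strip() for el in ElementTree.fromstring(xml).iter(LOC)}

# 2. Internal links actually present on the hub pages
linked = set()
for hub in HUBS:
    html = urllib.request.urlopen(hub).read().decode("utf-8", errors="replace")
    for href in re.findall(r'href="([^"#]+)"', html):
        linked.add(urljoin(hub, href))

# 3. Declared pages with no hub link: likely starved of internal PageRank
for url in sorted(declared - linked):
    print("no hub link:", url)
```
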
Indexing is a privilege granted by Google, not an automatic right. Focus your efforts on creating genuinely useful and distinctive content; that's the only sustainable path.

These editorial and structural optimizations require specialized expertise and regular monitoring. If you lack time or visibility on priorities, consulting a specialized SEO agency can provide an accurate diagnosis and an action plan tailored to your business context.

❓ Frequently Asked Questions

How long does it take for a 'discovered, currently not indexed' URL to flip into the index after improvement?
There is no guaranteed timeframe. On average, allow 2 to 6 weeks if you significantly improve the content and strengthen internal linking. Requesting reindexing via Search Console can speed up the process.
Should I delete all URLs in 'discovered, currently not indexed'?
No. First analyze their strategic value. If they are pages useful to users but judged weak by Google, enrich them. If they add nothing (exotic filters, empty archives), block or delete them.
Can client-side rendering still cause indexing problems?
Rarely. If Google crawls the URL and classifies it as 'discovered', it has rendered the JavaScript. But overly aggressive lazy-loading or JS errors can limit full content extraction; check via the URL Inspection tool.
Does an XML sitemap help get these URLs indexed?
A sitemap makes discovery easier but does not influence the decision to index. If Google has already crawled the URL and decided not to index it, adding it to the sitemap will change nothing without content improvements.
Should you add backlinks to these pages to get them indexed?
Backlinks strengthen perceived authority and can help, but they will not compensate for weak content. Improve the page itself first, then strengthen its internal linking before chasing external links.
🏷 Related Topics: Content · Crawl & Indexing · AI & SEO · JavaScript & Technical SEO · Links & Backlinks · Domain Name
