Official statement
Google never crawls and indexes all pages on a website — this is completely normal. The 'Discovered - currently not indexed' status can persist indefinitely without being a cause for concern. For new sites with large volumes of content, this phenomenon is expected and part of the natural discovery process.
What you need to understand
This statement reminds us of a reality that many SEO professionals forget: Google has never promised to index everything you publish. Crawling and indexing are limited resources, and the search engine makes choices.
Why doesn't Google crawl all your pages?
The crawl budget, the pool of crawling resources Google allocates to each website, is not infinite. Google prioritizes pages it considers important based on several criteria: popularity, freshness, perceived quality, and depth in the site structure.
For a site with 10,000 pages, it is common for only 6,000 to 8,000 to be regularly crawled. The rest wait their turn, sometimes indefinitely.
What does the 'Discovered - currently not indexed' status really mean?
This status appears in Search Console when Google has detected the existence of a URL (via an internal link, sitemap, or external mention) but hasn't deemed it a priority to crawl or index it.
Contrary to what some believe, this is not necessarily a quality issue. It can simply be a trade-off in resource allocation. A page discovered a month ago on a new site will wait its turn, sometimes indefinitely if it sits 4 clicks away from the homepage.
Are new sites particularly affected?
Absolutely. A new site that publishes 500 pages at once will see them indexed progressively over several weeks or even months. Google doesn't immediately trust the site and manages its crawl carefully.
This is where the site earns its crawl budget: by showing that it publishes content people actually visit, by acquiring backlinks, by proving its relevance. Without that, part of the catalog will remain in passive discovery.
- Google prioritizes its crawl resources based on the perceived importance of pages
- The 'Discovered - currently not indexed' status is not a penalty or a systematic signal of poor quality
- New sites undergo an observation phase where indexation is intentionally slowed down
- A page can remain indefinitely discovered without ever being indexed if it doesn't provide differentiated value
- Indexation time depends on link depth, update frequency, and popularity signals
SEO expert opinion
Does this statement match real-world observations?
Yes — and that's even understating it. On e-commerce sites with tens of thousands of product pages, we regularly see 30 to 40% of the catalog remain in passive discovery. And this isn't always quality-related: sometimes these are perfectly valid pages, simply buried 5 clicks deep or with few backlinks.
The problem is that Mueller remains vague about the exact prioritization criteria. We know depth matters, backlinks help, and freshness plays a role, but the thresholds and weightings have to be established empirically on each project, because Google doesn't disclose them.
When should you really worry about the 'Discovered - not indexed' status?
Let's be honest: if your strategic pages — those that should rank and convert — remain stuck in discovery, that's a red flag. Don't panic about peripheral pages (legal notices in PDF format, 2015 blog archives), but a flagship product page remaining unindexed for 3 months? There's an issue.
Common causes: catastrophic internal linking, duplicate or near-duplicate content that triggers URL consolidation, internal cannibalization, or simply a page too thin on unique content to justify indexation.
Does Google provide enough tools to diagnose this problem?
No. Search Console displays the status but never explains why a page remains in discovery. Is it a crawl budget issue? Quality? Depth? Duplication? You have to guess.
This is where server log analysis becomes essential. If Googlebot never visits certain sections, the problem is structural — linking architecture, robots.txt, misplaced nofollow tags. If Googlebot visits but doesn't index, it's a quality or relevance signal.
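As a concrete starting point, here is a minimal Python sketch of that log check, assuming a combined-format access log where the user agent is the last quoted field and the first path segment identifies a site section. Keep in mind the user-agent string can be spoofed, so a production analysis should also verify Googlebot hits via reverse DNS.

```python
# Minimal sketch: count Googlebot hits per site section in an access log.
# Assumes the combined log format, where the user agent is the last
# quoted field. User agents can be spoofed: verify real Googlebot hits
# via reverse DNS before drawing conclusions.
import re
from collections import Counter

LINE_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" .* "(?P<ua>[^"]*)"$')

def googlebot_hits_by_section(log_path):
    hits = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = LINE_RE.search(line)
            if not match or "Googlebot" not in match.group("ua"):
                continue
            # First path segment stands in for the site section.
            section = "/" + match.group("path").lstrip("/").split("/", 1)[0]
            hits[section] += 1
    return hits

if __name__ == "__main__":
    for section, count in googlebot_hits_by_section("access.log").most_common():
        print(f"{section}\t{count}")
```

Sections that never appear in this output despite containing strategic pages point to a structural problem rather than a quality one.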
Practical impact and recommendations
What should you do to speed up indexation of strategic pages?
First priority: reduce link depth. If your important pages are 4-5 clicks from the homepage, Google considers them secondary. Move them up in the information architecture, add links from the main navigation or high-crawl pages.
Second lever: improve contextual internal linking. A page linked from 10 relevant blog articles with varied anchor text sends a much stronger value signal than an isolated page buried in the sitemap.
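To make these two levers measurable, here is a minimal Python sketch, assuming you can export internal links as (source, target) pairs from a crawler; it computes each page's click depth from the homepage and its internal inlink count. The sample edges are illustrative.

```python
# Minimal sketch: click depth from the homepage and internal inlink
# counts, computed from (source, target) link pairs exported by a
# crawler. The sample edges below are illustrative.
from collections import Counter, deque

def depth_and_inlinks(edges, homepage="/"):
    graph, inlinks = {}, Counter()
    for source, target in edges:
        graph.setdefault(source, set()).add(target)
        inlinks[target] += 1
    depth = {homepage: 0}          # BFS outward from the homepage
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for neighbor in graph.get(page, ()):
            if neighbor not in depth:
                depth[neighbor] = depth[page] + 1
                queue.append(neighbor)
    return depth, inlinks

edges = [("/", "/blog/"), ("/blog/", "/blog/post-1"), ("/blog/post-1", "/products/p42")]
depth, inlinks = depth_and_inlinks(edges)
print(depth["/products/p42"], inlinks["/products/p42"])  # -> 3 1
```

Strategic pages that come out deeper than 3 clicks, or with only a handful of inlinks, are the first candidates for the restructuring described above.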
Third lever — and this is often overlooked: clean up unnecessary pages. If your site contains 5,000 URLs with 2,000 adding no value (archives, faceted filters with no content, old unoptimized landing pages), you dilute your crawl budget. Noindex, 404, or consolidate them.
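For the cleanup lever, one common implementation is an X-Robots-Tag response header on low-value URLs. Here is a minimal sketch assuming a Flask application where faceted filters are identified by query parameters; the parameter names are illustrative.

```python
# Minimal sketch, assuming a Flask app: send "X-Robots-Tag: noindex" on
# faceted filter URLs so they stop consuming indexation while the links
# they contain keep being followed. Parameter names are illustrative.
from flask import Flask, request

app = Flask(__name__)
FACET_PARAMS = {"color", "size", "sort"}  # hypothetical filter parameters

@app.after_request
def noindex_faceted_urls(response):
    if FACET_PARAMS & set(request.args):
        response.headers["X-Robots-Tag"] = "noindex, follow"
    return response
```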
How do you know if the problem is crawl budget or quality?
Analyze your server logs. If Googlebot never visits certain sections, it's a crawl budget or structure issue. If Googlebot visits every week but still doesn't index, it's a quality signal.
Also test forced indexation via the URL Inspection tool in Search Console (Request Indexing). If Google consistently refuses, it is judging the page as not relevant enough: thin content, duplication, cannibalization.
What mistakes should you absolutely avoid?
Don't overwhelm Google with sitemaps of 50,000 URLs where half are worthless. Google will crawl some, discover many pages are weak, and reduce your overall crawl budget.
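By way of illustration, here is a minimal Python sketch that writes a sitemap restricted to priority URLs; the filter shown is deliberately crude and stands in for whatever indexability and value rules you apply on your own site.

```python
# Minimal sketch: write a sitemap limited to priority, indexable URLs.
# The filter below is deliberately crude; replace it with your own
# indexability and value rules.
from xml.sax.saxutils import escape

def write_sitemap(urls, path="sitemap.xml"):
    with open(path, "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for url in urls:
            f.write(f"  <url><loc>{escape(url)}</loc></url>\n")
        f.write("</urlset>\n")

all_urls = ["https://example.com/products/p42", "https://example.com/?color=red"]
write_sitemap(url for url in all_urls if "?" not in url)  # drop faceted URLs
```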
Don't create generic content just to fill pages. A product page with 30 words of copy lifted from a supplier is far more likely to stay stuck in discovery than a page with 300 unique, well-structured words.
Avoid flat architectures with everything 1 click away: that doesn't work either. Google needs semantic hierarchy to understand what's prioritized.
- Internal linking audit: verify that strategic pages are at most 3 clicks from the homepage
- Log analysis to identify sections never or rarely crawled
- Cleanup of useless URLs: noindex, 404, or consolidation of pages with no added value
- Content enrichment for pages stuck in 'Discovered - not indexed' if they're strategic
- Sitemap optimization: submit only truly priority URLs
- Monthly tracking of indexation rate by page type in Search Console (see the sketch after this list)
- Test forced indexation to diagnose qualitative rejection vs. simple delay
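For that monthly tracking, the Search Console URL Inspection API can pull the coverage state of a URL sample programmatically. Here is a minimal sketch assuming a service account with access to the property; the credentials file, property URL, and sample URLs are placeholders, and the API is quota-limited (around 2,000 inspections per day per property at the time of writing).

```python
# Minimal sketch: aggregate Search Console coverage states per page type.
# Placeholders: credentials.json, the property URL, and the sampled URLs.
from collections import Counter
from google.oauth2 import service_account
from googleapiclient.discovery import build

SITE = "https://example.com/"  # your verified Search Console property
creds = service_account.Credentials.from_service_account_file(
    "credentials.json",
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
service = build("searchconsole", "v1", credentials=creds)

def coverage_by_type(sample):
    """sample maps a page type to a list of URLs to inspect."""
    report = {}
    for page_type, urls in sample.items():
        states = Counter()
        for url in urls:
            result = service.urlInspection().index().inspect(
                body={"inspectionUrl": url, "siteUrl": SITE}
            ).execute()
            states[result["inspectionResult"]["indexStatusResult"]["coverageState"]] += 1
        report[page_type] = states
    return report

print(coverage_by_type({"product": ["https://example.com/products/p42"]}))
```

Tracking the share of 'Discovered - currently not indexed' per template month over month tells you whether your fixes are actually moving pages into the index.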
❓ Frequently Asked Questions
How long can the 'Discovered - currently not indexed' status last?
Should you remove 'Discovered - not indexed' pages from your sitemap?
How long should a new site expect to wait before all its pages are indexed?
Is crawl budget the only factor behind this phenomenon?
How do you force Google to index a page stuck in 'Discovered - not indexed'?