Official statement
Google may decide not to process URLs it has discovered if it detects a recurring pattern of low-quality content on those URLs. These pages remain stuck in 'Discovered' status in Search Console: known to Google, but never crawled or indexed. It's a form of silent penalty that doesn't affect the entire site, only the sections identified as problematic.
What you need to understand
What does the 'Discovered' status actually mean?
In Google Search Console, the status 'Discovered – currently not indexed' indicates that Googlebot knows a URL exists — through an internal link, sitemap, or redirect — but has chosen not to crawl or index it.
This isn't a technical bug. It's an algorithmic decision: Google estimates that crawl budget would be better spent elsewhere. When this status affects dozens or hundreds of URLs with a common pattern (same URL structure, same content type), it's rarely coincidental.
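Rather than checking URLs one by one in Search Console, the coverage state can also be read programmatically through the Search Console URL Inspection API. The sketch below is a minimal Python example; it assumes the property is verified in your Search Console account and that an OAuth token with the webmasters scope is available in ACCESS_TOKEN (both are assumptions for illustration, not something covered in the video).

```python
import requests

# Assumptions: the property is verified in Search Console and ACCESS_TOKEN
# holds a valid OAuth 2.0 token with the webmasters (read) scope.
ACCESS_TOKEN = "ya29.example-token"
SITE_URL = "https://www.example.com/"          # property as declared in Search Console
PAGE_URL = "https://www.example.com/tag/foo/"  # URL whose coverage state we want

resp = requests.post(
    "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json={"inspectionUrl": PAGE_URL, "siteUrl": SITE_URL},
)
resp.raise_for_status()

# coverageState is the human-readable status shown in Search Console,
# e.g. "Discovered - currently not indexed" or "Crawled - currently not indexed".
result = resp.json()["inspectionResult"]["indexStatusResult"]
print(result.get("coverageState"), "/", result.get("verdict"))
```

Looping this call over a sample of URLs from the same section is a quick way to confirm that an entire pattern, and not just a few stray pages, is being ignored.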
How does Google detect a low-quality pattern?
Google doesn't explicitly say, but we can piece together several signals: high bounce rate on these pages if they were initially crawled, absence of backlinks, duplicate or thin content, excessive crawl time for perceived low return.
The algorithm learns quickly. If the first 10 URLs in a given category are judged uninteresting, Google can decide to skip the next 1000 that share the same structure. It's an economic extrapolation: why crawl what will probably never rank well?
What types of content are targeted as a priority?
- Auto-generated tag pages without editorial curation
- Product pages for out-of-stock or no-longer-maintained items (e-commerce)
- Indexable internal search results pages
- Date-based archives without added value
- Unmoderated user-generated content (forums, reviews)
- Parameterized variants of the same page (filters, sort orders)
SEO Expert opinion
Is this statement consistent with real-world observations?
Absolutely. For years, we've seen sites with thousands of URLs in 'Discovered' status that never move, even after repeated sitemap submissions. What's new here is that Martin Splitt confirms what many suspected: this isn't a crawl capacity issue, it's a deliberate choice by Google based on a detected pattern.
What's interesting — and frustrating — is that Google doesn't tell you which specific pattern it detected. You have to guess. Is it the URL structure? The content? User behavior? Probably a mix. [To verify]: no official data on the precise thresholds or criteria.
In which cases does this rule not apply?
If your site has strong authority and a history of quality content, Google will be more tolerant. An established media outlet can afford some weak sections without everything being boycotted. A small, new site, however, cannot.
Another case: URLs that are strategically linked and have earned external backlinks will probably escape this trap, even if they share a pattern with other ignored pages. Google weighs its decisions.
Should this be seen as a penalty?
Not in the classical sense. It's not a manual action, and it doesn't affect the entire site. But it's still a sanction: Google is implicitly telling you that you're generating too many pages without value and it's going to sort things out for you.
The real problem? You lose control. It's impossible to know precisely which URLs are blacklisted or to force a re-evaluation easily. Google leaves you in the dark, which is frustrating for an SEO who likes to keep a firm hand on their indexation.
Practical impact and recommendations
What should you do if hundreds of URLs are stuck in 'Discovered' status?
First, identify the pattern. Export your 'Discovered' status URLs from Search Console, group them by structure (regex, prefix, category). Look at what they have in common: URL, content, age, internal linking.
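A few lines of Python are enough to surface the dominant patterns in that export. The sketch below assumes the affected URLs have been saved to a plain text file, one URL per line; the file name is purely illustrative.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumption: discovered_urls.txt is a one-URL-per-line export of the
# 'Discovered - currently not indexed' report from Search Console.
with open("discovered_urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

# Group by the first path segment (/tag/, /product/, /search/, ...) to see
# which sections concentrate the ignored URLs.
patterns = Counter()
for url in urls:
    path = urlparse(url).path or "/"
    segments = [s for s in path.split("/") if s]
    prefix = f"/{segments[0]}/" if segments else "/"
    patterns[prefix] += 1

for prefix, count in patterns.most_common(10):
    print(f"{count:6d}  {prefix}")
```

If one or two prefixes account for the bulk of the stuck URLs, you have almost certainly found the pattern Google is reacting to.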
Then, ask yourself the real question: do these pages really deserve to be indexed? Let's be honest, in 70% of cases, the answer is no. If it's low-quality auto-generated content, it's better to noindex it properly or delete it. Google is doing you a favor by not indexing them — don't force it to.
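If you decide to de-index a section rather than improve it, the cleanest signal is a noindex directive served on the pages themselves (robots meta tag or X-Robots-Tag header), not a robots.txt block, which would stop Google from ever seeing the directive. As a purely illustrative sketch, assuming a Python/Flask stack (not something discussed in the video), the header can be attached to a whole family of thin pages like this:

```python
from flask import Flask, make_response

app = Flask(__name__)

# Assumption: /tag/<name>/ is the auto-generated, low-value section we want
# to keep out of the index. Serving "X-Robots-Tag: noindex" tells Google not
# to index these pages while still letting it crawl and read the directive.
@app.route("/tag/<name>/")
def tag_page(name):
    resp = make_response(f"<h1>Tag: {name}</h1>")
    resp.headers["X-Robots-Tag"] = "noindex, follow"
    return resp

if __name__ == "__main__":
    app.run(debug=True)
```

On a CMS, the equivalent is usually a template-level robots meta tag for the whole section; the point is to apply the directive per pattern, not per URL.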
How do you restart indexation of legitimate pages?
If the content is genuinely good but ignored by association, there are several levers to pull:
- Substantially enrich the content (not just 50 more words, real editorial value)
- Improve internal linking from already crawled and trusted pages
- Change the URL structure to break the suspicious pattern (clean 301 redirect)
- Obtain external backlinks to some of these pages to signal their value
- Manually submit a small batch via the Indexing API (not the sitemap — too passive)
Don't submit 500 URLs at once. Start with 10 to 20 of the best ones, improved and reinforced. If Google indexes them, it means the quality signal got through. Then roll out the rest progressively.
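For the manual submission step, the Indexing API exposes a simple publish endpoint. Keep in mind that Google officially documents it for job-posting and livestream pages, so using it here follows the recommendation above rather than an officially supported workflow. The sketch below is a minimal Python example, assuming a service account with the indexing scope whose access token is already available in ACCESS_TOKEN; the batch of URLs is purely illustrative.

```python
import time
import requests

# Assumptions: ACCESS_TOKEN is a valid OAuth 2.0 token for a service account
# with the https://www.googleapis.com/auth/indexing scope, and that service
# account is an owner of the Search Console property.
ACCESS_TOKEN = "ya29.example-token"
BATCH = [
    "https://www.example.com/guide/page-1/",
    "https://www.example.com/guide/page-2/",
]  # 10-20 reinforced URLs, not the whole backlog

ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

for url in BATCH:
    resp = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        json={"url": url, "type": "URL_UPDATED"},
    )
    print(url, resp.status_code)
    time.sleep(1)  # small pause; there is no need to hammer the quota
```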
What mistakes must you absolutely avoid?
Don't attempt to force indexation via automated ping tools or by spamming the API. Google detects these manipulations, and they will only make the problem worse.
Don't leave thousands of URLs stuck in 'Discovered' status indefinitely. It pollutes your crawl budget and sends negative signals about site governance. Better to have a 1,000-page site that's all well-indexed than a 10,000-page site where 8,000 are ignored.
❓ Frequently Asked Questions
How long does it take for Google to re-evaluate a pattern of ignored URLs?
Does the 'Discovered' status affect the rest of the site?
Can you force indexation with the URL Inspection tool?
Are sitemaps still useful in this context?
Should you delete or noindex pages stuck in 'Discovered' status?