Why does Google suddenly show more non-indexed URLs in Search Console?

Official statement

Google now prominently displays discovered but non-indexed URLs in Search Console. This is not a change in indexing itself but in the way this information is reported. Google has always been selective and cannot index the entire web.

🎥 Source video

Extracted from a Google Search Central video (statement at 1:06)

Duration 37:34 · Language: EN · Published 12/06/2020 · 18 statements

Other statements from this video (17)

3:11 Crawl budget: why does Google crawl only a fraction of your known pages?
5:17 Core Web Vitals: why are your lab tests useless for ranking?
9:30 Is the site really held responsible, SEO-wise, for user-generated content?
11:03 Should you really include all your pages in a general sitemap?
12:05 Does crawl budget vary depending on the origin of the content?
13:08 Does Googlebot send an HTTP referrer when crawling your site?
14:09 Does image quality really influence ranking in Google web search?
18:15 How does Google really assess the importance of your pages through internal linking?
20:19 Why can a well-ranked site lose relevance without having made any mistake?
21:53 Are Core Web Vitals really a ranking factor or just a smokescreen?
22:57 Does Discover really work without strict technical criteria?
25:02 Can removing pages from a sitemap limit their crawling by Google?
27:08 Should you really use unavailable_after to manage temporary content?
30:11 Does structured data really influence ranking in Google?
31:45 Why does Google sometimes index your AMP pages before their canonical HTML version?
33:52 Are Core Web Vitals really decisive for Google ranking?
35:51 Does Google really see content loaded dynamically after a user click?

What you need to understand

Has Google changed its indexing criteria?

No. The indexing algorithm hasn't changed a bit. What John Mueller clarifies is that the change only concerns the visibility of data in Search Console. In other words, the URLs that Google discovers but chooses not to index have always existed—they just weren't as prominent in the reports.

Before this reporting update, many SEOs saw only the tip of the iceberg. Now, Search Console explicitly shows the discovered but excluded URLs. This isn't a problem in itself; Google has simply decided to be more transparent about its selective filtering.

What does it really mean when we say “Google cannot index the entire web”?

Google crawls billions of pages, but indexing is resource-intensive (storage, computation, relevance). Hence, it filters. Some URLs are discovered (via a sitemap, an internal link, or a backlink) but deemed irrelevant, duplicated, too low in quality, or simply unnecessary for its users.

What Google calls “being selective” is actually a constant balancing act between crawl budget, duplicate content, thin content, and canonicalization. A discovered page is not an indexed page, and many sites overlook this. Seeing these non-indexed URLs in Search Console is just Google finally showing you what it chose to leave out.

Should we be worried about the surge in non-indexed URLs?

Let's be honest: if you see a spike of several thousand discovered but non-indexed URLs, your first reaction is panic. But before you break everything, ask yourself the question: did these URLs really deserve to be indexed?

In many cases, these pages are stray URL parameters, poorly managed e-commerce filters, runaway pagination, or WordPress archives that no one bothered to exclude correctly. If Google discovers them but doesn't index them, it might just be doing its job well. The problem arises when strategic pages end up in the same group, and that is when you need to dig deeper.

Search Console reporting is more transparent, but indexing itself hasn't changed.
Google has always been selective: discovering a URL does not guarantee its indexing.
Seeing non-indexed URLs isn't necessarily an alarm signal—it depends on which ones.
Analyzing the nature of these URLs is essential before panicking or completely overhauling everything.
If strategic pages are excluded, that's where you need to investigate (quality, duplication, canonicalization, robots.txt, noindex).

SEO Expert opinion

Is this statement consistent with what we observe in the field?

Yes, and it’s even reassuring. For years, we’ve known that Google crawls far more than it indexes. Search Console reports have always been partial on this point: some exclusion signals were vague, while others were completely absent. This reporting update merely confirms what we were already seeing in server logs—hundreds, if not thousands, of URLs crawled but never indexed.

What changes now is that Google is putting it right in front of you. Previously, you had to cross-reference logs, sitemaps, GSC reports, and sometimes third-party tools to understand. Now, it's clearly displayed. And that’s a good thing—it forces you to clean up, prioritize, and stop throwing sitemaps of 50,000 URLs where half of them are useless.

What nuances should we add to this statement?

John Mueller says, “Google has always been selective.” That's true. But to what extent? And on what criteria? This is where things get blurry. Google never explicitly states why a given URL is discovered but not indexed. Sometimes it’s obvious (duplicate, thin content); other times, it’s opaque (perceived quality, page authority, thematic context).

[To be checked]: Google claims this change does not affect indexing, but we have seen “reporting adjustments” coincide with indexing fluctuations before. We’ll need to monitor if sites see a real drop in indexed URLs in the weeks to come. This would align with a tightening of the crawl budget or a hardening of quality criteria—but Google will never explicitly say so.

In which cases does this rule not apply or pose problems?

If you have a clean, well-structured site with a nice sitemap and clear strategic URLs, this reporting change shouldn’t affect anything. You may see a few excluded URLs, but nothing alarming. However, if you're managing an e-commerce site with thousands of product variations, dynamic filters, or a media site with poorly managed archives, brace yourself for a shock.

The issue arises when important pages end up non-indexed for unclear reasons. Then you must investigate: content quality, internal duplication, sloppy canonicalization, robots.txt blocking, accidental noindex, or simply a lack of authority on the page. And that’s where it gets tricky—because Google will never tell you precisely why.

Warning: if you see strategic pages (top product pages, main SEO landing pages) among the discovered but non-indexed URLs, don’t just “force” indexing through the submission tool. Dig into the root cause; otherwise, Google is likely to drop them again at the next crawl.
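
Two of the causes listed above, robots.txt blocking and an accidental noindex, can be ruled out mechanically before you look at quality or authority. The following is a minimal sketch using only the Python standard library; the function names, the sample rules, and the example.com URLs are illustrative assumptions, not something taken from the video. Note that Python's robots.txt parser does not reproduce every detail of Google's matching rules (wildcards in particular), and that a noindex sent through an X-Robots-Tag HTTP header is not covered here.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in {"robots", "googlebot"}:
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str) -> bool:
    """True if the page's HTML carries a noindex robots meta directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def blocked_by_robots(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """True if the given robots.txt text disallows `url` for `agent`."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not rules.can_fetch(agent, url)
```

Both helpers work on text you have already fetched (the robots.txt file and the page source), so they can be run against a saved crawl without sending new requests to the site.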

Practical impact and recommendations

What should you do concretely with these non-indexed URLs?

First step: audit the nature of these URLs. Go to Search Console, export the list of discovered but non-indexed URLs, and see what it contains. You’ll often find stray URL parameters (?sort=, ?color=), runaway pagination (/page/42/), empty categories, and low-value WordPress tags. If that’s the case, don’t panic; just exclude them properly.
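
For an export of several thousand URLs, a first rough sort can be scripted. The sketch below buckets URLs by pattern so that only the remainder needs a manual look. The parameter names, the /page/N/ pattern, and the /tag/ and /author/ prefixes are assumptions based on the examples in this article; adjust them to your own URL scheme.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Hypothetical query parameters treated as sort/filter noise.
NOISE_PARAMS = {"sort", "color", "size", "order", "filter"}
# Paginated listings such as /blog/page/42/.
PAGINATION_RE = re.compile(r"/page/\d+/?$")
# Archive-style paths, assuming a WordPress-like /tag/ or /author/ layout.
ARCHIVE_RE = re.compile(r"^/(tag|author)/")


def classify(url: str) -> str:
    """Return a rough bucket name for one URL from the export."""
    parts = urlsplit(url)
    params = parse_qs(parts.query, keep_blank_values=True)
    if NOISE_PARAMS.intersection(params):
        return "parameter"
    if PAGINATION_RE.search(parts.path):
        return "pagination"
    if ARCHIVE_RE.match(parts.path):
        return "archive"
    return "review"  # possibly strategic: check these by hand


def summarize(urls):
    """Count how many URLs fall into each bucket."""
    return Counter(classify(u) for u in urls)
```

Calling summarize() on the exported list gives a count per bucket. Whatever lands in "review" is the short list to compare against the pages you actually want indexed.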

Next, isolate the URLs that should be indexed: product pages, in-depth articles, SEO landing pages. If they’re on the list, that’s where you need to act: check content quality, fix duplication, strengthen internal linking, add internal links from strategic pages, or simply improve relevance.

What mistakes should be avoided in response to this reporting change?

Big mistake number one: panicking and submitting everything for indexing through the GSC tool. This is pointless. If Google has deemed a URL not relevant enough, forcing it won’t change anything in the long run. At best, it will be temporarily indexed and then re-excluded. At worst, you’ll spam Google with unnecessary requests and degrade your crawl budget.

Big mistake number two: completely ignoring this data. Yes, it’s just reporting. But if you have thousands of discovered non-indexed URLs, it likely signals a structural problem: polluted sitemap, poorly structured hierarchy, massive duplicate content, or failing canonicalization. This is an opportunity to clean up—not to sweep things under the rug.

How to check if my site is managing this indexing filtering well?

Start by cross-referencing Search Console with your server logs. Look at which URLs Googlebot crawls but does not index. If they're useless pages, great. If they're strategic pages, corrections are needed. Next, check your sitemap: remove any URLs you don’t want to be indexed (yes, it sounds stupid, but many just throw everything in).
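
The log cross-check can be sketched as follows, assuming access logs in the common "combined" format and a list of non-indexed URLs exported from Search Console. The regular expression and function names are illustrative assumptions. Matching on the user-agent string alone is also a simplification: the string can be spoofed, and Google recommends verifying Googlebot by reverse DNS lookup.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Request line and user-agent fields of a combined-format access log entry.
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def googlebot_hits(log_lines):
    """Count requests per path whose user agent claims to be Googlebot."""
    hits = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if match and "Googlebot" in match.group("ua"):
            hits[match.group("path")] += 1
    return hits


def crawled_but_not_indexed(log_lines, non_indexed_urls):
    """Return {path: hit count} for non-indexed URLs that Googlebot requested."""
    hits = googlebot_hits(log_lines)
    paths = set()
    for url in non_indexed_urls:
        parts = urlsplit(url)
        # Logs record path plus query string, not the full URL.
        paths.add(parts.path + ("?" + parts.query if parts.query else ""))
    return {path: hits[path] for path in paths if hits[path] > 0}
```

URLs that are absent from the result were listed by Search Console but never requested by Googlebot in the log window; URLs with many hits and no indexing are the ones consuming crawl activity for nothing.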

Then, work on the internal quality and authority of pages you want to index. Strong internal linking, unique and substantial content, clean canonicalization, no accidental noindex. And above all, stop creating URLs like crazy—every additional URL dilutes your crawl budget and authority.

Export the list of non-indexed discovered URLs from Search Console
Identify the irrelevant URLs (parameters, filters, pagination) and exclude them properly (robots.txt, noindex, canonical)
Spot the non-indexed strategic pages and investigate the cause (quality, duplication, weak internal linking)
Clean the sitemap: submit only the URLs you genuinely want to index
Enhance internal linking and authority of priority pages
Monitor the evolution of the volume of non-indexed URLs over several weeks to detect trends
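
The sitemap cleanup step in the checklist above can be sketched with the standard library XML parser. This assumes a regular sitemap file (not a sitemap index) using the sitemaps.org namespace; the function name and the `unwanted_urls` argument are illustrative. The list passed in should contain only URLs you have decided not to index. Strategic pages that are currently non-indexed belong in the sitemap and should stay there.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def prune_sitemap(sitemap_xml: str, unwanted_urls) -> str:
    """Return the sitemap XML without <url> entries whose <loc> is unwanted."""
    # Serialise with a default namespace instead of an auto-generated prefix.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    unwanted = set(unwanted_urls)
    # Iterate over a copy because entries are removed during the loop.
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if loc in unwanted:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")
```

The output can be written back to disk and resubmitted in Search Console once you have checked that only the intended entries were dropped.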

This reporting change isn't a catastrophe, but a signal. If you see a surge in non-indexed URLs, it’s an opportunity to clean up, prioritize your SEO efforts, and understand how Google perceives your site. However, pinpointing exactly why certain strategic pages are excluded may require specialized skills and advanced tools. If important pages still aren’t getting indexed, or if the scale of the cleanup overwhelms you, consulting a specialized SEO agency can save you time and prevent costly mistakes, especially if your site handles thousands of URLs.

❓ Frequently Asked Questions

Does this reporting change mean that Google indexes fewer pages than before?

No. Google states that indexing itself has not changed; only the visibility of this data in Search Console has. Discovered but non-indexed URLs already existed. You simply see them more clearly now.

Should I force indexing of discovered but non-indexed URLs through the GSC submission tool?

No, unless you are certain that these URLs deserve to be indexed and you have fixed the cause of their exclusion. Forcing indexing without fixing the underlying problem achieves nothing in the long run.

How do I know whether non-indexed URLs are really a problem for my SEO?

Analyze the nature of these URLs. If they are parameters, filters, or unnecessary pagination, there is nothing to worry about. If they are product pages, strategic articles, or key landing pages, you need to investigate and fix the cause.

Should I remove these non-indexed URLs from my sitemap?

Yes, for the ones you do not want indexed. A sitemap should contain only the URLs you want indexed. Submitting URLs that Google discards only wastes crawl budget and muddies the signals.

Can this change affect my SEO traffic in the short term?

Normally not, since Google states that indexing itself has not changed. Still, monitor your rankings and traffic over the coming weeks: some sites could see fluctuations if Google adjusts its quality or crawl budget criteria at the same time.
