Official statement

Google now prominently displays discovered but non-indexed URLs in Search Console. This is not a change in indexing itself but in the way this information is reported. Google has always been selective and cannot index the entire web.
Source: extracted from a Google Search Central video (duration 37:34, in English, published 12/06/2020, 18 statements). This statement appears at 1:06.

TL;DR

Google hasn't changed its indexing algorithm; only the reporting in Search Console has changed. Discovered but non-indexed URLs are now more visible in the interface. In practical terms, you'll see a volume of excluded URLs that you may not have noticed before, but that already existed in Google's pipeline. Don't panic: this isn't a degradation of your indexing. Google is simply showing you what it was already ignoring.

What you need to understand

Has Google changed its indexing criteria?

No. The indexing algorithm hasn't changed a bit. What John Mueller clarifies is that the change only concerns the visibility of data in Search Console. In other words, the URLs that Google discovers but chooses not to index have always existed—they just weren't as prominent in the reports.

Before this reporting update, many SEOs only saw part of the iceberg. Now, Search Console explicitly shows the discovered but excluded URLs. This isn't a problem in itself; it's just that Google decided to be more transparent about its selective filtering.

What does it really mean when we say “Google cannot index the entire web”?

Google crawls billions of pages, but indexing is resource-intensive (storage, computation, relevance evaluation), so it filters. Some URLs are discovered (via a sitemap, an internal link, or a backlink) but deemed irrelevant, duplicated, too low in quality, or simply not useful to its users.

What Google calls “being selective” is actually a constant balancing act between crawl budget, duplicate content, thin content, and canonicalization. A discovered page is not an indexed page, and many sites overlook this. Seeing these non-indexed URLs in Search Console simply means Google is finally showing you what it chose to leave out.

Should we be worried about the surge in non-indexed URLs?

Let's be honest: if you see a spike of several thousand discovered but non-indexed URLs, your first reaction is panic. But before you break everything, ask yourself the question: did these URLs really deserve to be indexed?

In many cases, these pages are stray URL parameters, poorly managed e-commerce filters, runaway pagination, or WordPress archives that no one bothered to exclude correctly. If Google discovers them but doesn't index them, it might just be doing its job well. The problem arises when strategic pages end up in the same bucket, and that is when you need to dig deeper.

  • Search Console reporting is more transparent, but indexing itself hasn't changed.
  • Google has always been selective: discovering a URL does not guarantee its indexing.
  • Seeing non-indexed URLs isn't necessarily an alarm signal—it depends on which ones.
  • Analyzing the nature of these URLs is essential before panicking or completely overhauling everything.
  • If strategic pages are excluded, that's where you need to investigate (quality, duplication, canonicalization, robots.txt, noindex).

SEO Expert opinion

Is this statement consistent with what we observe in the field?

Yes, and it’s even reassuring. For years, we’ve known that Google crawls far more than it indexes. Search Console reports have always been partial on this point: some exclusion signals were vague, while others were completely absent. This reporting update merely confirms what we were already seeing in server logs—hundreds, if not thousands, of URLs crawled but never indexed.

What changes now is that Google is putting it right in front of you. Previously, you had to cross-reference logs, sitemaps, GSC reports, and sometimes third-party tools to understand. Now, it's clearly displayed. And that’s a good thing—it forces you to clean up, prioritize, and stop throwing sitemaps of 50,000 URLs where half of them are useless.

What nuances should we add to this statement?

John Mueller says, “Google has always been selective.” That's true. But to what extent? And on what criteria? This is where things get vague. Google never explicitly states why a certain URL is discovered but not indexed. Sometimes it’s obvious (duplicate, thin content); other times, it’s opaque (perceived quality, page authority, thematic context).

[To be checked]: Google claims this change does not affect indexing, but we have seen “reporting adjustments” coincide with indexing fluctuations before. We’ll need to monitor if sites see a real drop in indexed URLs in the weeks to come. This would align with a tightening of the crawl budget or a hardening of quality criteria—but Google will never explicitly say so.

In which cases does this rule not apply or pose problems?

If you have a clean, well-structured site with a nice sitemap and clear strategic URLs, this reporting change shouldn’t affect anything. You may see a few excluded URLs, but nothing alarming. However, if you're managing an e-commerce site with thousands of product variations, dynamic filters, or a media site with poorly managed archives, brace yourself for a shock.

The issue arises when important pages end up non-indexed for unclear reasons. Then you must investigate: content quality, internal duplication, sloppy canonicalization, robots.txt blocking, accidental noindex, or simply a lack of authority on the page. And that’s where it gets tricky—because Google will never tell you precisely why.

Warning: if you see strategic pages (top product pages, main SEO landing pages) among the discovered but non-indexed URLs, don’t just “force” indexing through the submission tool. Dig into the root cause; otherwise, Google is likely to discard them again at the next crawl.

Practical impact and recommendations

What should you do concretely with these non-indexed URLs?

First step: audit the nature of these URLs. Go to Search Console, export the list of discovered but non-indexed URLs, and see what lies beneath. You’ll often find annoying URL parameters (?sort=, ?color=), wild pagination (/page/42/), empty categories, and worthless WordPress tags. If that’s the case, don’t panic—just exclude them properly.
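As a rough illustration of this first pass, the exported URLs can be bucketed automatically before anyone reviews them by hand. The sketch below is only one possible approach: the parameter names in `FILTER_PARAMS`, the `/page/N/` pattern, and the bucket labels are assumptions taken from the examples above, not anything Search Console provides.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Hypothetical filter/sort parameters; replace with the ones your site uses.
FILTER_PARAMS = {"sort", "color"}
# Assumed pagination pattern, e.g. /page/42/
PAGINATION_RE = re.compile(r"/page/\d+/?$")


def classify_url(url: str) -> str:
    """Return a coarse bucket name for one exported URL."""
    parts = urlsplit(url)
    # keep_blank_values=True so that "?sort=" still counts as a parameter.
    params = parse_qs(parts.query, keep_blank_values=True)
    if FILTER_PARAMS.intersection(params):
        return "filter_parameter"
    if params:
        return "other_parameter"
    if PAGINATION_RE.search(parts.path):
        return "pagination"
    # Anything else may be a real page and deserves a manual look.
    return "review_manually"


def summarize(urls):
    """Count how many URLs fall into each bucket."""
    return Counter(classify_url(u) for u in urls)
```

Calling `summarize()` on the exported list gives a quick sense of how much of the volume is parameter or pagination noise and how much lands in `review_manually`, which is where strategic pages would show up.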

Next, isolate the URLs that should be indexed: product pages, in-depth articles, SEO landing pages. If they’re on the list, that’s where you need to act: check content quality, fix duplication, strengthen internal linking from your strategic pages, or simply improve relevance.

What mistakes should be avoided in response to this reporting change?

Big mistake number one: panicking and submitting everything for indexing through the GSC tool. This is pointless. If Google has deemed a URL not relevant enough, forcing it won’t change anything in the long run. At best, it will be temporarily indexed and then re-excluded. At worst, you’ll spam Google with unnecessary requests and degrade your crawl budget.

Big mistake number two: completely ignoring this data. Yes, it’s just reporting. But if you have thousands of discovered non-indexed URLs, it likely signals a structural problem: polluted sitemap, poorly structured hierarchy, massive duplicate content, or failing canonicalization. This is an opportunity to clean up—not to sweep things under the rug.

How to check if my site is managing this indexing filtering well?

Start by cross-referencing Search Console with your server logs. Look at which URLs Googlebot crawls but does not index. If they're useless pages, great. If they're strategic pages, corrections are needed. Next, check your sitemap: remove any URLs you don’t want to be indexed (yes, it sounds stupid, but many just throw everything in).
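The log cross-check can be sketched as follows. This is a minimal example under several assumptions: the access log is in the common "combined" format, the non-indexed list has been reduced to URL paths, and `strategic_prefixes` is a hypothetical list of path prefixes you consider important. Matching on the user-agent string alone can be fooled by spoofed bots, so treat the result as indicative unless the requesting IPs are also verified (for example with a reverse DNS lookup).

```python
import re

# Matches the request, status, size, referrer and user agent of a
# combined-format access log line (assumed format).
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_paths(log_lines):
    """Collect the paths requested by clients identifying as Googlebot."""
    paths = set()
    for line in log_lines:
        match = LOG_RE.search(line)
        if match and "Googlebot" in match.group("agent"):
            paths.add(match.group("path"))
    return paths


def crawled_but_not_indexed(log_lines, non_indexed_paths, strategic_prefixes):
    """Split crawled-but-not-indexed paths into strategic pages and noise."""
    overlap = googlebot_paths(log_lines) & set(non_indexed_paths)
    prefixes = tuple(strategic_prefixes)
    strategic = sorted(p for p in overlap if p.startswith(prefixes))
    noise = sorted(overlap - set(strategic))
    return strategic, noise
```

The `strategic` list is the one to investigate first; the `noise` list is a candidate for exclusion and for removal from the sitemap.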

Then, work on the internal quality and authority of pages you want to index. Strong internal linking, unique and substantial content, clean canonicalization, no accidental noindex. And above all, stop creating URLs like crazy—every additional URL dilutes your crawl budget and authority.

  • Export the list of non-indexed discovered URLs from Search Console
  • Identify the irrelevant URLs (parameters, filters, pagination) and exclude them properly (robots.txt, noindex, canonical)
  • Spot the non-indexed strategic pages and investigate the cause (quality, duplication, weak internal linking)
  • Clean the sitemap: submit only the URLs you genuinely want to index
  • Enhance internal linking and authority of priority pages
  • Monitor the evolution of the volume of non-indexed URLs over several weeks to detect trends
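
The sitemap cleanup step in the checklist above could look like the sketch below. It assumes a standard `urlset` sitemap using the sitemaps.org namespace and a set of URLs (`unwanted`, a hypothetical name) that you have already decided should not be indexed. It only removes entries from the sitemap; it does not add noindex or robots.txt rules.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def prune_sitemap(xml_text: str, unwanted: set) -> str:
    """Return the sitemap XML without the <url> entries listed in `unwanted`."""
    # Keep the default namespace unprefixed when serializing.
    ET.register_namespace("", NS)
    root = ET.fromstring(xml_text)
    # Iterate over a copy of the list because entries are removed in the loop.
    for url_el in list(root.findall(f"{{{NS}}}url")):
        loc = (url_el.findtext(f"{{{NS}}}loc") or "").strip()
        if loc in unwanted:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")
```

Running this against the current sitemap and the list of parameter, filter, and pagination URLs leaves a file that contains only the pages you actually want indexed.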

This reporting change isn't a catastrophe, but a signal. If you see a surge in non-indexed URLs, it’s an opportunity to clean up, prioritize your SEO efforts, and understand how Google perceives your site. However, pinpointing exactly why certain strategic pages are excluded may require specialized skills and advanced tools. If you find that important pages aren’t getting indexed, or if the scale of the cleanup overwhelms you, consulting a specialized SEO agency can save you time and prevent costly mistakes, especially if your site handles thousands of URLs.

❓ Frequently Asked Questions

Does this reporting change mean that Google indexes fewer pages than before?
No. Google states that indexing itself has not changed; only the visibility of this data in Search Console has. Discovered but non-indexed URLs already existed, and you can simply see them better now.
Should I force indexing of discovered but non-indexed URLs through the GSC submission tool?
No, unless you are certain these URLs deserve to be indexed and you have fixed the cause of their exclusion. Forcing indexing without fixing the underlying problem achieves nothing in the long run.
How do I know whether non-indexed URLs are really a problem for my SEO?
Analyze the nature of these URLs. If they are parameters, filters, or useless pagination, there is nothing to worry about. If they are product pages, strategic articles, or key landing pages, you need to investigate and fix the issue.
Should I remove these non-indexed URLs from my sitemap?
Yes, absolutely. A sitemap should contain only the URLs you want indexed. Submitting URLs that Google discards only wastes crawl budget and muddies your signals.
Can this change affect my SEO traffic in the short term?
Normally not, since Google states that indexing itself has not changed. But keep an eye on your rankings and traffic over the coming weeks: some sites could see fluctuations if Google adjusts its quality or crawl budget criteria at the same time.