Why does the site: query display URLs that Google doesn't rank in the SERPs?

Official statement

A site: query can sometimes show several versions of a page, even though Google's index only includes one for ranking in the SERPs. Using the info: query can help verify which URL is actually indexed.

Extracted from a Google Search Central video

⏱ 1h00 💬 EN 📅 21/04/2015 ✂ 23 statements

Official statement from April 21, 2015 (11 years ago)

⚠ A more recent statement exists on this topic: "Should You Really Consolidate Pages Ranking for the Same Queries?" (Gary Illyes, June 13, 2023).

TL;DR

The site: query sometimes shows multiple versions of the same page, while Google only indexes one for ranking. This difference often creates confusion between what you see in a diagnostic search and what contributes to the ranking. The info: query allows you to check which URL Google considers canonical for search results.

What you need to understand

What is the difference between what site: shows and the ranking index?

The command site:yourdomain.com queries a broader database than the index used to generate traditional SERPs. Google discovers, crawls, and stores many versions of a page: HTTP vs HTTPS, with or without www, with URL parameters, AMP versions, etc.

When you launch a site: query, you are actually looking at the discovery index, not necessarily the final ranking index. Google may display 3, 4, or 5 different versions of the same page in site: results, while only one will be kept for ranking on a regular user query.
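To get a feel for how several URLs in a site: listing can collapse into one page, here is a minimal sketch that groups URL variants by a normalized key. It is only a rough heuristic and not Google's canonicalization logic: the IGNORED_PARAMS set, the function names, and the normalization rules (scheme, www prefix, trailing slash) are assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical list of parameters treated as non-content (tracking, sorting).
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sort", "sessionid"}


def variant_key(url: str) -> str:
    """Collapse scheme, www prefix, trailing slash, and ignored parameters."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS]
    query = urlencode(sorted(kept))
    return host + path + ("?" + query if query else "")


def group_variants(urls):
    """Map each normalized key to the list of raw URLs that share it."""
    groups = defaultdict(list)
    for url in urls:
        groups[variant_key(url)].append(url)
    return dict(groups)
```

Run on URLs copied from a site: listing, any group with more than one entry is a candidate for the "several versions, one ranking URL" situation described above.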

How does the info: query actually work?

The command info:exact-URL directly queries the ranking index. If Google returns a different page from the one you submitted, it means the engine has canonicalized the URL to another version.

For example: you type info:yoursite.com/page-a, and Google responds with yoursite.com/page-b. This means that page-b is the canonical URL retained, the one that participates in the ranking. Page-a exists in the discovery index and may appear in site:, but it will never rank.

Why does this distinction pose a problem for SEOs?

Many SEO audits rely on site: to count indexed pages. The number obtained is often inflated: you see 1200 indexed pages, while only 800 actually contribute to the ranking. The other 400 are duplicates, variations that Google knows but dismisses.
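The arithmetic behind that example can be written out as a small helper. The function name and the returned keys are hypothetical; the figures in the usage note are the ones from the paragraph above.

```python
def inflation(site_count: int, ranking_count: int) -> dict:
    """Quantify how much a site: count overstates the ranking inventory."""
    extra = site_count - ranking_count
    return {
        "non_ranking": extra,                       # URLs known but not ranking
        "share_of_site_count": extra / site_count,  # fraction of the site: figure
        "overestimate": extra / ranking_count,      # inflation relative to reality
    }
```

With inflation(1200, 800), 400 URLs do not rank: one third of the site: figure, or a 50% overestimate of the real inventory.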

This confusion leads to erroneous diagnostics. You might think you have a massive indexing problem because site: displays hundreds of 'duplicate' URLs, while Google is already managing canonicalization internally. Conversely, you might believe that an important page is indexed because it appears in site:, but it will never rank if it is not in the ranking index.

site: queries the discovery index, which is broad and inclusive
info: queries the ranking index, restricted to canonical URLs
A URL visible in site: is not necessarily the one that will rank in the SERPs
Discrepancies between site: and info: reveal Google's canonicalization choices
An audit based solely on site: overestimates the number of pages actually active for SEO

SEO Expert opinion

Is this explanation consistent with what is observed in the field?

Absolutely. For years, practitioners have noticed massive discrepancies between site: and third-party tools like Screaming Frog or Ahrefs. A manual analysis with info: consistently reveals that Google has canonicalized to a unique version, even if site: shows 4-5 variants.

The problem is that Google has never officially documented this nuance before. Many junior SEOs (and some not-so-junior ones) continue to rely blindly on site: for auditing indexing. The result: client reports full of false positives, unnecessary recommendations to 'deindex duplicates' when Google is already managing them.

What limitations should be placed on the use of info:?

The info: command is not scalable. You cannot check 500 URLs one by one. It is a spot diagnostic tool, useful for resolving doubts about a strategic page, not for auditing an entire site.

Furthermore, info: does not always work consistently. For certain URLs, Google returns a blank or ambiguous response. [To be verified]: the reliability of info: seems to vary based on the freshness of the crawl and the complexity of the site's canonicalization rules. If you have multiple redirect chains or contradictory canonical tags, info: may give unstable results.

In what cases does this distinction become critical?

On e-commerce sites with filters and URL parameters, the difference between the discovery index and the ranking index can be enormous. You may have 50,000 crawlable URLs, site: displays 35,000, but only 8,000 actually rank. If you base your crawl budget or pagination strategy on site: figures, you are allocating resources to ghost URLs.

Another case: multilingual or multi-currency sites. Google may display 6 versions of a product page in site: (fr, en, de, with or without currency parameter), but only one will be canonical for each language. If you don't use info: to validate, you risk optimizing the wrong URLs.

Warning: Do not confuse 'indexed' with 'rankable'. A URL may exist in Google's index (thus appearing in site:) without ever ranking on any query if it is canonicalized to another version. Info: tells you which URL Google considers as the source of truth.

Practical impact and recommendations

How can you concretely check which URL Google uses for ranking?

First step: identify your business-critical strategic pages. For each one, launch an info:exact-URL query. If Google returns a different URL, that is the one carrying the ranking. Note the discrepancy and analyze its cause: a 301 redirect, a canonical tag, or a URL parameter handled by Google?
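When investigating why Google returned a different URL, one of the first things to look at is the canonical tag the page itself declares. The sketch below extracts it from an HTML string using only the standard library; the class and function names are hypothetical. Keep in mind that a declared canonical is a hint, so the URL Google retains may still differ.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> encountered."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr.get("href")


def declared_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

If declared_canonical returns page-b for the HTML of page-a, the discrepancy seen with info: comes from your own markup rather than from a choice made by Google.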

Second step: cross-reference with Search Console. The ‘Coverage’ tab > ‘Excluded’ lists the URLs that were discovered but not indexed for ranking. Compare this list with the site: results: anything that appears in site: but not among the indexed URLs in Search Console is probably only in the discovery index.
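This comparison is a plain set difference. A minimal sketch, assuming you have already collected both lists by hand or from an export (the function name and dictionary keys are hypothetical):

```python
def compare_sources(site_urls, gsc_indexed_urls):
    """Split URLs by whether they appear in site:, Search Console, or both."""
    site_set, gsc_set = set(site_urls), set(gsc_indexed_urls)
    return {
        "only_in_site": sorted(site_set - gsc_set),  # likely discovery index only
        "only_in_gsc": sorted(gsc_set - site_set),   # indexed but absent from site:
        "in_both": sorted(site_set & gsc_set),
    }
```

The "only_in_site" bucket is the one to review first: those URLs are visible in site: but Search Console does not report them as indexed.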

What mistakes should you absolutely avoid?

Never base client reporting solely on site:. The number is misleading. If you announce '1200 indexed pages' based on site:, you are overestimating the real inventory. Instead, use Search Console as the source of truth for rankable URLs.

Another common mistake is to panic over 'duplicates' in site: and ask the developer to add noindex everywhere. Google is already managing canonicalization. Before messing with the code, check with info: if those duplicates actually participate in ranking. Nine times out of ten, they are already excluded.

What audit strategy should you adopt in light of this complexity?

Implement a three-layer audit: (1) complete technical crawl with Screaming Frog to list all crawlable URLs, (2) Search Console extraction for indexed URLs according to Google, (3) strategic sampling with info: on key pages to validate canonicalization.

This approach avoids false positives and gives you a realistic view of the active inventory. You know how many URLs Google knows about (crawl), how many it indexes for ranking (GSC), and which ones it considers canonical (info:). From there, you can prioritize optimizations that will have a real impact on ranking.
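The three layers can be combined into a single per-URL status. The sketch below assumes three inputs you have gathered yourself: the crawl list, the URLs Search Console reports as indexed, and a dictionary of manually recorded info: checks mapping each tested URL to the URL that was returned. The function name and status labels are hypothetical.

```python
def audit(crawled, gsc_indexed, info_checks):
    """Classify each crawled URL using the crawl, GSC, and info: layers."""
    gsc = set(gsc_indexed)
    report = {}
    for url in crawled:
        # An info: check that returned another URL takes precedence.
        if url in info_checks and info_checks[url] != url:
            report[url] = "canonicalized to " + info_checks[url]
        elif url in gsc:
            report[url] = "indexed"
        else:
            report[url] = "crawlable only"
    return report
```

Because info: is only sampled on key pages, most URLs will be classified by the Search Console layer alone; the info: layer overrides it only where you actually recorded a mismatch.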

These cross-checks require time, rigor, and good command of tools. If your internal team lacks bandwidth or expertise in these technical areas, hiring a specialized SEO agency can save you months of missteps. A professional audit will provide you with an accurate mapping of your actual indexing without the approximations generated by simplistic methods based solely on site:.

Test strategic URLs with info: to identify the canonical version retained by Google
Compare site: results with Search Console data to detect discrepancies
Never communicate an indexing number based solely on site: in a client report
Audit canonicalization chains (canonical tags, redirects) before concluding there is an indexing issue
Use a technical crawl + GSC + info: sampling for a complete view of the inventory
Train internal teams on the distinction between the discovery index and the ranking index

Remember this: site: does not reflect the ranking index, only the discovery index. To check which URL will actually rank, use info: on your key pages and cross-reference with Search Console. This distinction radically changes how you audit indexing and prioritize your technical optimizations.

❓ Frequently Asked Questions

Are all the URLs that appear in a site: query actually indexed?

No. The site: query displays the discovery index, which includes URLs known to Google but not retained for ranking. Only some of these URLs actually participate in ranking in the SERPs.

Does the info: command work on all types of URLs?

info: works on most URLs, but it can return empty or inconsistent results for pages with complex canonicalization rules, multiple redirect chains, or a recent, incomplete crawl.

Should I be worried if site: shows several versions of the same page?

Not necessarily. Google naturally discovers multiple variants (HTTP/HTTPS, www/non-www, URL parameters). If you have correctly configured canonical tags and redirects, only one version will be retained for ranking. Check with info: when in doubt.

Does Search Console show the discovery index or the ranking index?

Search Console mainly shows the ranking index in the Coverage > Indexed tab. URLs that were discovered but not indexed appear under Excluded. It is a more reliable source than site: for auditing the active inventory.

How long does it take for a URL to move from the discovery index to the ranking index?

It depends on the crawl budget, the quality of the page, and the clarity of the canonicalization signals. A well-linked page with a clean canonical can move within a few days; an orphaned or ambiguous page can stay in discovery indefinitely without ever ranking.
