
Official statement

You shouldn't rely on the 'site:' command for the exact count of indexed pages; it's better to use data from Search Console.
🎥 Source video

Extracted from a Google Search Central video

⏱ 58:40 💬 EN 📅 30/10/2019 ✂ 13 statements
Watch on YouTube (22:06) →
Other statements from this video (12)
  1. 2:11 Should you optimize your content for BERT, or is it a waste of time?
  2. 3:46 Does YouTube get an SEO advantage in Google Search?
  3. 6:09 Lingering indexing problems: a Google bug or a technical flaw on your site?
  4. 8:54 How does Google really count impressions in Search Console?
  5. 11:36 Do you really need to implement hreflang on every multilingual site?
  6. 18:42 Can you really cheat with structured data to obtain rich snippets?
  7. 28:38 Can non-mobile-friendly pages really survive mobile-first indexing?
  8. 35:51 Is crawl budget really managed at the server level rather than the folder level?
  9. 43:40 Should you block parameterized URLs in robots.txt or via the Search Console settings?
  10. 49:39 Do you really need to "fix" an algorithmic penalty to recover your traffic?
  11. 61:48 Do sitemaps really speed up the indexing of news on Google?
  12. 69:08 Reused content on news sites: what is the real limit before a penalty?
📅 Official statement from 30/10/2019 (6 years ago)
TL;DR

John Mueller states that the site: command only provides a rough estimate of the number of indexed pages, not an exact count. For rigorous monitoring, Search Console remains the go-to tool with its structured and historical data. In practical terms, stop panicking if the site: numbers fluctuate — they only reflect an imprecise snapshot at any given moment.

What you need to understand

Why is the site: command inaccurate?

The site: command queries a partial and volatile index. Google does not guarantee completeness in these results, which vary with the datacenter queried, the time of day, and how recently the index was refreshed. You might see discrepancies of 20 to 40% between two queries a few hours apart, without any pages being added or removed.

This ambiguity is explained by Google's distributed architecture: each datacenter maintains a slightly different version of the index, optimized for search performance, not for accounting. The site: command was never designed as a monitoring tool, but as a quick search filter for users.

What does Search Console offer instead?

The Coverage report (now the Pages report) provides a stabilized count of the URLs known to Google: indexed, excluded, or in error. This data is aggregated across datacenters and updated with a 24-to-48-hour delay, making it reliable for tracking over time.

GSC explicitly distinguishes indexed pages from discovered-but-not-indexed pages, a level of granularity that site: cannot offer. You can cross-check these numbers with your server logs to identify crawled-but-ignored URLs, or orphan pages indexed by accident.
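As an illustration, here is a minimal sketch that tallies the exclusion reasons from a Pages report export. The file name and the column headers ("Reason", "Pages") are assumptions to adapt to the CSV the Export button actually produces:

```python
import csv
from collections import Counter

# Hypothetical file name and column headers ("Reason", "Pages"): adjust to the
# CSV actually produced by the Export button of the GSC Pages report.
EXPORT_FILE = "gsc_pages_report.csv"

reasons = Counter()
with open(EXPORT_FILE, encoding="utf-8") as f:
    for row in csv.DictReader(f):
        reasons[row["Reason"]] += int(row["Pages"])

# Largest exclusion buckets first: canonicalized, noindex, soft 404, etc.
for reason, pages in reasons.most_common():
    print(f"{pages:>8}  {reason}")
```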

In what cases is site: still relevant?

Let's be honest: site: still has its uses for occasional checks. Trying to see whether a freshly published page has entered the index? A site:domain.com/exact-url gives you an immediate answer. Auditing a competitor without access to their GSC? site: becomes your only option.

The problem arises when it becomes a performance indicator. Comparing "1,245 results" yesterday with "1,198 results" today means nothing: it's noise, not signal. For any longitudinal analysis, GSC beats site: hands down.

  • site: provides a volatile estimate, never an exact count
  • The Search Console aggregates stabilized data over 24-48 hours
  • The discrepancies between datacenters explain the fluctuations observed with site:
  • Use site: for occasional checks, not for monitoring
  • Cross-check GSC data with your server logs for a complete view

SEO Expert opinion

Is this statement consistent with field observations?

Absolutely. Any SEO who has monitored site: on a large site has seen the numbers yo-yo for no apparent reason. Fluctuations of 15-30% within a few hours are not uncommon, especially on domains with hundreds of thousands of pages. These variations have no connection to a real change in indexing — it's just the artifact of a command querying a non-unified index.

What’s more interesting is that Mueller does not say "site: is useless". He says, "don’t rely on its count for precise tracking." There's a nuance. For a quick audit, a presence check, or a rough estimate on a competitor, site: remains a valid reflex. The trap is making it a dashboard metric.

What nuances should be added?

GSC itself is not absolute truth either. Data arrives with 24 to 72 hours of latency depending on the property, and some URLs may be marked as "indexed" yet never show up in search, a phenomenon of secondary indexing or post-indexing filtering. [To be verified]: Google has never clarified the difference between "indexed" and "eligible to rank."

Another point: the GSC shows the URLs that Google has chosen to show you. On massive sites with duplicate content or infinite facets, you may have indexed pages that appear neither in GSC nor in site: because they are considered silent duplicates. Always cross-check with logs to capture the real crawl.

In what contexts does this rule not apply?

If you manage a site with fewer than 500 pages, site: may suffice for rough weekly monitoring: the gap will often be under 10 pages, which is still interpretable. On a stable site without daily publication, site: variations smooth out over time.

However, once you exceed 10,000 URLs, or your site generates dynamic content (e-commerce, aggregators, UGC), site: becomes unusable. The numbers vary too much, and you lose all ability to detect a real problem — partial de-indexation, penalties, canonicalization bugs.

Warning: Never base a client report or a strategic decision on site: figures. A client who sees “1,200 indexed pages” on Monday and “950” on Wednesday is going to panic, even though nothing has happened. You lose credibility and create unnecessary noise.

Practical impact and recommendations

What should you do to track your indexing?

Set up Search Console on all versions of your domain (www, non-www, HTTPS) and consolidate them into a domain property if possible. Export the data from the Pages report weekly to build a reliable history — GSC only keeps 16 months of data.
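A minimal sketch of that weekly snapshot, assuming you read the two totals from the Pages report yourself (the counts below are placeholders) and append them to a local history file:

```python
import csv
from datetime import date
from pathlib import Path

# Local history of weekly snapshots, kept beyond the 16 months GSC retains.
HISTORY_FILE = Path("indexing_history.csv")

# Placeholders: read these two totals from the Pages report each week.
indexed_count = 1245
not_indexed_count = 830

is_new = not HISTORY_FILE.exists()
with HISTORY_FILE.open("a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    if is_new:
        writer.writerow(["date", "indexed", "not_indexed"])
    writer.writerow([date.today().isoformat(), indexed_count, not_indexed_count])
```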

Complement this with a server log analysis via Oncrawl, Botify, or a custom script. Cross-check the URLs crawled by Googlebot with those reported as indexed in GSC. You will often discover pages that are crawled daily but never indexed, or URLs that have been indexed but not crawled for months — a sign of a frozen index.
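As a sketch of that cross-check, assuming a combined-format access log and a plain-text export of indexed URLs from GSC (both file names are hypothetical). Note that matching Googlebot by user agent alone can be spoofed, so verify hits by reverse DNS for anything critical:

```python
import re
from urllib.parse import urlparse

ACCESS_LOG = "access.log"            # combined-format server log (assumption)
GSC_EXPORT = "gsc_indexed_urls.txt"  # one indexed URL per line, from the Pages report

# Combined log line: ip - - [date] "METHOD /path HTTP/x" status size "referer" "user-agent"
LOG_PATTERN = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"')

crawled = set()
with open(ACCESS_LOG, encoding="utf-8", errors="replace") as log:
    for line in log:
        m = LOG_PATTERN.search(line)
        if m and "Googlebot" in m.group(2):
            crawled.add(m.group(1).split("?")[0])  # query strings stripped for a coarse comparison

with open(GSC_EXPORT, encoding="utf-8") as export:
    indexed = {urlparse(u.strip()).path or "/" for u in export if u.strip()}

# Crawled but never indexed, or indexed yet no longer crawled.
print("Crawled but not indexed:", sorted(crawled - indexed)[:20])
print("Indexed but not crawled:", sorted(indexed - crawled)[:20])
```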

What mistakes should be avoided in indexing tracking?

Never compare site: figures between two dates to diagnose a problem. If you must use site:, do it for occasional checks: “Is this page present?”, “How many URLs with this parameter are indexed?”. Never for longitudinal tracking.

Also, don’t panic over weekly fluctuations in GSC. Google constantly recrawls and reevaluates — a drop of 2-5% from one week to the next is statistical noise, not an alarm signal. Instead, monitor trends over 4 to 6 weeks.
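A small sketch of that trend check, assuming the indexing_history.csv file built earlier and an arbitrary alert threshold of -10%:

```python
import csv
from statistics import mean

# Compare the mean of the last 4 weekly snapshots with the 4 before them,
# instead of reacting to week-to-week noise.
with open("indexing_history.csv", encoding="utf-8") as f:
    counts = [int(row["indexed"]) for row in csv.DictReader(f)]

if len(counts) < 8:
    print("Not enough history yet (need at least 8 weekly snapshots).")
else:
    recent, previous = mean(counts[-4:]), mean(counts[-8:-4])
    change = (recent - previous) / previous * 100
    print(f"4-week trend: {change:+.1f}%")
    if change <= -10:  # arbitrary threshold, tune to your site
        print("Significant drop: check the Pages report exclusions and your logs.")
```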

How can you ensure no strategic page is missing?

Create a list of your critical URLs (key product pages, campaign landing pages, pillar articles) and check their presence in GSC manually each month. A simple filter on the exact URL in the Pages report will tell you if it is indexed, excluded, or unknown.

If a critical page does not appear, force a crawl via the URL inspection tool and analyze the verdict: blocked by robots.txt? Canonicalized elsewhere? Accidental noindex? The GSC gives you the precise diagnosis — site: leaves you in the dark.
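For critical URLs, the same diagnosis can also be read programmatically through the Search Console URL Inspection API. This is a hedged sketch assuming the google-api-python-client package and a service account that has been added as a user on the property; note that requesting (re)indexing is not exposed by the API and still happens in the UI:

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

# Assumptions: a service account key file, the account added as a user on the
# GSC property, and google-api-python-client installed.
SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
creds = service_account.Credentials.from_service_account_file(
    "service_account.json", scopes=SCOPES
)
service = build("searchconsole", "v1", credentials=creds)

CRITICAL_URLS = ["https://www.example.com/key-landing-page"]  # your own list

for url in CRITICAL_URLS:
    body = {"inspectionUrl": url, "siteUrl": "sc-domain:example.com"}
    result = service.urlInspection().index().inspect(body=body).execute()
    status = result["inspectionResult"]["indexStatusResult"]
    print(url)
    print("  verdict:", status.get("verdict"), "|", status.get("coverageState"))
    print("  robots.txt:", status.get("robotsTxtState"),
          "| indexing:", status.get("indexingState"),
          "| canonical:", status.get("googleCanonical"))
```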

  • Consolidate your GSC properties into a domain property
  • Export the Pages report data weekly for long-term history
  • Cross-check GSC and server logs to detect inconsistencies
  • Never compare two site: queries to diagnose a problem
  • Monitor trends over 4-6 weeks, not weekly variations
  • List your critical URLs and manually check their indexing each month
Rigorous indexing tracking requires a complete technical stack: GSC for the official view, logs for real crawl, and automated monitoring to alert on deviations. These setups demand specialized expertise and ongoing maintenance. If your team lacks internal resources, engaging a specialized SEO agency may be wise — they will structure a monitoring process suited to your scale and respond quickly if anomalies are detected.

❓ Frequently Asked Questions

Why do the results of the site: command change constantly?
The site: command queries different datacenters on each request, and each one maintains a slightly different version of the index. Discrepancies of 20-40% between two queries a few hours apart are normal and do not reflect any real change in indexing.
Does Search Console show all the pages indexed by Google?
GSC displays the URLs that Google considers primary and eligible to rank. Some pages indexed as duplicates or held in a secondary index may not appear. Cross-check with your server logs for an exhaustive view.
Can I still use site: to check the indexing of a specific page?
Yes, site: remains relevant for occasional checks (presence of an exact URL, quick estimate on a competitor). Just avoid turning it into a performance indicator or comparing its figures between two dates.
How long does it take Search Console to reflect an indexing change?
GSC data has a latency of 24 to 72 hours depending on the property. For a real-time check, use the URL inspection tool, which queries the live index immediately.
How do you explain a sudden drop in the number of indexed pages in GSC?
First check the exclusions in the Pages report: canonicalization, accidental noindex, soft 404s, duplicate content. If nothing shows up, cross-check with your logs to detect a possible robots.txt block or a drop in crawl budget.
🏷 Related Topics
Domain Age & History · Crawl & Indexing · Search Console

