Official statement
Other statements from this video
- 2:08 Do you really need to split your sitemaps to manage a site with a high volume of URLs?
- 3:49 How often should you really submit your new URLs to Google via sitemap?
- 4:21 How does the Unavailable After header improve the deindexing of perishable content?
- 15:33 Can machine-translated content really rank without a penalty?
- 26:02 Should you really recycle out-of-stock product URLs to preserve PageRank?
- 28:26 Does Schema.org markup really improve organic search performance?
- 38:36 Why do large site migrations always cause ranking drops?
- 59:03 Do semantic HTML5 tags really affect Google rankings?
Google confirms that the figures displayed in Search Console do not always exactly match the data extracted via the API. The reason: distinct aggregation methods and privacy filtering applied differently depending on the source. For an SEO professional automating reports or cross-referencing sources, this is a crucial point to document for clients to avoid misunderstandings about click or impression volumes.
What you need to understand
What causes this discrepancy between the interface and the API?
Google uses two distinct processing pipelines to display data in the Search Console interface on one side, and to expose it via the API on the other. The interface aggregates data in near real-time with rounding, privacy thresholds, and display optimizations. The API, on the other hand, relies on on-demand queries that trigger their own aggregation calculations.
Privacy filtering plays a central role. When the search volume for a query is too low (generally just a few occurrences per day), Google masks or merges rows to protect user anonymity. This threshold is not applied identically in the interface (which displays aggregated totals) and in the API (which may return filtered or omitted rows).
What do these discrepancies look like in practice?
Typically, you observe differences of a few percent in total clicks or impressions, rarely exceeding 5%. The discrepancies widen further when filtering by query or page — particularly on long-tail queries where privacy filtering is applied row by row.
A common scenario: you export 1,000 query rows via the API, sum the clicks, and obtain 4,820 clicks. In the meantime, the Search Console interface displays 4,987 clicks for the same period and the same site. The 167 missing clicks correspond to queries filtered by the API but aggregated in the interface.
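The arithmetic behind this scenario can be sketched as a small helper that expresses the missing clicks as a share of the interface total, using the figures from the example above:

```python
def relative_gap(api_total: int, interface_total: int) -> float:
    """Share of interface clicks that are missing from the API export."""
    return (interface_total - api_total) / interface_total

# Figures from the example: 4,820 clicks summed from the API export,
# 4,987 clicks shown in the Search Console interface.
gap = relative_gap(4_820, 4_987)
print(f"{4_987 - 4_820} clicks missing ({gap:.1%} of the interface total)")
# 167 clicks missing (3.3% of the interface total)
```

A 3.3% gap like this sits comfortably inside the "few percent" range described above.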
Is either number more "true" than the other?
No. Both are partial views of reality. The interface provides a consolidated total closer to actual traffic (but less granular), while the API offers maximum granularity (but with some rows deleted). Neither provides access to the raw, unfiltered dataset that Google keeps internally.
If your goal is to track trends over time, both sources are reliable as long as you remain consistent. If you want an absolute number to validate an advertising budget or client contract, prefer the interface — and specify this in your reports.
- Two distinct pipelines: the interface and the API do not share exactly the same aggregation logic
- Privacy filtering: low volumes are masked differently depending on the source
- Typical discrepancies: a few percent on totals, more pronounced on long tails
- Neither is false: they are complementary partial views of the same dataset
- Prefer the interface for totals, the API for granularity and automation
SEO Expert opinion
Does this explanation really hold water?
Yes — and it is consistent with what has been observed in the field for years. Teams that automate their reports via the API consistently end up with totals lower than those from the interface. This is not a bug; it’s a design choice.
What is missing from Mueller's statement is a precise quantification of the acceptable discrepancy. Google provides no official threshold such as "expect ±3%" or "if it's more than 10%, contact support." As a result, it's hard to know whether a 15% discrepancy falls within normal filtering or signals a malfunction; it has to be assessed case by case.
Why doesn't Google synchronize the two sources?
Because it would be technically costly and not necessarily useful. The interface is designed for human reading, providing a quick overview. The API is designed for querying by scripts that require granularity. Synchronizing both would require maintaining a single, ultra-heavy pipeline, leading to degraded response times.
The other reason, less publicly stated: by leaving a methodological gray area, Google maintains room to adjust its filtering algorithms without having to publicly document every change. If you cross-reference Search Console, Analytics, and server logs, you’ll find that all three tell three different stories — and that’s intentional.
In what situations do these discrepancies become problematic?
When you bill a client based on SEO results measured in organic clicks. If your dashboard automated via API shows 12,000 clicks and the client sees the interface showing 12,800, you could come off as either amateurish or fraudulent — even though the discrepancy is structural.
Another case: when you cross-reference Search Console with Google Analytics or server logs. The three sources don’t count the same events (GSC counts indexed impressions, GA counts sessions with JavaScript enabled, logs record all HTTP requests). Stacking non-aligned sources without documenting their biases is the best way to make decisions based on flawed data.
Practical impact and recommendations
What should you do concretely in your reporting?
First, choose a reference source and stick to it. If you use the API for automation, specify in all your reports that the figures come from the API and may differ from the interface by a few percent. Add an explicit footnote in your dashboards, such as: "Data extracted via Search Console API — totals may slightly differ from the interface due to privacy filtering applied by Google."
Next, systematically cross-reference with at least one other source (Analytics, server logs, or a custom crawler) to detect anomalies. If the discrepancy between the API and interface exceeds 10%, investigate: it may be a severe filtering issue on a segment of queries or a bug in your processing chain.
What mistakes should you absolutely avoid?
Never mix sources in the same calculation. If you take the total clicks from the interface and divide it by a total of impressions extracted via API, your CTR will be incorrect. Stay consistent: one source for one metric, from end to end.
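A minimal sketch of why mixing sources skews CTR. The click totals reuse the example figures from earlier in the article; the impression totals are illustrative numbers invented for this sketch:

```python
# Totals for the same site and period, as reported by each source.
# Clicks: from the article's example. Impressions: illustrative only.
interface = {"clicks": 4_987, "impressions": 152_000}
api = {"clicks": 4_820, "impressions": 149_400}

ctr_interface = interface["clicks"] / interface["impressions"]  # consistent
ctr_api = api["clicks"] / api["impressions"]                    # consistent
ctr_mixed = interface["clicks"] / api["impressions"]            # inconsistent

print(f"interface CTR: {ctr_interface:.2%}")
print(f"API CTR:       {ctr_api:.2%}")
print(f"mixed CTR:     {ctr_mixed:.2%}  <- overstated, do not report")
```

Because the interface total includes clicks the API filtered out, dividing interface clicks by API impressions systematically inflates the CTR.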
Another classic mistake: assuming that the discrepancy is constant over time. The privacy filtering intensifies as your site gains long-tail queries, or when Google tightens its thresholds (which happens without prior notice). A 2% discrepancy one month may rise to 8% the following month if you’ve tripled your ultra-niche queries.
How do you verify that your setup is correct?
Conduct a monthly reconciliation audit: export the totals via the interface, then via the API, and compare. If the discrepancy remains stable (say between 3% and 7%), you are within the norm. If the gap jumps or reverses, you probably have a configuration issue (incorrect date filtering, mixed property, or expired API token).
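The audit logic above can be sketched as a classifier. The 3–7% band and the 10% alert threshold are the heuristics from this article, not official Google thresholds; the sample figures reuse the 12,000 vs 12,800 client-dashboard example:

```python
def audit_gap(interface_total: int, api_total: int,
              expected_band: tuple = (0.03, 0.07)) -> str:
    """Classify the interface/API click gap for a monthly reconciliation audit.

    The default 3-7% band is this article's heuristic, not an official
    Google threshold.
    """
    gap = (interface_total - api_total) / interface_total
    if gap < 0:
        return "inverted: API above interface, check property and date filters"
    if gap > 0.10:
        return "over 10%: investigate a filtering change or a pipeline bug"
    if expected_band[0] <= gap <= expected_band[1]:
        return "within the usual band"
    return "outside the usual band: monitor next month"

print(audit_gap(12_800, 12_000))  # 800 / 12,800 = 6.25% -> within the usual band
```

Run this once a month on the same date range from both sources, and track the gap itself over time rather than only the totals.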
Also document the filtered volumes: check how many rows the API returns for 1,000 requested rows. If you only receive 600 rows, that means 40% of your queries are below the privacy threshold — a useful indicator to assess the actual granularity of your data.
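That filtered-volume indicator is a one-liner; here it is as a sketch, using the 1,000-requested / 600-returned example from the paragraph above:

```python
def filtered_share(rows_requested: int, rows_returned: int) -> float:
    """Share of requested rows missing from the API response: a rough
    proxy for queries below the privacy threshold."""
    return (rows_requested - rows_returned) / rows_requested

share = filtered_share(1_000, 600)
print(f"{share:.0%} of requested rows were filtered out")
# 40% of requested rows were filtered out
```

Logging this ratio alongside your click totals makes it easy to spot a sudden tightening of the privacy threshold.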
- Choose a reference source (interface or API) and document it in all your reports
- Add an explanatory note on potential discrepancies related to privacy filtering
- Never mix sources in the same calculation (clicks, impressions, CTR)
- Conduct a monthly reconciliation audit to detect anomalies
- If the gap exceeds 10%, investigate: it might be a bug or a filtering change
- Cross-reference with Analytics or server logs to validate overall consistency
❓ Frequently Asked Questions
What is the typical gap between Search Console interface and API data?
Why does the API sometimes return fewer rows than expected?
Should I prefer the interface or the API for client reporting?
Can these gaps affect my SEO ROI calculations?
How do I explain these discrepancies to a non-technical client?
🎥 From the same video
Other SEO insights extracted from this same Google Search Central video · duration 55 min · published on 18/02/2020