Official statement
Google confirms that in the absence of sufficient CrUX data for a page, it may rely on scores from similar pages on the same site, or even the overall domain score if the architecture is complex. For an SEO, this means that a little-visited page may inherit the performance of neighboring pages — an estimation mechanism that can either work in your favor or penalize you depending on the overall state of the site. The recommendation: optimize the performance of the entire domain, not just strategic pages.
What you need to understand
What is CrUX and why do some pages lack data?
The Chrome User Experience Report (CrUX) gathers real performance metrics from users' Chrome browsers. When a page receives little traffic, it does not accumulate enough data to generate a statistically reliable report.
Google therefore needs a solution: either ignore these pages, or estimate their score. The statement confirms that similarity estimation is the chosen method, preventing an entire segment of sites — particularly smaller sites or deep pages — from being entirely excluded from the Core Web Vitals ranking.
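One way to see which side of that line a given URL falls on is to query the CrUX API directly: it returns a record when enough field data exists, and a 404 otherwise. A minimal sketch in Python, assuming a Google Cloud API key with the CrUX API enabled (the example URLs are placeholders):

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder: a Google Cloud key with the CrUX API enabled
ENDPOINT = f"https://chromeuserexperience.googleapis.com/v1/records:queryRecord?key={API_KEY}"

def has_url_level_data(url: str) -> bool:
    """True if CrUX holds enough field data for this specific URL.

    The API answers 404 when the URL's traffic is too low to produce
    a statistically reliable record."""
    resp = requests.post(ENDPOINT, json={"url": url})
    return resp.status_code == 200

# Placeholder URLs: compare a popular page with a deep, little-visited one.
for url in ["https://example.com/", "https://example.com/deep/orphan-page"]:
    status = "URL-level data" if has_url_level_data(url) else "no URL-level data"
    print(f"{url} -> {status}")
```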
How does this similarity estimation work?
Google operates in two steps. First, it looks for similar pages on the same domain with usable CrUX data. Similarity is based on structure, template, loaded resources — not on textual content.
If this approach fails because the site architecture is too complex or the patterns too varied, Google then applies the overall domain score. This is a safety net that ensures a signal, even if approximate.
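Google has not published the actual algorithm, so the sketch below is purely illustrative and every name in it is hypothetical. It only models the two-step fallback just described: real field data first, then template-level peers, then the origin-wide average.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional

# Purely illustrative: Google has not published this algorithm.
# Page, template and crux_score are hypothetical names modeling the
# two-step fallback described above.

@dataclass
class Page:
    url: str
    template: str                # e.g. "product", "blog-post"
    crux_score: Optional[float]  # None = not enough field data

def estimated_score(page: Page, site_pages: list[Page]) -> float:
    # Real field data wins: no estimation needed.
    if page.crux_score is not None:
        return page.crux_score
    # Step 1: similar pages on the same site, matched on structure and
    # template rather than textual content.
    peers = [p.crux_score for p in site_pages
             if p.template == page.template and p.crux_score is not None]
    if peers:
        return mean(peers)
    # Step 2: safety net, the aggregated origin-level score.
    return mean(p.crux_score for p in site_pages if p.crux_score is not None)
```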
Why does this approach pose a problem for SEO practitioners?
The estimation masks real disparities. A poorly optimized orphan page with no traffic can inherit a good score if the rest of the site performs well, and vice versa. This complicates auditing: you no longer know whether the displayed score reflects reality or a smoothed average.
Another issue: sites with heterogeneous architecture (e-commerce with blog sections, interactive tools, product sheets) risk having radically different pages aggregated under a single estimated score, diluting the signal’s relevance.
- CrUX relies on real field data, so certain pages without traffic have no exploitable history.
- Google uses similar pages from the same site to fill the gaps, matching on architecture and resources rather than content.
- If the architecture is too complex, the overall domain score is applied by default.
- This estimation method makes it difficult to accurately identify underperforming pages without direct CrUX data.
- Optimization should therefore aim for overall coherence rather than a page-by-page isolated approach.
SEO Expert opinion
Is this statement consistent with field observations?
On paper, yes. Tools like PageSpeed Insights and Search Console have long displayed scores aggregated by URL group, which points to this estimation mechanism. The problem is that Google never specifies the data threshold required for a page to be considered “sufficiently documented”.
Indeed, we observe cases where orphan pages inherit scores that do not correspond to their technical reality. But without access to the raw CrUX metrics per page, it is impossible to verify whether the estimation works in your favor or hinders you. [To verify]: Google does not publish either the similarity criteria or the respective weight of different metrics in the estimation algorithm.
What nuances should be added to this statement?
The phrase “similar pages” remains vague. Structural similarity, yes, but to what degree? A product page template with 10 different components can show enormous performance variation depending on the images, third-party scripts, and the number of recommended products. The estimation therefore risks smoothing over critical gaps.
A second point: applying the overall site score to a complex architecture means drowning out the specifics. A site with a fast blog and a heavy JavaScript configurator will get an average score that reflects neither one nor the other. For an SEO, this means treating each section as a micro-site with its own performance target.
In what cases does this rule not apply?
If a page has sufficient CrUX data, Google estimates nothing: it uses the real metrics. But the threshold of “insufficiency” is never publicly defined. Empirically, pages with fewer than a few hundred monthly Chrome visits seem to fall under estimation, but this is a field observation, not an official rule.
Another exception: noindex or crawl-blocked pages do not count in the calculation, even if they generate traffic. Google does not include them in the CrUX reports. Finally, very new sites without sufficient history may not have a score at all for several weeks.
Practical impact and recommendations
What concrete steps should be taken to manage this estimation?
First, map your templates. Identify groups of pages that share the same technical structure: product sheets, blog articles, landing pages, category pages. Each group must have homogeneous performance, as Google will likely treat them as a unit.
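As a starting point, a rough template map can be derived from URL paths alone, assuming the first path segment identifies the template; real sites may need CMS metadata or HTML fingerprints instead. A minimal sketch:

```python
from collections import defaultdict
from urllib.parse import urlparse

def group_by_template(urls):
    """Group URLs by their first path segment as a template proxy."""
    groups = defaultdict(list)
    for url in urls:
        path = urlparse(url).path.strip("/")
        template = path.split("/")[0] if path else "home"
        groups[template].append(url)
    return groups

# Placeholder URLs, e.g. pulled from a sitemap or crawl export.
urls = [
    "https://example.com/",
    "https://example.com/blog/core-web-vitals-guide",
    "https://example.com/product/red-widget",
    "https://example.com/product/blue-widget",
]
for template, members in group_by_template(urls).items():
    print(f"{template}: {len(members)} page(s)")
```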
Then, concentrate your optimization efforts on the most critical templates: those that generate SEO traffic or represent a significant volume of pages. If your blog performs poorly, every orphan blog page will inherit that poor score via estimation.
What mistakes to avoid in this estimation context?
Do not assume that a page nobody visits escapes the calculation. If it shares a template with well-crawled pages, it contaminates the group’s estimation. The result: a slow page with no traffic can drag down the score of an entire cluster.
Another trap: optimizing only high-traffic pages. If the rest of the site is a technical disaster, the overall score will remain mediocre and hinder your strategic pages as a side effect. The silo approach no longer works with this aggregation logic.
How can I check if my site benefits from this estimation logic?
In Search Console, open the “Experience” tab > “Core Web Vitals”. Look at the URL groups classified as “Good”, “Needs Improvement”, or “Poor”. If pages with no traffic appear in these reports, they are benefiting from (or suffering from) the estimation.
Then compare with PageSpeed Insights on specific URLs. If PSI shows “No field data available” but Search Console still classifies the page, the estimation is at play. At that point, dig into the technical analysis of the template to understand where loading time goes.
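The PageSpeed Insights API exposes this distinction programmatically: its loadingExperience object carries an origin_fallback flag when URL-level CrUX data was unavailable and origin-level data was substituted. A small check in Python, assuming an API key (the URL is a placeholder):

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
PSI = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

def field_data_source(url: str) -> str:
    data = requests.get(PSI, params={"url": url, "key": API_KEY}).json()
    exp = data.get("loadingExperience", {})
    if not exp.get("metrics"):
        return "no field data at all"
    # PSI sets origin_fallback when it substituted origin-level data
    # because the URL itself lacks sufficient CrUX records.
    if exp.get("origin_fallback"):
        return "origin-level fallback (estimation territory)"
    return "URL-level field data"

print(field_data_source("https://example.com/deep/orphan-page"))  # placeholder URL
```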
- Map the page templates and measure the performance of each group using Lighthouse or WebPageTest.
- Prioritize the optimization of strategic page clusters (high volume, high SEO traffic) to avoid contamination by estimation.
- Check in the Search Console if pages without CrUX data still appear in Core Web Vitals reports.
- Monitor the evolution of the overall domain score in CrUX via BigQuery to anticipate aggregation impacts (see the query sketch after this list).
- Avoid overly heterogeneous architectures: the more distinct your templates are technically, the blurrier the overall estimation will be.
- Test orphaned or little-visited pages with synthetic tools (Lighthouse) to detect discrepancies between technical reality and displayed estimated score.
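For the BigQuery monitoring mentioned in the list above, here is a sketch against the public chrome-ux-report dataset using the google-cloud-bigquery client. It computes the share of real page loads with a “good” LCP (under 2.5 s) for an entire origin, which approximates the origin-level aggregate Google falls back on:

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

# Requires a GCP project with BigQuery enabled; the origin is a placeholder
# and the YYYYMM suffix of the table selects the month to inspect.
client = bigquery.Client()

query = """
SELECT
  SUM(IF(bin.start < 2500, bin.density, 0)) / SUM(bin.density) AS good_lcp_share
FROM
  `chrome-ux-report.all.202012`,
  UNNEST(largest_contentful_paint.histogram.bin) AS bin
WHERE
  origin = 'https://example.com'
"""

for row in client.query(query).result():
    share = row.good_lcp_share
    print("No CrUX data for this origin" if share is None
          else f"Share of page loads with LCP < 2.5 s: {share:.1%}")
```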
❓ Frequently Asked Questions
Can you tell whether a page is using real or estimated CrUX data?
Can the site-wide score penalize a fast page with no traffic?
How does Google define “similarity” between pages?
Is a site with no CrUX data at all penalized in rankings?
Should you optimize zero-traffic pages if they have no CrUX data?