Official statement
Google confirms that the Chrome User Experience Report (CrUX) data behind the Core Web Vitals comes from real users' Chrome browsers, not from Googlebot. As a result, even if you block the bot completely through robots.txt or add a noindex, your CWV metrics continue to be collected. For an SEO, this means you cannot escape the Core Web Vitals by manipulating crawling: the only way out is to have zero Chrome traffic.
What you need to understand
Where does CrUX data actually come from?
CrUX data does not come from Googlebot activity, but directly from the Chrome browsers of real users. When a Chrome user visits your site (with usage statistics sharing enabled), the browser automatically reports performance metrics: Largest Contentful Paint, First Input Delay, Cumulative Layout Shift.
This data is anonymously aggregated and forms the basis of the Chrome User Experience Report. Google then uses this dataset to feed the Core Web Vitals that impact search ranking. There is no connection to crawling: if your site receives Chrome traffic, it generates CrUX data, period.
What happens if I block Googlebot via robots.txt or noindex?
Blocking Googlebot via robots.txt prevents the bot from crawling your pages. Adding a noindex tag prevents indexing. But neither stops CrUX collection, as it relies on real visitors using Chrome.
In practical terms: even if your page is neither crawled nor indexed, as long as it receives Chrome traffic, its performance metrics continue to feed into CrUX. Google can therefore hold Core Web Vitals data for a non-indexed URL. This looks like a paradox, but it is technically consistent.
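A minimal sketch of why a robots.txt block is invisible to CrUX, using Python's standard `urllib.robotparser`. The robots.txt content and the URL are hypothetical. The function answers the only question robots.txt can answer, namely whether a crawler that chooses to obey the file may fetch a URL. A visitor's Chrome browser never consults that file before loading a page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks Googlebot from the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def googlebot_may_crawl(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt lets Googlebot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# Hypothetical URL, used for illustration only.
allowed = googlebot_may_crawl(ROBOTS_TXT, "https://example.com/slow-page")
print("Googlebot may crawl:", allowed)

# Nothing above involves a human visitor. Chrome loads the page regardless of
# robots.txt, so field metrics can still be reported for this URL.
```

The same reasoning applies to noindex: it is an instruction to the indexer, not to the visitor's browser.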
Why did Google consider it important to clarify this point?
Because many SEO practitioners still confuse data collection and crawling activity. Some believed they could escape the Core Web Vitals by blocking the bot, thinking that without crawling, there would be no metrics.
This statement puts an end to that: the Core Web Vitals are a user-based signal, not a bot-based signal. Google reminds us that CrUX is a field measurement system, independent of the crawl infrastructure. The confusion also stemmed from the fact that other Search signals (freshness, links, content) do depend on crawling.
- CrUX collects data via Chrome, not via Googlebot
- A robots.txt or noindex block does not prevent the collection of Core Web Vitals
- CWV metrics can exist for non-indexed pages if they receive Chrome traffic
- The only ways to stay out of CrUX are to have no Chrome traffic, or only visitors who have disabled usage statistics sharing
- This architecture clearly separates user signals from crawl signals in Google's algorithm
SEO Expert opinion
Is this statement consistent with what we see in practice?
Absolutely. SEOs tracking their PageSpeed Insights or Search Console (Core Web Vitals report) have noticed that pages blocked from crawling sometimes display CrUX metrics. This seemed strange two years ago — today, it's clear.
Google has always emphasized that the Core Web Vitals measure real experiences, not a bot simulation. This statement confirms what was suspected: CrUX is a completely separate pipeline from crawling. The two systems do not communicate for collection — they only intersect at the time of ranking.
Are there nuances or edge cases to be aware of?
First nuance: for a URL to show up in CrUX, it requires a sufficient volume of Chrome traffic. Google applies popularity thresholds to protect anonymity. If your page receives only a few visits per month, it won't appear in the public CrUX dataset, even if technically the data is being collected.
Second nuance: Chrome users can disable sharing of usage statistics. In this case, their visits do not contribute to CrUX. The exact opt-out rate is not public [To be verified], but it's believed to remain a minority. The majority of default Chrome installations send data.
Third point: this statement pertains to CrUX and the Core Web Vitals. It says nothing about other metrics or signals that Google might correlate with crawling. In other words, blocking Googlebot still has an impact (indexing, content discovery, freshness), but not on CWV.
What is the real strategic consequence for an SEO?
Let's be honest: this clarification crushes any ambition to bypass the Core Web Vitals through robots.txt maneuvering. Some sites considered temporarily blocking crawl on slow sections, hoping to escape CWV penalties. No chance.
On the other hand, it raises an interesting question for restricted-access sites (paywalls, member areas, intranets). If these pages are not indexed but generate Chrome traffic, their CrUX metrics exist somewhere on Google's servers. Does Google use them to rank a non-indexed URL? No, since the URL is not in the index. But if the page later becomes public and indexable, the historical CrUX data could theoretically be used immediately. [To be verified] with real-world testing: Google has never detailed this point.
Practical impact and recommendations
What should you do concretely if you want to optimize your Core Web Vitals?
First rule: forget about robots.txt band-aids. You won't bypass CrUX by manipulating crawl. The only method is to actually improve user experience: reduce resource weight, optimize rendering, limit blocking third-party scripts, stabilize layout.
Second rule: monitor Search Console (Core Web Vitals report) and PageSpeed Insights in "Field Data" mode. These two tools display real CrUX metrics, meaning those that Google uses for ranking. The "lab" data from Lighthouse is useful for diagnosis, but does not reflect what your Chrome visitors experience.
What mistakes should be avoided in managing CrUX?
Classic mistake: optimizing only for Lighthouse and ignoring CrUX data. Lighthouse simulates a mid-range mobile on a 4G connection. Your real users may have poor connections, old Androids, or conversely high-speed fiber and recent machines. CrUX reflects this diversity — Lighthouse does not.
Another mistake: believing a noindexed page generates no data. It does generate data, but Google does not use it for ranking since the page is not in the index. If you remove the noindex later, the accumulated CrUX metrics could be taken into account quickly, although Google has never officially confirmed this.
Finally, do not neglect non-Chrome traffic. CrUX only captures Chrome, so if 40% of your audience is on Firefox or Safari, their metrics do not get reported. You can have excellent CWV measured by CrUX and a degraded experience on other browsers. Use RUM (Real User Monitoring) tools to cover the entire spectrum.
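As a rough illustration of what a RUM comparison across browsers might look like, the sketch below computes a 75th percentile per browser and classifies it against Google's published thresholds for the three metrics named in this article (LCP 2.5 s / 4 s, FID 100 ms / 300 ms, CLS 0.1 / 0.25). The sample values and the `rum_lcp_ms` structure are invented for the example, and the nearest-rank percentile is an assumption: your RUM tool or CrUX itself may aggregate differently.

```python
import math

# (good, poor) boundaries: at or below "good" is good, above "poor" is poor.
THRESHOLDS = {
    "lcp_ms": (2500, 4000),
    "fid_ms": (100, 300),
    "cls": (0.1, 0.25),
}


def p75(samples):
    """75th percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def classify(metric, value):
    """Map a p75 value to 'good', 'needs improvement' or 'poor'."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"


# Invented LCP samples (milliseconds) grouped by browser.
rum_lcp_ms = {
    "chrome": [1800, 2100, 2300, 2600],
    "safari": [2900, 3400, 4200, 4800],
}

for browser, samples in rum_lcp_ms.items():
    value = p75(samples)
    print(browser, value, classify("lcp_ms", value))
```

With data shaped like this, a site could look healthy in CrUX (Chrome only) while a sizeable share of its audience has a poor experience.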
How can you check that your site is measured correctly by CrUX?
Go to PageSpeed Insights, enter your URL. If the "Field Data" section shows up with LCP, FID, CLS metrics, it means your page has enough Chrome traffic to be in CrUX. If you see "No data available," either your traffic is too low, or your page is too recent.
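The same check can be scripted against the PageSpeed Insights API (v5). The sketch below only builds the request URL and reads the field-data part of a response, so it runs without network access. The sample response is hand-written and heavily trimmed, and the helper names are hypothetical. It assumes field data is exposed under `loadingExperience.metrics`, each metric carrying a `percentile`.

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_psi_url(page_url, strategy="mobile"):
    """Build the GET URL for a PageSpeed Insights run."""
    return PSI_ENDPOINT + "?" + urlencode({"url": page_url, "strategy": strategy})


def field_percentiles(psi_response):
    """Return {metric_name: percentile} from field data, or {} if there is none."""
    metrics = psi_response.get("loadingExperience", {}).get("metrics", {})
    return {name: data.get("percentile") for name, data in metrics.items()}


# Hand-written, trimmed example of a response that contains field data.
sample_with_data = {
    "loadingExperience": {
        "metrics": {
            "LARGEST_CONTENTFUL_PAINT_MS": {"percentile": 2300, "category": "FAST"},
        }
    }
}
# Hand-written example of a response without field data.
sample_without_data = {"loadingExperience": {}}

print(build_psi_url("https://example.com/"))
print(field_percentiles(sample_with_data))
print(field_percentiles(sample_without_data) or "No field data for this URL")
```

An empty result corresponds to the "No data available" case described above.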
You can also directly query the CrUX API or consult the public BigQuery dataset for finer analyses (percentile distribution, temporal evolution). Search Console aggregates data at the site level and by URL groups, making large-scale diagnostics easier.
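For the CrUX API route, a minimal sketch could look like the following. It prepares the POST request for the `records:queryRecord` endpoint and extracts p75 values from a response, without sending anything over the network. The API key placeholder, the helper names, and the sample response are assumptions made for illustration. The real response contains more fields, such as histograms.

```python
import json

CRUX_ENDPOINT = "https://chromeuxreport.googleapis.com/v1/records:queryRecord"


def build_crux_request(page_url, api_key, form_factor="PHONE"):
    """Return (request_url, json_body) for a POST to the CrUX API."""
    body = {
        "url": page_url,
        "formFactor": form_factor,
        "metrics": ["largest_contentful_paint"],
    }
    return f"{CRUX_ENDPOINT}?key={api_key}", json.dumps(body)


def p75_values(crux_response):
    """Return {metric_name: p75}; empty if the response has no record."""
    metrics = crux_response.get("record", {}).get("metrics", {})
    return {
        name: data["percentiles"]["p75"]
        for name, data in metrics.items()
        if "percentiles" in data
    }


# Hand-written, trimmed example of a successful response.
sample_response = {
    "record": {
        "key": {"url": "https://example.com/", "formFactor": "PHONE"},
        "metrics": {
            "largest_contentful_paint": {"percentiles": {"p75": 2300}},
        },
    }
}

request_url, request_body = build_crux_request("https://example.com/", "YOUR_API_KEY")
print(request_url)
print(request_body)
print(p75_values(sample_response))
```

A response without a `record` (for instance an error payload for a URL with too little Chrome traffic) yields an empty dictionary here.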
- Check your CrUX metrics through Search Console and PageSpeed Insights (field data)
- Optimize actual performance: reduce weight, defer non-critical scripts, stabilize layout
- Do not rely on robots.txt or noindex to escape CrUX collection
- Use RUM tools to also monitor non-Chrome browsers
- Test your pages on varied connections and devices, not just under Lighthouse conditions
- If you remove a noindex, immediately monitor CWV — CrUX data may already exist
❓ Frequently Asked Questions
Does blocking Googlebot via robots.txt prevent the collection of CrUX data?
Can a noindexed page generate CrUX data?
How much Chrome traffic is needed to appear in CrUX?
Does CrUX data cover all browsers?
If I remove a noindex, is the historical CrUX data used for ranking immediately?
Source: Google Search Central video · duration 1h07 · published on 28/01/2021