Official statement

Referring URLs that generate 404 errors are currently not accessible via the Search Console API. To identify these broken links, it is recommended to use a local crawler on your own site.
111:39
🎥 Source video

Extracted from a Google Search Central video

⏱ 985h14 💬 EN 📅 26/02/2021 ✂ 39 statements
TL;DR

Google confirms that the Search Console API does not expose the referring URLs behind 404 errors, unlike the web interface. To identify these broken links, you have to crawl your own site with a local tool, which forces SEOs to maintain a parallel monitoring infrastructure. This limitation complicates the automated detection of broken backlinks and missing redirects.

What you need to understand

What Are the Differences Between the Web Interface and the Search Console API?

The web interface of the Search Console displays, for each 404 error, the list of referring URLs pointing to the missing page. It's a valuable tool for quickly identifying which internal or external links are broken.

The Search Console API, on the other hand, only returns the error URL — without the list of pages that reference it. This asymmetry between the interface and the API complicates the automation of 404 audits, as the majority of modern SEO workflows rely on scripts and third-party tools that query the API.

Why Does This Limitation Pose an Operational Problem?

A site with 50,000 pages can generate hundreds of 404s each month. Without programmatic access to the referring URLs, corrections cannot be prioritized automatically from API data alone: a 404 that receives 100 internal links deserves more attention than a dead URL with no referrer at all.

This gap forces SEO teams to maintain a custom crawler or subscribe to third-party solutions (Screaming Frog, OnCrawl, Botify) to crosscheck the data. The cost in infrastructure and execution time is significant — especially for high-volume sites or agencies managing dozens of clients.

How Does Google Justify This Lack of Data?

Mueller explicitly recommends crawling your own site to identify referrers. This position is consistent with Google's philosophy: the Search Console is a diagnostic tool, not an on-demand crawler.
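
To make the recommendation concrete, here is a minimal sketch of what "crawling your own site to identify referrers" boils down to. It is not how any particular crawler works. The `fetch` callable is a hypothetical stand-in for an HTTP client, and the function names are invented for illustration.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_broken_links(start_url, fetch):
    """Breadth-first crawl of one host; returns {404 URL: set of referring URLs}.

    `fetch` is a hypothetical callable: fetch(url) -> (status_code, html_text).
    """
    host = urlparse(start_url).netloc
    referrers = defaultdict(set)  # target URL -> pages that link to it
    status = {}                   # URL -> HTTP status code
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in status:
            continue  # already fetched
        code, html = fetch(url)
        status[url] = code
        if code != 200:
            continue  # nothing to parse on an error page
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.hrefs:
            target = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            if urlparse(target).netloc != host:
                continue  # external link: out of scope for a local crawl
            referrers[target].add(url)
            queue.append(target)
    return {u: refs for u, refs in referrers.items() if status.get(u) == 404}
```

A real crawl would add politeness delays, robots.txt handling, and redirect following. The point here is only that the referrer list falls out of the crawl for free, since every link is recorded together with the page it was found on.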

Let's be honest: Google has no incentive to provide a complete API that would render third-party solutions obsolete. Maintaining this technical limitation forces publishers to invest in external tools — which, incidentally, fuels an entire ecosystem of SEO SaaS.

  • The Search Console API only returns the 404 error URL, not the referring pages.
  • The web interface, however, displays the referrers — but with no option for large-scale automated export.
  • Google advises crawling your site locally to crosscheck this data.
  • This limitation imposes an infrastructure cost: crawling, storage, monitoring scripts.
  • Agencies and high-volume sites must maintain a parallel solution to prioritize corrections.

SEO Expert opinion

Is Google's Position Consistent with Observed Practices?

Absolutely. Google has always segmented data between the web interface and the API, often for reasons of performance, privacy, or business model. The two rarely expose the same data: the Search Analytics (Performance) API, for example, returns at most 25,000 rows per request and has to be paginated, while the export from the interface is capped at 1,000 rows.
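
The 25,000-row cap per request is usually handled by paginating. The sketch below assumes the request fields of the Search Analytics query method (`startDate`, `endDate`, `dimensions`, `rowLimit`, `startRow`). The `query` callable is a hypothetical wrapper around whatever client you use to send the request and parse the JSON response.

```python
def fetch_all_rows(query, start_date, end_date, page_size=25000):
    """Collects every row by advancing startRow until a short page comes back.

    `query` is a hypothetical callable that takes a request body (dict) and
    returns the parsed JSON response (a dict that may contain a "rows" list).
    """
    rows = []
    start_row = 0
    while True:
        body = {
            "startDate": start_date,
            "endDate": end_date,
            "dimensions": ["page"],
            "rowLimit": page_size,
            "startRow": start_row,
        }
        page = query(body).get("rows", [])  # "rows" is absent when nothing is left
        rows.extend(page)
        if len(page) < page_size:
            return rows  # a short (or empty) page means we reached the end
        start_row += page_size
```

Injecting `query` keeps the pagination logic testable without credentials or network access.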

What Mueller doesn't say: even the web interface sometimes only displays a sample of referrers, especially if a 404 URL receives hundreds of backlinks. Full data is rarely accessible — whether through the API or the interface. [To be verified]: the exact proportion of displayed referrers versus the actual total is documented nowhere.

What Nuances Should Be Added to This Recommendation?

Crawling your own site is relevant advice for internal links — but completely insufficient for external backlinks. A local crawler will never see that a third-party site points to your 404 through a deep link.

For broken backlinks, it's necessary to crosscheck data from the Search Console (even if incomplete) with that from a third-party tool like Ahrefs, Majestic, or Semrush. And again, these tools only see a fraction of the web — Google remains the only one with a comprehensive index. This asymmetry of information is frustrating, but it is structural.
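
As an illustration of that cross-check, the sketch below merges internal referrers from your own crawl with an export from a backlink tool into one view per broken URL. The CSV column names `source_url` and `target_url` are assumptions. Each tool names its columns differently, so adjust them to your export.

```python
import csv
import io
from collections import defaultdict


def merge_referrers(internal, backlink_csv, broken_urls):
    """Returns {broken URL: {"internal": set, "external": set}}.

    `internal` maps target URL -> set of internal referrers (from your crawl).
    `backlink_csv` is the text of a third-party export; the column names
    "source_url" and "target_url" are assumptions, not a real tool's schema.
    """
    external = defaultdict(set)
    for row in csv.DictReader(io.StringIO(backlink_csv)):
        external[row["target_url"]].add(row["source_url"])
    return {
        url: {
            "internal": set(internal.get(url, ())),
            "external": set(external.get(url, ())),
        }
        for url in broken_urls
    }
```

A broken URL that comes back with two empty sets has no referrer known to either source, which is still not proof that nothing links to it.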

Warning: if your site uses many temporary redirects (302, 307), some crawlers may report them as broken when they fail to follow the redirect chain to its end. Always check your tool's configuration before making mass corrections.
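
To sanity-check a crawler's verdict, it helps to resolve each chain yourself and look at the final status. The sketch below does this against a hypothetical in-memory `responses` table that stands in for real HTTP requests. It assumes `Location` values are already absolute.

```python
REDIRECT_CODES = {301, 302, 303, 307, 308}


def resolve(url, responses, max_hops=10):
    """Follows a redirect chain and returns (final_url, final_status, hops).

    `responses` is a hypothetical lookup {url: (status, location_or_None)}.
    A loop or an overlong chain is reported with status None instead of
    being mislabelled as a 404.
    """
    seen = []
    while url not in seen and len(seen) <= max_hops:
        seen.append(url)
        status, location = responses.get(url, (404, None))
        if status not in REDIRECT_CODES or location is None:
            return url, status, len(seen) - 1  # chain ends here
        url = location  # follow the redirect
    return url, None, len(seen)  # loop detected or too many hops
```

Only a chain whose final status is 404 is a broken link. A 302 that ends on a 200 page is working as intended and needs no correction.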

In What Cases Does This Rule Not Apply?

If your site generates fewer than 50 404 errors per month, the web interface of the Search Console is more than sufficient. You can manually export the list of referrers page by page — it's tedious, but doable.

However, for an e-commerce site with thousands of archived product pages, or a media site that regularly changes its URL architecture, the manual approach becomes impractical. In this case, investing in an automated crawler (cloud or local) becomes worthwhile from the first month.

Practical impact and recommendations

What Concrete Steps Should You Take?

First, set up a recurring crawl: Screaming Frog locally if you have fewer than 500,000 URLs, OnCrawl or Botify as a SaaS beyond that. Schedule it weekly (monthly at the very least) so that internal 404s are detected soon after they appear.

Next, crosscheck this data with the errors reported by the Search Console (interface or API) to identify the 404s that still receive traffic or impressions. A 404 that generates 0 clicks but 1000 impressions often signals a poorly corrected internal linking issue.
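
A minimal sketch of that cross-check, assuming the crawler's 404 list and the Search Console performance data have both been loaded into memory. The row shape (`page`, `clicks`, `impressions`) is an assumption about how you store the export, not a fixed format.

```python
def flag_live_404s(crawl_404s, performance_rows, min_impressions=1):
    """Returns 404 URLs that still show up in search, most impressions first.

    `performance_rows` is assumed to be a list of dicts shaped like
    {"page": url, "clicks": int, "impressions": int}.
    """
    by_page = {row["page"]: row for row in performance_rows}
    flagged = []
    for url in crawl_404s:
        row = by_page.get(url)
        if row and row["impressions"] >= min_impressions:
            flagged.append((url, row["impressions"], row["clicks"]))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

URLs that appear in the crawl but not in the performance data drop out, which leaves a short list of 404s that are still visible to searchers.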

What Mistakes Should Be Avoided in Handling 404s?

Never bulk-redirect all 404s to the home page or a generic category page. Google treats such redirects as soft 404s, that is, as errors: you lose the benefit of the redirect without solving the problem.
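
One way to catch this pattern in an existing redirect table is to look for targets that absorb a disproportionate share of redirects. The sketch below does that. The 50% threshold and the minimum table size are arbitrary illustrative values, not Google guidance.

```python
from collections import Counter


def risky_targets(redirect_map, threshold=0.5, min_redirects=10):
    """Returns {target: share} for targets that absorb >= `threshold` of redirects.

    `redirect_map` is {old_url: new_url}. Tables smaller than `min_redirects`
    are skipped because shares are not meaningful on a handful of rows.
    """
    total = len(redirect_map)
    if total < min_redirects:
        return {}
    counts = Counter(redirect_map.values())
    return {t: n / total for t, n in counts.items() if n / total >= threshold}
```

A flagged target is a prompt for manual review rather than a verdict: a large category consolidation can legitimately send many old URLs to one page.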

Also, avoid dropping a 404 URL from your XML sitemap and forgetting about it without first checking whether it still receives backlinks. A sitemap should list only live URLs, so the 404 has to come out either way, but a dead URL with external links still pointing at it should get a 301 first; otherwise the link equity those backlinks carry is thrown away.

How to Automate the Prioritization of Corrections?

Create a criticality score for each 404 that combines three signals: internal referrer count, external backlink count, and SEO traffic over the last 3 months (for example, as a weighted sum). URLs with a high score deserve a 301 redirect to the semantically closest page.
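
One way to turn the criticality score into code is sketched below. It combines the three signals as a weighted sum so that a 404 with strong backlinks but no internal referrers still scores above zero, whereas a straight product would zero it out. The weights, the `Broken` record, and its field names are all illustrative assumptions to tune against your own data.

```python
from dataclasses import dataclass


@dataclass
class Broken:
    url: str
    internal_referrers: int
    external_backlinks: int
    organic_clicks_90d: int  # SEO traffic over roughly the last 3 months


def criticality(item, w_internal=1.0, w_external=3.0, w_traffic=0.1):
    """Weighted sum of the three signals; the weights are illustrative only."""
    return (
        w_internal * item.internal_referrers
        + w_external * item.external_backlinks
        + w_traffic * item.organic_clicks_90d
    )


def prioritize(items):
    """Highest-scoring 404s first."""
    return sorted(items, key=criticality, reverse=True)
```

With these weights, one external backlink counts as much as three internal links, which reflects the fact that internal links can be fixed at the source while external ones usually cannot.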

For 404s without referrers or backlinks, leave them as errors — there's no need to clutter your .htaccess file with hundreds of unnecessary redirects. Google manages legitimate 404s very well; that’s what it's built for.
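
To keep the redirect file lean, the rules can be generated rather than written by hand. The sketch below emits Apache mod_alias `Redirect 301` lines only for 404s that still have inbound links and a hand-curated target. The `broken` and `mapping` inputs are hypothetical structures, and paths containing spaces would need quoting that this sketch does not handle.

```python
def build_redirects(broken, mapping):
    """Returns Apache mod_alias lines for 404s that still have inbound links.

    `broken` is {path: inbound_link_count}; `mapping` is {path: target_path},
    a hand-curated table of the closest live page. Paths with no inbound
    links or no curated target are skipped and stay 404.
    """
    lines = []
    for path, inbound in sorted(broken.items()):
        target = mapping.get(path)
        if inbound > 0 and target:
            lines.append(f"Redirect 301 {path} {target}")
    return lines
```

Requiring an explicit entry in `mapping` is deliberate: it rules out a catch-all fallback to the home page.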

  • Install a crawler (Screaming Frog, OnCrawl, Botify) and schedule a crawl at least weekly.
  • Crosscheck the crawler's 404s with Search Console data to identify those still receiving traffic.
  • Calculate a criticality score (internal referrers + backlinks + SEO traffic) to prioritize corrections.
  • Apply 301 redirects only to 404s that have referrers or backlinks, and never point them at a generic page.
  • Remove 404 URLs from your XML sitemap once those that still receive backlinks or traffic have been redirected.
  • Regularly monitor new 404 errors to correct the internal links at the source.
The absence of 404 referrers in the Search Console API necessitates a parallel monitoring infrastructure. High-volume sites or agencies must invest in a recurring crawler and prioritization scripts. This technical complexity, coupled with the need to crosscheck multiple data sources, often makes it wise to enlist a specialized SEO agency to structure a robust, automated monitoring workflow tailored to your context.

❓ Frequently Asked Questions

Does the Search Console web interface show all the referrers for a 404 error?
No, the interface often shows only a sample, especially if the URL receives hundreds of backlinks. Complete data is never guaranteed, whether through the interface or the API.
Can a local crawler detect external backlinks pointing to my 404s?
No. A local crawler only sees your site's internal links. For external backlinks, you need to cross-check with a third-party tool such as Ahrefs, Majestic, or Semrush.
Should you systematically redirect every 404 error you detect?
Absolutely not. A 404 with no referrer or backlink can remain an error; that is its legitimate status. Redirect only those that capture PageRank or still generate traffic.
What crawl frequency is recommended for an e-commerce site with 100,000 URLs?
Weekly at a minimum. A site that regularly changes its catalog or URL architecture should crawl every 2-3 days to catch 404s before they accumulate referrers.
Are soft 404s (redirects to the home page or a generic category) penalized by Google?
Not penalized, but ignored: Google treats them as ordinary errors, so you lose the benefit of the redirect without fixing the underlying problem.