Why Doesn't the Search Console API Show Referring URLs for 404 Errors?

Official statement

Referring URLs that generate 404 errors are currently not accessible via the Search Console API. To identify these broken links, it is recommended to use a local crawler on your own site.

🎥 Source video

Extracted from a Google Search Central video

Official statement from February 26, 2021

TL;DR

Google confirms that the Search Console API does not expose the referring URLs behind 404 errors, unlike the web interface. To identify these broken links, you have to crawl your own site with a local tool, which forces SEOs to maintain a parallel monitoring infrastructure. This limitation complicates the automated detection of broken backlinks and missing redirects.

What you need to understand

What Are the Differences Between the Web Interface and the Search Console API?

The web interface of the Search Console displays, for each 404 error, the list of referring URLs pointing to the missing page. It's a valuable tool for quickly identifying which internal or external links are broken.

The Search Console API, on the other hand, only returns the error URL — without the list of pages that reference it. This asymmetry between the interface and the API complicates the automation of 404 audits, as the majority of modern SEO workflows rely on scripts and third-party tools that query the API.

Why Does This Limitation Pose An Operational Problem?

A site with 50,000 pages can generate hundreds of 404s each month. Without programmatic access to the referring URLs, it is impossible to automatically prioritize corrections: a 404 that receives 100 internal links deserves more attention than an orphaned page with no referrer.

This gap forces SEO teams to maintain a custom crawler or subscribe to third-party solutions (Screaming Frog, OnCrawl, Botify) to crosscheck the data. The cost in infrastructure and execution time is significant — especially for high-volume sites or agencies managing dozens of clients.

How Does Google Justify This Lack of Data?

Mueller explicitly recommends crawling your own site to identify referrers. This position is consistent with Google's philosophy: the Search Console is a diagnostic tool, not an on-demand crawler.

Let's be honest: Google has no incentive to provide a complete API that would render third-party solutions obsolete. Maintaining this technical limitation forces publishers to invest in external tools — which, incidentally, fuels an entire ecosystem of SEO SaaS.

The Search Console API only returns the 404 error URL, not the referring pages.
The web interface, however, displays the referrers — but with no option for large-scale automated export.
Google advises crawling your site locally to crosscheck this data.
This limitation imposes an infrastructure cost: crawling, storage, monitoring scripts.
Agencies and high-volume sites must maintain a parallel solution to prioritize corrections.

SEO Expert opinion

Is Google's Position Consistent with Observed Practices?

Absolutely. Google has always segmented data between the web interface and the API, often for reasons of performance, privacy, or economic model. The Search Analytics (Performance) API, for example, returns at most 25,000 rows per request, whereas the export from the interface is capped at 1,000 rows.
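The 25,000-row figure is a per-request cap, so larger result sets have to be paged. The sketch below shows one way to loop over pages, assuming the API's `startRow` and `rowLimit` request fields. The `fetch_page` callable and the fake data are hypothetical stand-ins for the real API call, which is not shown here.

```python
def fetch_all_rows(fetch_page, row_limit=25000):
    """Collect every row by advancing the start row until a short page comes back.

    fetch_page: caller-supplied function (start_row, row_limit) -> list of rows.
                It stands in for the real API request (hypothetical helper).
    """
    rows, start_row = [], 0
    while True:
        page = fetch_page(start_row, row_limit)
        rows.extend(page)
        if len(page) < row_limit:
            # A page shorter than the limit means there is nothing left.
            return rows
        start_row += row_limit


# Fake data source used only to exercise the loop.
DATA = [{"page": f"/p{i}"} for i in range(7)]


def fake_fetch(start_row, row_limit):
    return DATA[start_row:start_row + row_limit]


all_rows = fetch_all_rows(fake_fetch, row_limit=3)
```

Passing the fetch function in as a parameter keeps the paging logic testable without network access.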

What Mueller doesn't say: even the web interface sometimes only displays a sample of referrers, especially if a 404 URL receives hundreds of backlinks. Full data is rarely accessible — whether through the API or the interface. [To be verified]: the exact proportion of displayed referrers versus the actual total is documented nowhere.

What Nuances Should Be Added to This Recommendation?

Crawling your own site is relevant advice for internal links — but completely insufficient for external backlinks. A local crawler will never see that a third-party site points to your 404 through a deep link.

For broken backlinks, it's necessary to crosscheck data from the Search Console (even if incomplete) with that from a third-party tool like Ahrefs, Majestic, or Semrush. And again, these tools only see a fraction of the web — Google remains the only one with a comprehensive index. This asymmetry of information is frustrating, but it is structural.

Warning: if you manage a site with many temporary redirects (302, 307), some crawlers may report them as 404s when they fail to follow redirect chains correctly. Always check your tool's configuration before making mass corrections.
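One way to double-check a crawler's verdict is to resolve each redirect chain yourself before trusting a reported 404. The following is a minimal sketch that works on a crawl snapshot rather than live requests. The snapshot format (a dict of URL to status and `Location`) and the `max_hops` default are assumptions made for illustration.

```python
REDIRECT_CODES = (301, 302, 303, 307, 308)


def resolve(url, responses, max_hops=10):
    """Follow redirects in a crawl snapshot and return (final_url, final_status).

    responses: dict of URL -> (status, location_or_None), a hypothetical format.
    The status is None for a loop, an over-long chain, or an unknown URL.
    """
    seen = set()
    for _ in range(max_hops + 1):
        if url in seen:
            return url, None  # redirect loop
        seen.add(url)
        status, location = responses.get(url, (None, None))
        if status in REDIRECT_CODES and location:
            url = location
            continue
        return url, status
    return url, None  # chain longer than max_hops


responses = {
    "/promo": (302, "/promo-2024"),
    "/promo-2024": (307, "/sale"),
    "/sale": (200, None),
    "/gone": (404, None),
}
final = resolve("/promo", responses)
```

Here `/promo` ends on a 200 page after two temporary redirects, so it should not be counted as a 404, while `/gone` is a genuine one.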

In What Cases Does This Rule Not Apply?

If your site generates fewer than 50 404 errors per month, the web interface of the Search Console is more than sufficient. You can manually export the list of referrers page by page — it's tedious, but doable.

However, for an e-commerce site with thousands of archived product pages, or a media site that regularly changes its URL architecture, the manual approach becomes impractical. In this case, investing in an automated crawler (cloud or local) becomes worthwhile from the first month.

Practical impact and recommendations

What Concrete Steps Should You Take?

First, install a regular crawler — Screaming Frog locally if you have fewer than 500,000 URLs, OnCrawl or Botify as a SaaS for more. Set up a weekly crawl (at a minimum monthly) to detect internal 404s as soon as they appear.
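Once a crawl has run, the referrers that the API does not provide can be rebuilt from the crawler's own link data. The sketch below inverts a list of (source, target) link pairs into a per-404 referrer index. The input shapes and the example URLs are assumptions, since every crawler exports its data differently.

```python
from collections import defaultdict


def build_referrer_index(links, status_by_url):
    """Map each 404 URL to the sorted list of pages that link to it.

    links: iterable of (source_url, target_url) pairs from a crawl export.
    status_by_url: dict of URL -> HTTP status code observed by the crawler.
    """
    referrers = defaultdict(set)
    for source, target in links:
        if status_by_url.get(target) == 404:
            referrers[target].add(source)
    return {url: sorted(sources) for url, sources in referrers.items()}


# Hypothetical crawl output, for illustration only.
links = [
    ("https://example.com/", "https://example.com/old-product"),
    ("https://example.com/blog", "https://example.com/old-product"),
    ("https://example.com/blog", "https://example.com/about"),
]
status_by_url = {
    "https://example.com/old-product": 404,
    "https://example.com/about": 200,
}
index = build_referrer_index(links, status_by_url)
```

The length of each list in `index` gives the internal referrer count for that 404.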

Next, crosscheck this data with the errors reported by the Search Console (interface or API) to identify the 404s that still receive traffic or impressions. A 404 that generates 0 clicks but 1000 impressions often signals a poorly corrected internal linking issue.
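This cross-check can be sketched as a simple join between the crawler's 404 list and exported Search Console rows. The row shape used here (`page`, `clicks`, `impressions` keys) is an assumption about how you store the export, not a format the API guarantees.

```python
def crosscheck_404s(crawler_404s, gsc_rows):
    """Return rows for 404 URLs that still earn impressions, highest first.

    crawler_404s: set of URLs the crawler saw returning 404.
    gsc_rows: list of dicts with "page", "clicks" and "impressions" keys
              (hypothetical shape for exported Search Console data).
    """
    hits = [
        row for row in gsc_rows
        if row["page"] in crawler_404s and row["impressions"] > 0
    ]
    return sorted(hits, key=lambda row: row["impressions"], reverse=True)


# Made-up sample data.
crawler_404s = {"https://example.com/a", "https://example.com/b"}
gsc_rows = [
    {"page": "https://example.com/a", "clicks": 0, "impressions": 1000},
    {"page": "https://example.com/b", "clicks": 0, "impressions": 0},
    {"page": "https://example.com/c", "clicks": 12, "impressions": 300},
]
still_visible = crosscheck_404s(crawler_404s, gsc_rows)
```

Only `/a` is kept: `/b` is a 404 with no impressions, and `/c` is not a 404 at all.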

What Mistakes Should Be Avoided in Handling 404s?

Never bulk-redirect all 404s to the homepage or a generic category page. Google detects these as soft 404s and treats them as errors, so you lose the benefit of the redirect without solving the problem.
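A quick sanity check on an existing redirect map is to count how many old URLs land on the same destination. The sketch below flags such catch-all targets. The `max_sources` threshold is a hypothetical value to tune, not a figure published by Google.

```python
from collections import Counter


def catch_all_targets(redirects, max_sources=20):
    """Return redirect targets that absorb more than max_sources old URLs.

    redirects: dict of old_url -> new_url.
    max_sources: hypothetical threshold chosen for illustration.
    """
    counts = Counter(redirects.values())
    return {target: n for target, n in counts.items() if n > max_sources}


# Made-up redirect map: 50 old product pages all sent to the homepage.
redirects = {f"/old-product-{i}": "/" for i in range(50)}
redirects["/old-guide"] = "/guides/new-guide"
suspects = catch_all_targets(redirects)
```

A flagged target is only a hint: a legitimate consolidation can also concentrate many redirects on one page, so review the list by hand.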

Also, avoid removing 404s from your XML sitemap without checking that they no longer receive backlinks. An orphaned URL can continue to capture external PageRank for months — abruptly deindexing it is akin to throwing away this link juice.

How to Automate the Prioritization of Corrections?

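Create a criticality score for each 404 by combining internal referrer count, external backlink count, and SEO traffic over the last 3 months, for example as a weighted sum so that a 404 with many internal links but no backlinks still ranks high. URLs with a high score deserve a 301 redirect to the semantically closest page.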

For 404s without referrers or backlinks, leave them as errors — there's no need to clutter your .htaccess file with hundreds of unnecessary redirects. Google manages legitimate 404s very well; that’s what it's built for.
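The scoring and triage described above can be sketched as follows. This version uses a weighted sum, which is one possible reading of the score rather than a definitive formula. The weights, the `Broken` record, and the action labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Broken:
    url: str
    internal_referrers: int
    external_backlinks: int
    organic_clicks_90d: int  # SEO traffic over roughly the last 3 months


# Hypothetical weights; tune them to your own site.
WEIGHTS = {"internal": 1.0, "external": 5.0, "traffic": 0.1}


def criticality(page: Broken) -> float:
    return (
        WEIGHTS["internal"] * page.internal_referrers
        + WEIGHTS["external"] * page.external_backlinks
        + WEIGHTS["traffic"] * page.organic_clicks_90d
    )


def triage(pages):
    """Return (url, score, action) tuples sorted by descending score."""
    ranked = sorted(pages, key=criticality, reverse=True)
    return [
        (p.url, criticality(p), "redirect-301" if criticality(p) > 0 else "leave-404")
        for p in ranked
    ]


# Made-up sample pages.
pages = [
    Broken("/old-a", internal_referrers=100, external_backlinks=0, organic_clicks_90d=0),
    Broken("/old-b", internal_referrers=0, external_backlinks=0, organic_clicks_90d=0),
    Broken("/old-c", internal_referrers=2, external_backlinks=30, organic_clicks_90d=50),
]
plan = triage(pages)
```

A sum keeps `/old-a` high in the list even though it has no backlinks, whereas a pure product of the three signals would score it at zero.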

Install a crawler (Screaming Frog, OnCrawl, Botify) and schedule a recurring crawl, weekly if possible.
Crosscheck the crawler's 404s with Search Console data to identify those still receiving traffic.
Calculate a criticality score (internal referrers + backlinks + SEO traffic) to prioritize corrections.
Use 301 redirects only for 404s that have referrers or backlinks, and never point them to a generic page.
Exclude 404s from your XML sitemap that no longer receive backlinks or traffic.
Regularly monitor new 404 errors to correct the internal links at the source.

The absence of 404 referrers in the Search Console API necessitates a parallel monitoring infrastructure. High-volume sites or agencies must invest in a recurring crawler and prioritization scripts. This technical complexity, coupled with the need to crosscheck multiple data sources, often makes it wise to enlist a specialized SEO agency to structure a robust, automated monitoring workflow tailored to your context.

❓ Frequently Asked Questions

Does the Search Console web interface display all referrers for a 404 error?

No, the interface often shows only a sample, especially if the URL receives hundreds of backlinks. Complete data is never guaranteed, either through the interface or through the API.

Can a local crawler detect external backlinks pointing to my 404s?

No. A local crawler only sees your site's internal links. For external backlinks, you need to cross-reference with a third-party tool such as Ahrefs, Majestic, or Semrush.

Should every detected 404 error be systematically redirected?

Absolutely not. A 404 with no referrer or backlink can remain an error; that is its legitimate status. Redirect only those that capture PageRank or still generate traffic.

What crawl frequency is recommended for an e-commerce site with 100,000 URLs?

At least weekly. A site whose catalog or URL architecture changes regularly should crawl every 2-3 days to detect 404s before they accumulate referrers.

Are soft 404s (redirects to the homepage or a generic category) penalized by Google?

Not penalized, but ignored: Google treats them as ordinary errors, so you lose the benefit of the redirect without fixing the underlying problem.
