Official statement
Google offers three methods for signaling a 404 in an SPA with client-side routing: redirecting to a URL that the server answers with a real 404 status code, adding a noindex tag, or relying on automatic soft 404 detection. The first two ensure that no error pages are mistakenly indexed. The third, the soft 404, remains less reliable and carries a risk of unwanted indexing, according to Martin Splitt.
What you need to understand
What specific problems do SPAs pose with 404 errors?
Single Page Applications (SPAs) manage navigation on the client side via JavaScript, without fully reloading the page. When a user accesses a non-existent route, the server often returns a 200 HTTP code with the app's shell, and then JavaScript detects the error and displays a client-side message.
For Googlebot, this is a real puzzle. The crawler sees a page that responds 200 OK and may index it, even if the displayed content says 'Page not found.' The result: error pages pollute the index, waste crawl budget, and create a disastrous user experience in the SERPs.
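To see why this happens, here is a minimal sketch of a typical SPA host, assuming a Node/Express static server (the file paths and port are illustrative): the catch-all route serves the application shell with a 200 for any path, including ones that do not exist.

    // Hypothetical Express setup illustrating the problem: the server knows nothing
    // about client-side routes, so every path answers 200 OK with the app shell.
    import express from "express";
    import path from "path";

    const app = express();

    // Static assets (JS bundles, CSS) are served normally.
    app.use(express.static(path.join(__dirname, "dist")));

    // Catch-all: even /non-existent-article gets index.html with 200 OK.
    // The client router then shows 'Page not found', but Googlebot already saw a 200.
    app.get("*", (_req, res) => {
      res.sendFile(path.join(__dirname, "dist", "index.html"));
    });

    app.listen(3000);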
What is the difference between a hard 404 and a soft 404?
A hard 404 explicitly returns a 404 HTTP code in the response header. Google immediately understands that it should not index this URL. This is the clean method that leaves no ambiguity.
A soft 404 occurs when the server returns a 200 but the content clearly indicates an error: empty page, 'not found' message, insufficient content. Google attempts to detect these situations automatically through heuristic signals: content size, absence of internal links, linguistic patterns. However, this detection remains imperfect and carries a risk of false negatives.
How do the three methods proposed by Google differ?
The first method consists of redirecting on the client side to a server URL that returns a real 404 code. For example, if a user types /non-existent-article, JavaScript detects the error and redirects to /404, which the server serves with a 404 HTTP code. It’s clean, explicit, and completely reliable.
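A minimal sketch of that first method, assuming a history-API router on the client and an Express server; the /404 route and file names are illustrative choices, not a prescribed API.

    // spa-router.ts (client, illustrative): when no route matches, leave the SPA
    // and navigate to a URL the server actually owns.
    export function handleUnknownRoute(): void {
      // replace() keeps the dead URL out of the browser history
      window.location.replace("/404");
    }

    // server.ts (Node/Express, illustrative): /404 answers with a genuine 404 status.
    import express from "express";
    import path from "path";

    const app = express();
    app.use(express.static(path.join(__dirname, "dist")));

    app.get("/404", (_req, res) => {
      res.status(404).sendFile(path.join(__dirname, "dist", "404.html"));
    });

    app.listen(3000);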
The second method adds a <meta name="robots" content="noindex"> tag in the <head> of the client-side error page. The server still returns a 200, but Google reads the tag and refuses to index the URL. This works well if Googlebot can execute JavaScript and read the tag, which is generally the case but not 100% guaranteed, depending on the rendering resources available.
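A sketch of that second method, assuming your error component can run a small DOM helper when it mounts (the function name is illustrative):

    // Inject a robots noindex tag when the client router lands on an unknown route.
    // This only helps if Googlebot renders the JavaScript and picks up the injected tag.
    function markRouteAsNotFound(): void {
      const meta = document.createElement("meta");
      meta.name = "robots";
      meta.content = "noindex";
      document.head.appendChild(meta);
    }

Call it from the error route's mount hook, before any other indexable content is rendered.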
The third method, the automatic soft 404, consists of doing nothing special and letting Google detect that the page is empty or useless. Martin Splitt explicitly discourages this because it relies on algorithmic interpretation, which can fail if the error content is too rich or if the page contains structural elements that resemble real content.
- An SPA that does not properly handle 404s exposes Google's index to massive pollution from useless pages.
- The 404 HTTP code remains the most reliable signal for Google — always prefer this method when possible.
- The noindex tag is a good second choice if you cannot easily modify the server code.
- Soft 404s should only be a last resort, and only if the other two methods are technically impossible to implement.
- Systematically test your error pages with tools like Google Search Console or Screaming Frog to ensure that the HTTP code or noindex tag is properly served.
SEO Expert opinion
Is this recommendation consistent with field observations?
Yes, totally. We regularly observe SPAs with dozens or even hundreds of indexed error pages simply because the server returns a 200 OK on non-existent routes. Search Console often reports these URLs as 'Indexed, though blocked by robots.txt' or 'Crawled, currently not indexed,' indicating algorithmic confusion.
Sites that implement a real 404 server code never encounter this problem. This is empirical evidence that the method recommended by Google actually works. In contrast, relying on automatic soft 404s exposes you to rogue indexing that can persist for months before Google cleans it up — if it ever does.
What nuances should we add to this statement?
Martin Splitt does not address a crucial point: Googlebot's JavaScript rendering performance. If your SPA takes 5 seconds to load and Googlebot gives up before the noindex tag is injected, you are in a grey area. [To be verified] on projects with tight crawl budgets or poor JS performance.
Another nuance: client-side redirects (via window.location, for example) are not always treated like standard HTTP redirects by Google. If you redirect to /404 via JavaScript, make sure this URL indeed returns a 404 server code, otherwise, you are just shifting the problem. The redirection must be server or SSR, not purely client-side.
In what cases does this rule not apply or become more complex?
If you're using a modern framework with SSR/SSG (Next.js, Nuxt, SvelteKit), managing 404s is often native and correctly implemented out-of-the-box. These frameworks return server 404 codes by default for non-existent routes, which resolves the problem at its root. But be careful: if you have customized routing logic or are serving client-only rendering, you fall back into the problematic case.
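For example, in Next.js with the pages router, returning notFound from getServerSideProps makes the framework serve its error page with a real 404 HTTP status; the data loader below is a hypothetical stand-in for your own API call, and other frameworks have an equivalent mechanism documented.

    import type { GetServerSideProps } from "next";

    // Hypothetical data loader standing in for your own API or database call.
    declare function fetchArticle(slug: string): Promise<Record<string, unknown> | null>;

    export const getServerSideProps: GetServerSideProps = async ({ params }) => {
      const article = await fetchArticle(String(params?.slug));
      if (!article) {
        // Next.js renders its 404 page and sends a real 404 HTTP status.
        return { notFound: true };
      }
      return { props: { article } };
    };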
Another complex case involves partially dynamic pages, where part of the content is valid but a subsection is missing. Should you return a 404 for the entire page or just hide the section? Google has never clearly ruled on this. My view: if the main content remains accessible and relevant, keep a 200. If the page loses all sense without the missing resource, switch to 404 or noindex.
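A short sketch of that decision rule, with an illustrative Express handler and hypothetical data loaders (loadProduct, loadReviews) standing in for your own back end:

    import express from "express";

    // Hypothetical loaders: main content vs. optional subsection.
    declare function loadProduct(id: string): Promise<Record<string, unknown> | null>;
    declare function loadReviews(id: string): Promise<Record<string, unknown>[] | null>;

    const app = express();

    app.get("/product/:id", async (req, res) => {
      const product = await loadProduct(req.params.id);
      if (!product) {
        // Main content missing: the page makes no sense, answer with a hard 404.
        return res.status(404).send("Product not found");
      }
      // Secondary content missing: keep the 200 and degrade gracefully.
      const reviews = (await loadReviews(req.params.id)) ?? [];
      res.status(200).json({ product, reviews });
    });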
Practical impact and recommendations
What concrete steps should be taken to fix 404s in SPAs?
Start with an audit of your non-existent routes. Manually test several random URLs that do not exist on your site, then check the HTTP code returned with a tool like curl -I https://yoursite.com/non-existent-page. If you see a 200, you have a problem.
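If you prefer scripting the audit, here is a small sketch for Node 18+ (which ships a global fetch); the base URL and probe paths are placeholders to adapt to your site.

    // Probe made-up URLs and flag any that do not answer with a 404 status.
    const BASE_URL = "https://yoursite.com";
    const probes = [
      "/this-page-does-not-exist",
      "/blog/fake-article-12345",
      "/category/does-not-exist/",
    ];

    async function auditNotFound(): Promise<void> {
      for (const p of probes) {
        // redirect: "manual" so a 301/302 to the homepage is not silently followed
        const res = await fetch(BASE_URL + p, { redirect: "manual" });
        const verdict = res.status === 404 ? "OK     " : "PROBLEM";
        console.log(`${verdict} ${res.status} ${p}`);
      }
    }

    auditNotFound();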
Next, choose your method. If you have control over the server or are using an SSR framework, implement a redirect to a /404 page that returns a real 404 code. If you're stuck on client-only, inject a <meta name="robots" content="noindex"> tag in the <head> of your error component. Then test with Google Search Console through the URL inspection tool to confirm Googlebot is reading the tag correctly.
What mistakes must absolutely be avoided in this implementation?
Never redirect all your 404s to the homepage with a 301 or 302 code. This is an archaic practice that creates massive soft 404s and dilutes your internal link structure. Google hates this and may even penalize the homepage if it receives too many inconsistent redirects.
Another common mistake: adding a noindex but leaving the page crawlable and linked from menus or footers. The result: Googlebot keeps crawling these URLs over and over, wasting crawl budget for no reason. If you use noindex, ensure that these pages are never linked from your site. Ideally, they should also carry nofollow if they contain outgoing links.
How can I verify that my site is compliant after correction?
Run a full crawl with Screaming Frog or Sitebulb, enabling JavaScript rendering. Filter the URLs returning a 200 code whose <title> or <h1> contains words like 'not found' or 'error.' These are your soft 404 candidates.
Then check in Google Search Console for the evolution of the number of indexed pages. If you had hundreds of indexed 404s, you should see this number gradually decrease after correction — it can take 2 to 6 weeks depending on crawl frequency. Also, use the 'Coverage' report to spot URLs marked as 'Soft 404' or 'Not Found (404)' and confirm they are being handled as you wish.
- Audit the HTTP codes of your non-existent routes with curl or Screaming Frog
- Implement a redirect to /404 with a 404 HTTP code, or add a noindex tag on the client side
- Never redirect 404s to the homepage with a 301
- Test the implementation with the URL inspection tool in Search Console
- Crawl the site with JavaScript rendering to find remaining soft 404s
- Monitor the evolution of the number of indexed pages over 4 to 8 weeks
❓ Frequently Asked Questions
Why are soft 404s considered less reliable?
Can you combine several methods to handle 404s in an SPA?
Does the noindex tag completely prevent a 404 page from being indexed?
Should you avoid 301 redirects to the homepage when handling 404s?
How can you check that an SPA correctly returns a 404 code server-side?