Official statement
Google offers three methods for signaling a 404 in an SPA with client-side routing: redirecting to a URL that the server answers with a real 404 status code, adding a noindex tag, or relying on automatic soft 404 detection. The first two ensure that no error pages are mistakenly indexed. The third, the soft 404, remains less reliable and carries a risk of unwanted indexing, according to Martin Splitt.
What you need to understand
What specific problems do SPAs pose with 404 errors?
Single Page Applications (SPAs) manage navigation on the client side via JavaScript, without fully reloading the page. When a user accesses a non-existent route, the server often returns a 200 HTTP code with the app's shell, and then JavaScript detects the error and displays a client-side message.
For Googlebot, this is a real puzzle. The crawler sees a page that responds 200 OK and may index it, even if the displayed content says 'Page not found.' The result: error pages pollute the index, waste crawl budget, and create a disastrous user experience in the SERPs.
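To see why this happens, here is a minimal sketch of a typical SPA host, assuming a Node/Express static server (the file paths and port are illustrative): the catch-all route serves the application shell with a 200 for any path, including ones that do not exist.

    // Hypothetical Express setup illustrating the problem: the server knows nothing
    // about client-side routes, so every path answers 200 OK with the app shell.
    import express from "express";
    import path from "path";

    const app = express();

    // Static assets (JS bundles, CSS) are served normally.
    app.use(express.static(path.join(__dirname, "dist")));

    // Catch-all: even /non-existent-article gets index.html with 200 OK.
    // The client router then shows 'Page not found', but Googlebot already saw a 200.
    app.get("*", (_req, res) => {
      res.sendFile(path.join(__dirname, "dist", "index.html"));
    });

    app.listen(3000);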
What is the difference between a hard 404 and a soft 404?
A hard 404 explicitly returns a 404 HTTP code in the response header. Google immediately understands that it should not index this URL. This is the clean method that leaves no ambiguity.
A soft 404 occurs when the server returns a 200 but the content clearly indicates an error: empty page, 'not found' message, insufficient content. Google attempts to detect these situations automatically through heuristic signals: content size, absence of internal links, linguistic patterns. However, this detection remains imperfect and carries a risk of false negatives.
How do the three methods proposed by Google differ?
The first method consists of redirecting on the client side to a server URL that returns a real 404 code. For example, if a user types /non-existent-article, JavaScript detects the error and redirects to /404, which the server serves with a 404 HTTP code. It’s clean, explicit, and completely reliable.
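A minimal sketch of that first method, assuming a history-API router on the client and an Express server; the /404 route and file names are illustrative choices, not a prescribed API.

    // spa-router.ts (client, illustrative): when no route matches, leave the SPA
    // and navigate to a URL the server actually owns.
    export function handleUnknownRoute(): void {
      // replace() keeps the dead URL out of the browser history
      window.location.replace("/404");
    }

    // server.ts (Node/Express, illustrative): /404 answers with a genuine 404 status.
    import express from "express";
    import path from "path";

    const app = express();
    app.use(express.static(path.join(__dirname, "dist")));

    app.get("/404", (_req, res) => {
      res.status(404).sendFile(path.join(__dirname, "dist", "404.html"));
    });

    app.listen(3000);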
The second method adds a <meta name="robots" content="noindex"> tag in the <head> of the client-side error page. The server still returns a 200, but Google reads the tag and refuses to index the URL. This works well if Googlebot can execute JavaScript and read the tag, which is generally the case but not 100% guaranteed, depending on the rendering resources available.
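A sketch of that second method, assuming your error component can run a small DOM helper when it mounts (the function name is illustrative):

    // Inject a robots noindex tag when the client router lands on an unknown route.
    // This only helps if Googlebot renders the JavaScript and picks up the injected tag.
    function markRouteAsNotFound(): void {
      const meta = document.createElement("meta");
      meta.name = "robots";
      meta.content = "noindex";
      document.head.appendChild(meta);
    }

Call it from the error route's mount hook, before any other indexable content is rendered.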
The third method, the automatic soft 404, consists of doing nothing special and letting Google detect that the page is empty or useless. Martin Splitt explicitly discourages this because it relies on algorithmic interpretation, which can fail if the error content is too rich or if the page contains structural elements that resemble real content.
- An SPA that does not properly handle 404s exposes Google's index to massive pollution from useless pages.
- The 404 HTTP code remains the most reliable signal for Google — always prefer this method when possible.
- The noindex tag is a good second choice if you cannot easily modify the server code.
- Soft 404s should only be a last resort, and only if the other two methods are technically impossible to implement.
- Systematically test your error pages with tools like Google Search Console or Screaming Frog to ensure that the HTTP code or noindex tag is properly served.
SEO Expert opinion
Is this recommendation consistent with field observations?
Yes, totally. We regularly observe SPAs with dozens or even hundreds of indexed error pages simply because the server returns a 200 OK on non-existent routes. Search Console often reports these URLs as 'Indexed, though blocked by robots.txt' or 'Crawled, currently not indexed,' indicating algorithmic confusion.
Sites that implement a real 404 server code never encounter this problem. This is empirical evidence that the method recommended by Google actually works. In contrast, relying on automatic soft 404s exposes you to rogue indexing that can persist for months before Google cleans it up — if it ever does.
What nuances should we add to this statement?
Martin Splitt does not address a crucial point: Googlebot's JavaScript rendering performance. If your SPA takes 5 seconds to load and Googlebot gives up before the noindex tag is injected, you are in a grey area. [To be verified] on projects with tight crawl budgets or poor JS performance.
Another nuance: client-side redirects (via window.location, for example) are not always treated like standard HTTP redirects by Google. If you redirect to /404 via JavaScript, make sure this URL indeed returns a 404 server code, otherwise, you are just shifting the problem. The redirection must be server or SSR, not purely client-side.
In what cases does this rule not apply or become more complex?
If you're using a modern framework with SSR/SSG (Next.js, Nuxt, SvelteKit), managing 404s is often native and correctly implemented out-of-the-box. These frameworks return server 404 codes by default for non-existent routes, which resolves the problem at its root. But be careful: if you have customized routing logic or are serving client-only rendering, you fall back into the problematic case.
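For example, in Next.js with the pages router, returning notFound from getServerSideProps makes the framework serve its error page with a real 404 HTTP status; the data loader below is a hypothetical stand-in for your own API call, and other frameworks have an equivalent mechanism documented.

    import type { GetServerSideProps } from "next";

    // Hypothetical data loader standing in for your own API or database call.
    declare function fetchArticle(slug: string): Promise<Record<string, unknown> | null>;

    export const getServerSideProps: GetServerSideProps = async ({ params }) => {
      const article = await fetchArticle(String(params?.slug));
      if (!article) {
        // Next.js renders its 404 page and sends a real 404 HTTP status.
        return { notFound: true };
      }
      return { props: { article } };
    };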
Another complex case involves partially dynamic pages, where part of the content is valid but a subsection is missing. Should you return a 404 for the entire page or just hide the section? Google has never clearly ruled on this. My view: if the main content remains accessible and relevant, keep a 200. If the page loses all sense without the missing resource, switch to 404 or noindex.
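A short sketch of that decision rule, with an illustrative Express handler and hypothetical data loaders (loadProduct, loadReviews) standing in for your own back end:

    import express from "express";

    // Hypothetical loaders: main content vs. optional subsection.
    declare function loadProduct(id: string): Promise<Record<string, unknown> | null>;
    declare function loadReviews(id: string): Promise<Record<string, unknown>[] | null>;

    const app = express();

    app.get("/product/:id", async (req, res) => {
      const product = await loadProduct(req.params.id);
      if (!product) {
        // Main content missing: the page makes no sense, answer with a hard 404.
        return res.status(404).send("Product not found");
      }
      // Secondary content missing: keep the 200 and degrade gracefully.
      const reviews = (await loadReviews(req.params.id)) ?? [];
      res.status(200).json({ product, reviews });
    });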
Practical impact and recommendations
What concrete steps should be taken to fix 404s in SPAs?
Start with an audit of your non-existent routes. Manually test several random URLs that do not exist on your site, then check the HTTP code returned with a tool like curl -I https://yoursite.com/non-existent-page. If you see a 200, you have a problem.
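If you prefer scripting the audit, here is a small sketch for Node 18+ (which ships a global fetch); the base URL and probe paths are placeholders to adapt to your site.

    // Probe made-up URLs and flag any that do not answer with a 404 status.
    const BASE_URL = "https://yoursite.com";
    const probes = [
      "/this-page-does-not-exist",
      "/blog/fake-article-12345",
      "/category/does-not-exist/",
    ];

    async function auditNotFound(): Promise<void> {
      for (const p of probes) {
        // redirect: "manual" so a 301/302 to the homepage is not silently followed
        const res = await fetch(BASE_URL + p, { redirect: "manual" });
        const verdict = res.status === 404 ? "OK     " : "PROBLEM";
        console.log(`${verdict} ${res.status} ${p}`);
      }
    }

    auditNotFound();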
Next, choose your method. If you have control over the server or are using an SSR framework, implement a redirect to a /404 page that returns a real 404 code. If you're stuck on client-only, inject a <meta name="robots" content="noindex"> tag in the <head> of your error component. Then test with Google Search Console through the URL inspection tool to confirm Googlebot is reading the tag correctly.
What mistakes must absolutely be avoided in this implementation?
Never redirect all your 404s to the homepage with a 301 or 302 code. This is an archaic practice that creates massive soft 404s and dilutes your internal link structure. Google hates this and may even penalize the homepage if it receives too many inconsistent redirects.
Another common mistake: adding a noindex but leaving the page crawlable and linked from menus or footers. The result: Googlebot keeps crawling these URLs over and over, wasting crawl budget for no reason. If you use noindex, ensure that these pages are never linked from your site. Ideally, they should also carry nofollow if they contain outgoing links.
How can I verify that my site is compliant after correction?
Run a full crawl with Screaming Frog or Sitebulb, enabling JavaScript rendering. Filter the URLs returning a 200 code whose <title> or <h1> contains words like 'not found' or 'error.' These are your soft 404 candidates.
Then check in Google Search Console for the evolution of the number of indexed pages. If you had hundreds of indexed 404s, you should see this number gradually decrease after correction — it can take 2 to 6 weeks depending on crawl frequency. Also, use the 'Coverage' report to spot URLs marked as 'Soft 404' or 'Not Found (404)' and confirm they are being handled as you wish.
- Audit the HTTP codes of your non-existent routes with curl or Screaming Frog
- Implement a redirect to /404 with a 404 HTTP code, or add a noindex tag on the client side
- Never redirect 404s to the homepage with a 301
- Test the implementation with the URL inspection tool in Search Console
- Crawl the site with JavaScript rendering to find remaining soft 404s
- Monitor the evolution of the number of indexed pages over 4 to 8 weeks
❓ Frequently Asked Questions
Why are soft 404s considered less reliable?
Can you combine several methods to handle 404s in an SPA?
Does the noindex tag completely prevent a 404 page from being indexed?
Should you avoid 301 redirects to the homepage when handling 404s?
How can you check that an SPA correctly returns a 404 code server-side?