Is serving a 404 to Googlebot while showing a 200 to visitors really cloaking?

Official statement

In a pre-rendered React SPA, serving an HTTP 404 code to Googlebot (via pre-render) while the user sees a 200 error page is generally not considered cloaking, unless you are doing something really dubious. If the 200 page for the user is also an error page, Google will detect it as a soft 404.

Source video

Extracted from a Google Search Central video: duration 51:17, language EN, published 12/05/2020, statement at 9:35, 37 statements extracted.

Official statement from May 12, 2020 (5 years ago)

A more recent statement exists on this topic: “Is returning HTTP 200 on a 404 page really cloaking or just a soft 404?” (Gary Illyes, December 18, 2023).

TL;DR

Google allows a pre-rendered React SPA to serve an HTTP 404 code to Googlebot while the user sees a 200 displaying an error. This is generally not considered cloaking, unless there is clear manipulation. If the 200 page for the user is indeed an error page, Google will detect it as a soft 404 anyway.

What you need to understand

Why is there tolerance for divergent HTTP codes?

In modern Single Page Applications, pre-rendering has become essential for facilitating indexing. Tools like Prerender.io or Rendertron intercept Googlebot's requests and serve them static HTML, while human visitors load the client-side JavaScript application.

The issue arises when a route does not exist: the SPA visually displays an error message with a 200 code, while the pre-rendering sends a 404 to Googlebot. Technically, this involves serving different content based on the user-agent — the classic definition of cloaking. However, Google recognizes that the intent here is not fraudulent.
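The divergence described above can be sketched in a few lines. This is a minimal illustration, not the behavior of Prerender.io or Rendertron: the route table `KNOWN_ROUTES`, the marker list `BOT_MARKERS`, and the `handle` function are hypothetical names invented for the example.

```python
from dataclasses import dataclass

# Hypothetical route table: the only paths this SPA knows how to render.
KNOWN_ROUTES = {"/", "/products", "/about"}

# Substrings used to recognise crawlers; an illustrative, incomplete list.
BOT_MARKERS = ("googlebot", "bingbot")


@dataclass
class Response:
    status: int
    body: str


def is_crawler(user_agent: str) -> bool:
    """Return True when the User-Agent header looks like a known crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def handle(path: str, user_agent: str) -> Response:
    """Serve pre-rendered HTML to crawlers and the JS shell to everyone else."""
    if is_crawler(user_agent):
        if path in KNOWN_ROUTES:
            return Response(200, f"<html><!-- pre-rendered {path} --></html>")
        # Pre-rendering knows the route is missing, so it can send a real 404.
        return Response(404, "<html><h1>Page not found</h1></html>")
    # Browsers always receive the same shell with a 200; the client-side
    # router then draws either the page or its own "not found" view.
    return Response(200, "<html><div id='root'></div><script src='/app.js'></script></html>")
```

For a missing route, the crawler branch returns 404 while the browser branch returns 200, which is exactly the gap the statement tolerates as long as the browser ends up showing an error view.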

This statement addresses a legitimate concern: many developers fear that optimizing the Googlebot experience through pre-rendering may trigger a manual penalty. Martin Splitt settles the question for standard cases.

What defines the line between optimization and manipulation?

The critical nuance lies in the expression ‘something really dubious’. Google does not precisely define this threshold, but the context suggests that it pertains to cases where the 200 content for the user would be rich and functional, while Googlebot would systematically receive 404s to conceal entire sections.

If the user indeed sees an error page (“Page not found”, “This resource no longer exists”), then serving a 404 to Googlebot aligns with the reality of the user experience. It is even more honest than allowing Google to index an empty 200 that would trigger a soft 404.

How does Google detect soft 404s?

A soft 404 occurs when a server returns a 200 code for a page that logically should be a 404: very sparse content or a visible error message. Google does not publish the exact signals it uses, but detection relies on analyzing the content rather than the status code. Plausible heuristics include the amount of text relative to markup, phrasing typical of error pages, and the absence of the usual structural elements.

In the case of a SPA, if the 200 page served to visitors is indeed an error, Google will classify it as a soft 404 even without receiving the appropriate HTTP code. That is why serving the true 404 to Googlebot via pre-rendering is actually an improvement: you align the HTTP signal with the content reality.
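To make the idea of content-based detection concrete, here is a rough sketch of a soft-404 check you could run on your own pages. It is not Google's algorithm: the phrase list `ERROR_PHRASES` and the `MIN_WORDS` threshold are assumptions chosen for illustration.

```python
import re

# Illustrative phrases only; Google does not publish its real signals.
ERROR_PHRASES = ("page not found", "no longer exists", "404", "not available")

# Assumed threshold: pages with fewer visible words than this count as sparse.
MIN_WORDS = 50


def visible_text(html: str) -> str:
    """Crudely strip scripts, styles and tags to approximate visible text."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()


def looks_like_soft_404(status: int, html: str) -> bool:
    """Flag a 200 response whose content reads like an error page."""
    if status != 200:
        return False  # a real 404/410 is a hard error, not a soft one
    text = visible_text(html).lower()
    has_error_phrase = any(phrase in text for phrase in ERROR_PHRASES)
    is_sparse = len(text.split()) < MIN_WORDS
    return has_error_phrase and is_sparse
```

Requiring both an error phrase and sparse content keeps long, legitimate pages that merely mention "404" from being flagged.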

Pre-rendered 404 for Googlebot + visual error 200 page for the user: tolerated by Google, considered a legitimate technical optimization.
Soft 404: detected through content analysis, not just the HTTP code — Google identifies empty pages or error messages even with a 200.
Cloaking prohibited: hiding existing content from Googlebot by serving systematic 404s or showing rich content to the bot while displaying an error to visitors.
Vague threshold: Google does not clarify ‘really dubious’ — caution dictates documenting any divergence and ensuring it accurately reflects the user experience.

SEO Expert opinion

Is this statement consistent with field observations?

Overall, yes. For years, sites using pre-rendering with differentiated HTTP codes have not faced visible manual penalties, as long as the user experience remains consistent. Comparison crawls between Googlebot and standard browsers show that Google tolerates these technical discrepancies when they support indexing.

However, the phrase ‘something really dubious’ remains vague. [To be verified]: no numerical metric or precise tolerance threshold is provided. The call is left to the subjective judgment of Google's webspam reviewers or its anti-spam algorithms. A site may fly under the radar for months and then be flagged if its patterns shift.

What gray areas should be monitored?

The real danger is not a legitimate SPA 404, but a gradual drift: a developer starts serving 404s for marginal pages ‘just to see’, then extends the practice to entire categories. Or worse, serves a rich 200 to the visitor and a 404 to Googlebot to control indexing instead of using noindex.

Another trap: pre-rendering configuration errors. I’ve seen sites where the pre-rendering service served 404s by default due to a poorly calibrated timeout, while the JavaScript page eventually finished loading on the client-side. Google can interpret this as instability, or even manipulation if the pattern is systematic.

Warning: if your pre-rendering serves 404s for pages that truly exist on the client side and are not errors, you cross the red line. Google may be able to detect this, whether through behavioral signals or through its own headless Chrome rendering, although it does not document how.

In what cases does this rule not apply?

This tolerance only concerns true error pages. If you serve a 404 to Googlebot for an active product page, a service sheet, or a blog post, that is pure cloaking — even if the user sees a 200 with content.

Similarly, if you use this technique to hide duplicate content or low-quality pages that you do not want to index, Google might see it as a manipulation attempt. The best practice remains using noindex or canonical, not a selective 404 based on user-agent.

Practical impact and recommendations

What concrete steps should you take to stay compliant?

First, audit your pre-rendering: list all the routes serving a different HTTP code to Googlebot compared to users. Document each case with a clear justification — ‘page removed’, ‘invalid parameter’, ‘resource never created’. If you cannot justify the divergence in 10 seconds, it is probably a red flag.

Next, test the visual consistency: navigate to the URLs that return a 404 to Googlebot. Does the user actually see an error message, a broken layout, or empty content? If the page displays useful content, align the HTTP code — serve a 200 everywhere or a 404 everywhere.
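The two audit steps above can be combined into a simple classification rule. The sketch below assumes you have already collected, for each route, the status sent to Googlebot, the status sent to browsers, and whether the visitor sees an error. The `RouteCheck` record and the labels returned by `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RouteCheck:
    path: str
    bot_status: int          # status pre-rendering sends to Googlebot
    user_status: int         # status the browser receives
    user_sees_error: bool    # result of a manual or automated visual check


def classify(check: RouteCheck) -> str:
    """Label one route according to the audit rules described above."""
    if check.bot_status == check.user_status:
        if check.user_status == 200 and check.user_sees_error:
            return "soft-404 risk: error content served with 200 to everyone"
        return "consistent"
    if check.bot_status == 404 and check.user_status == 200:
        if check.user_sees_error:
            return "justified divergence: user also sees an error"
        return "red flag: useful content hidden from Googlebot"
    return "review manually"
```

Running `classify` over every audited route gives a short list of red flags to fix first, and a list of justified divergences to document.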

What mistakes should you absolutely avoid?

Never create a whitelist/blacklist based on user-agent to serve strategic 404s. This is exactly the pattern that anti-spam algorithms detect. If you need to block indexing, use the noindex meta tag or the X-Robots-Tag header; robots.txt only blocks crawling, and a blocked URL can still be indexed without its content.
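As an illustration of the header-based alternative, the sketch below attaches `X-Robots-Tag: noindex` based on the path alone, so every client receives the same response. The prefixes in `NOINDEX_PREFIXES` and the `response_headers` helper are hypothetical.

```python
# Hypothetical path prefixes that should stay out of the index.
NOINDEX_PREFIXES = ("/internal-search", "/print/")


def response_headers(path: str) -> dict[str, str]:
    """Build headers for a path; the User-Agent is deliberately not an input."""
    headers = {"Content-Type": "text/html; charset=utf-8"}
    if path.startswith(NOINDEX_PREFIXES):
        # X-Robots-Tag is the HTTP-header equivalent of <meta name="robots">.
        headers["X-Robots-Tag"] = "noindex"
    return headers
```

Because the function never looks at the user-agent, it cannot drift into serving different signals to Googlebot and to visitors.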

Avoid the default 404: some pre-rendering services are configured to return 404s in case of timeout or JavaScript errors. This can mask real bugs and create an illegitimate gap with the user experience. Configure generous timeouts and log rendering errors.
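One way to avoid accidental 404s is to separate "the route does not exist" from "the renderer failed". The sketch below is an assumption about how such a mapping could look, not the configuration of any particular pre-rendering service. The `Outcome` enum and `status_for` function are hypothetical, and answering 503 on failure is a suggested choice, since that status signals a temporary problem rather than a missing page.

```python
import logging
from enum import Enum

log = logging.getLogger("prerender")


class Outcome(Enum):
    RENDERED = "rendered"          # page rendered normally
    ROUTE_NOT_FOUND = "not_found"  # the app's router reported a missing route
    TIMEOUT = "timeout"            # rendering did not finish in time
    JS_ERROR = "js_error"          # the page threw while rendering


def status_for(outcome: Outcome, path: str) -> int:
    """Map a pre-render outcome to the HTTP status sent to the crawler."""
    if outcome is Outcome.RENDERED:
        return 200
    if outcome is Outcome.ROUTE_NOT_FOUND:
        return 404
    # Timeouts and script errors are failures of the renderer, not proof that
    # the page is missing: log them and answer 503 so the crawler retries later.
    log.warning("pre-render failed for %s: %s", path, outcome.value)
    return 503
```

With this split, a 404 is only ever sent when the application itself reports that the route does not exist.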

How can I check if my implementation is compliant?

Use the URL Inspection tool in Google Search Console to compare the version rendered by Googlebot with the one viewed in a standard browser. Look for discrepancies in HTTP codes, but also in content — a massive gap signals a problem.

Monitor the Page indexing report (formerly Index Coverage) for spikes in soft 404s. URLs for which pre-rendering sends a clean 404 should appear under ‘Not found (404)’, which confirms that the HTTP signal is reaching Google. If they show up as soft 404s instead, Googlebot received a 200 with error-like content, so dig deeper: pre-rendering may not be applied to those routes.

Document each route serving a different HTTP code to Googlebot versus users, with clear business justification.
Manually test pre-rendered URLs as 404s: does the user actually see a visual error page?
Configure generous timeouts on the pre-rendering service to avoid accidental 404s due to JavaScript delays.
Use the URL Inspection tool in Search Console to compare Googlebot rendering and standard browser rendering.
Monitor coverage reports to spot anomalies (mass soft 404s, unexpected de-indexing).
Avoid any whitelist/blacklist user-agent logic; prefer noindex, X-Robots-Tag, or canonical to control indexing.

Martin Splitt's statement explicitly allows a common technical pattern in SPA architectures, provided that the HTTP code discrepancy accurately reflects the user experience. Specifically, if your page is a true error for the visitor, serving a clean 404 to Googlebot via pre-rendering is actually recommended — it’s more honest than a soft 404. However, implementing this requires a nuanced understanding of pre-rendering mechanisms, monitoring tools, and Search Console signals. If you manage a complex SPA with thousands of dynamic routes, auditing and configuration can quickly become time-consuming. In this context, working with an SEO agency specializing in JavaScript and modern architectures can accelerate compliance and avoid costly errors that impact indexing.

❓ Frequently Asked Questions

Is serving a 404 to Googlebot and a 200 to visitors always allowed?

Yes, if the 200 page for the user is actually a visual error page. Google tolerates this gap when it accurately reflects the user experience, not when it is used to hide content.

What is a soft 404 and how does Google detect it?

A soft 404 is a page that returns a 200 code but displays error content. Google detects it through content heuristics: a low text/HTML ratio, linguistic patterns typical of error pages, degraded UX signals.

Can I use this technique to de-index low-quality pages?

No, that is considered manipulation. To control indexing, use the noindex meta tag or the X-Robots-Tag header, not a selective 404 based on user-agent.

How can I check that my pre-rendering does not create unintentional cloaking?

Use the URL Inspection tool in Search Console to compare the Googlebot rendering with that of a standard browser. Also monitor the coverage reports to detect unexpected soft 404s.

What are the risks if I misconfigure my pre-rendering service?

A timeout that is too short can serve accidental 404s to Googlebot for valid pages, creating an illegitimate gap with the user experience. Google may interpret this as instability or manipulation.
