Official statement
Google confirms that serializing application state as JSON within the DOM (a common practice in SSR) is not treated as penalizable duplicate content. Only the visible, rendered DOM is indexed, not the JSON embedded in script tags. For React, Vue, or Angular sites using SSR, this means you can keep hydrating state without fear of semantic dilution or penalties.
What you need to understand
Why does this question come up in the first place?
When you're doing Server-Side Rendering (SSR) with a modern framework, your server generates a complete HTML page with the content already rendered. Up to this point, nothing complicated.
However, for your app to take over on the client side without re-fetching everything, you serialize the application state (the data used for rendering) in a <script type="application/json"> tag. The result: the same content appears twice in the source HTML — once in the visible DOM, once in the JSON.
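To make that duplication concrete, here is a minimal SSR sketch. All names (PageState, renderPage, __APP_STATE__) are illustrative, not tied to any specific framework:

```typescript
// The same product title ends up once in the rendered markup and once in
// the serialized state the client-side app will hydrate from.
interface PageState {
  product: { id: number; title: string };
}

function renderPage(state: PageState): string {
  const json = JSON.stringify(state);
  return `<!DOCTYPE html>
<html>
  <body>
    <!-- 1st occurrence: the visible, rendered DOM -->
    <h1>${state.product.title}</h1>
    <!-- 2nd occurrence: the same data, serialized for hydration -->
    <script type="application/json" id="__APP_STATE__">${json}</script>
  </body>
</html>`;
}

console.log(renderPage({ product: { id: 42, title: "Example product" } }));
```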
And naturally, SEO practitioners wondered whether Google would consider this as internal duplicate content, with all the implications: semantic dilution, potential cannibalization, or even a spam signal in extreme cases.
What exactly does Google say about this?
Martin Splitt cuts to the chase: Google only indexes the rendered DOM, not the serialized JSON. Even though the crawler technically sees both, only the visible content in the DOM tree after parsing counts for indexing.
In practical terms, if your SSR injects a 50 KB JSON block with all your React props, Google completely ignores it for semantic ranking. It looks at what displays in the browser after hydration, end of story.
This is a welcome clarification because it removes uncertainty that led some to fiddle with suboptimal solutions — like loading state via a separate endpoint, which disrupts user experience and increases Time to Interactive.
What are the technical implications for a production site?
If you're working on a Next.js, Nuxt, or SvelteKit site with SSR, you can continue to use __NEXT_DATA__, __NUXT__, or the equivalent without worrying. These mechanisms are specifically designed for this: transferring server state to the client.
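If you want to see the mechanism at work, you can inspect the tag from the browser console. A hedged sketch for Next.js (pages router), where the __NEXT_DATA__ script tag is a framework convention whose exact contents should be treated as internal:

```typescript
// Read the state Next.js serializes into the page.
const tag = document.getElementById("__NEXT_DATA__");
if (tag?.textContent) {
  const nextData = JSON.parse(tag.textContent);
  console.log("Server-rendered route:", nextData.page);
  console.log("Serialized payload size (chars):", tag.textContent.length);
}
```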
However, that doesn’t mean anything goes. If your serialized JSON contains sensitive data (tokens, private user info), it remains visible in the HTML source. This is a security issue, not an SEO one, but it's good to keep in mind.
Another nuance: Google doesn't use the JSON for indexing, but the crawler still downloads it and may store it. If you're shipping 200 KB of JSON on every page, that's crawl budget consumed for nothing. Optimize the size of your payload even if it doesn't directly impact ranking.
- Google only indexes the rendered DOM, never the JSON serialized in script tags.
- Modern frameworks (Next, Nuxt, Gatsby, etc.) can continue to hydrate the state without fear of penalizing duplication.
- Watch out for sensitive data in the JSON: it remains exposed in the HTML source.
- An oversized JSON payload can impact crawl budget even if it doesn’t affect semantic indexing.
- This rule applies to <script type="application/json"> tags or similar, not to executable scripts that could modify the DOM.
SEO Expert opinion
Does this statement align with what we observe on the ground?
Honestly? Yes, and it’s consistent with what we've known about Google's rendering pipeline for years. Googlebot parses the HTML, builds the DOM, executes JS if necessary, and indexes the final result. Non-executable script tags (like type="application/json") are ignored.
What’s new is that Martin Splitt formalizes this clearly. Previously, we had to deduce this behavior from empirical tests and scattered bits of information. Now, we have an official position — and it changes everything for SPA/SSR projects that were still hesitating.
However, I have seen cases where developers serialized JSON into visible DOM elements (like a <div style="display:none"> with JSON inside). There, Google can indeed treat it as hidden content, and that could pose a problem. The key is that the JSON must live in a non-rendered context (a properly typed script tag), not in a content element of the page.
What limits should you keep in mind?
First, this rule only applies to content serialized in script tags. If you're duplicating your content elsewhere — like in oversized data-* attributes, within hidden HTML comments, or in hidden iframes — that's another story.
Next, be careful not to confuse "not a problem for indexing" with "no performance impact." A 300 KB inline JSON bloats the HTML, delays download and parsing, and can degrade Core Web Vitals. It's not penalized as duplicate content, but it can hurt you on other criteria.
Last point: Google says it doesn’t index JSON, but what about other engines? Do Bing, Yandex, Baidu follow the same logic? [To be verified] — we don't have an equivalent official statement from them. If you're optimizing for a multilingual market or alternative engines, keep a margin of caution.
Are there cases where this rule no longer holds?
Yes. If your serialized JSON contains structured content different from what displays in the DOM, it can create inconsistencies. For example, if your JSON lists 50 products but your DOM only displays 10, Google will index the 10, not the 50.
Another edge case: sites using JSON-LD for Schema.org markup. In this case, it’s intended — Google reads this JSON to extract structured data. But if you mix valid JSON-LD with serialized application JSON, make sure both remain in separate, correctly typed tags.
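A minimal sketch of that separation, with illustrative ids and values:

```typescript
// Structured data and application state in separate, correctly typed tags.
const jsonLd = {
  "@context": "https://schema.org",
  "@type": "Product",
  name: "Example product",
};
const appState = { productId: 42, inStock: true };

const headMarkup = `
  <!-- read by Google for rich results -->
  <script type="application/ld+json">${JSON.stringify(jsonLd)}</script>
  <!-- ignored for indexing, consumed by your hydration code -->
  <script type="application/json" id="__APP_STATE__">${JSON.stringify(appState)}</script>
`;
// (Real code should escape the payloads before inlining; see the
// encoding pitfall further down.)
```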
Practical impact and recommendations
What should you check on your site?
First step: open the source HTML (Ctrl+U or view source) of your SSR pages and locate the <script> tags containing your serialized state. Check that they have a type="application/json" attribute or equivalent — never type="text/javascript" if it’s just passive JSON.
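A quick way to run that check from the browser console (a generic sketch, nothing framework-specific):

```typescript
// List every inline script with its type so you can spot passive JSON
// served without the right attribute.
document.querySelectorAll("script:not([src])").forEach((s) => {
  const type = s.getAttribute("type") ?? "(none: treated as executable JS)";
  console.log(type, "→", (s.textContent ?? "").slice(0, 80));
});
```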
Next, use the URL Inspection tool in Search Console and compare the rendered HTML ("View crawled page", or "View tested page" after a live test) with your source HTML. If you see major differences between the two, your hydration is modifying the content, and that's a potential problem.
Also check the size of your JSON payload. If a page weighs 150 KB with 100 KB of serialized JSON, you have an architectural problem. Optimize by only serializing the data strictly necessary for hydration, not your entire Redux or Vuex state.
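A sketch of that filtering, with assumed store shapes:

```typescript
// Serialize only what hydration actually needs, not the whole store.
interface Store {
  products: Array<{ id: number; title: string; longDescription: string }>;
  ui: { theme: string };
  analyticsCache: Record<string, unknown>; // never needed on the client
}

function pickHydrationState(store: Store) {
  return {
    // IDs and flags, not full records: the DOM already carries the text
    productIds: store.products.map((p) => p.id),
    theme: store.ui.theme,
  };
}
```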
What mistakes should you absolutely avoid?
Never serialize your JSON in a visible DOM element (even hidden with CSS). This includes <div style="display:none">, misused <template>, or oversized data-* attributes. Google can interpret this as cloaking or hidden content, with the penalties that come with it.
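Side by side, the anti-pattern and the safe pattern (illustrative markup shown as strings):

```typescript
// BAD: JSON inside a real DOM element hidden with CSS. Google parses it
// as page content that users can't see.
const bad = `<div style="display:none" id="state">{"title":"Example product"}</div>`;

// GOOD: a non-executable script tag. Present in the source, absent from
// the rendered content tree.
const good = `<script type="application/json" id="state">{"title":"Example product"}</script>`;
```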
Also avoid serializing redundant data. If your JSON contains the exact same text as your DOM, word for word, you're wasting bandwidth and crawl budget. The idea is to serialize the application state (IDs, flags, small props), not to re-duplicate all the textual content.
Last classic pitfall: badly encoded inline scripts. If your JSON contains special characters (quotes, angle brackets, slashes) and you inject it incorrectly without escaping, you risk breaking the HTML or opening XSS vulnerabilities. Use your framework’s escaping functions (serialize-javascript, JSON.stringify + escape, etc.).
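If you roll your own, here is a minimal escaping sketch (the character list follows common practice; adapt it to your stack):

```typescript
// Neutralize the characters that can break out of an inline script tag
// before embedding JSON in HTML. A "</script>" inside a string value is
// the classic breakage; escaped as \u003c/script>, it stays inert.
function safeSerialize(state: unknown): string {
  return JSON.stringify(state)
    .replace(/</g, "\\u003c")
    .replace(/>/g, "\\u003e")
    .replace(/&/g, "\\u0026");
}

// Usage in a template: <script type="application/json">${safeSerialize(state)}</script>
```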
How can you ensure everything is working correctly?
Run an audit with Screaming Frog or Sitebulb with JavaScript mode enabled. Check that the indexable content (titles, text, structured data) is identical between the source HTML and the rendered DOM. If you see differences, dig deeper: it’s either a hydration issue or an SSR bug.
Also use Lighthouse or WebPageTest to measure the impact of serialized JSON on performance. If your Time to Interactive explodes due to a massive JSON payload, optimization is needed — either by lazy-loading certain data or moving state server-side (sessions, cookies).
Finally, test with varied User-Agents. Google says it ignores JSON, but what about Googlebot-Mobile vs Desktop? Third-party bots? A quick test with curl -A "Googlebot" will show you exactly what the crawler receives.
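The same check in code, for those who prefer scripting it. A sketch (Node 18+, global fetch, illustrative URL); note that it shows what your server serves to that user agent, not what Google renders afterwards:

```typescript
// Request a page with a Googlebot user agent and dump the raw HTML the
// crawler would receive. The UA string is Googlebot's documented desktop
// token.
const res = await fetch("https://example.com/some-ssr-page", {
  headers: {
    "User-Agent":
      "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  },
});
console.log(res.status);
console.log((await res.text()).slice(0, 500)); // first 500 chars of the HTML
```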
- Ensure your serialized JSON sits in a correctly typed <script type="application/json"> tag (or equivalent), never in a rendered DOM element.