Official statement
Google confirms that serializing application state as JSON within the DOM (a common practice in SSR) is not treated as penalizable duplicate content. Only the visible, rendered DOM is indexed, not the JSON embedded in script tags. For React, Vue, or Angular sites using SSR, this means you can keep hydrating state without fear of semantic dilution or penalties.
What you need to understand
Why does this question come up in the first place?
When you're doing Server-Side Rendering (SSR) with a modern framework, your server generates a complete HTML page with the content already rendered. Up to this point, nothing complicated.
However, for your app to take over on the client side without re-fetching everything, you serialize the application state (the data used for rendering) in a <script type="application/json"> tag. The result: the same content appears twice in the source HTML — once in the visible DOM, once in the JSON.
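To make that duplication concrete, here is a minimal SSR sketch. All names (PageState, renderPage, __APP_STATE__) are illustrative, not tied to any specific framework:

```typescript
// The same product title ends up once in the rendered markup and once in
// the serialized state the client-side app will hydrate from.
interface PageState {
  product: { id: number; title: string };
}

function renderPage(state: PageState): string {
  const json = JSON.stringify(state);
  return `<!DOCTYPE html>
<html>
  <body>
    <!-- 1st occurrence: the visible, rendered DOM -->
    <h1>${state.product.title}</h1>
    <!-- 2nd occurrence: the same data, serialized for hydration -->
    <script type="application/json" id="__APP_STATE__">${json}</script>
  </body>
</html>`;
}

console.log(renderPage({ product: { id: 42, title: "Example product" } }));
```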
And naturally, SEO practitioners wondered whether Google would consider this as internal duplicate content, with all the implications: semantic dilution, potential cannibalization, or even a spam signal in extreme cases.
What exactly does Google say about this?
Martin Splitt cuts to the chase: Google only indexes the rendered DOM, not the serialized JSON. Even though the crawler technically sees both, only the visible content in the DOM tree after parsing counts for indexing.
In practical terms, if your SSR injects a 50 KB JSON block with all your React props, Google completely ignores it for semantic ranking. It looks at what displays in the browser after hydration, end of story.
This is a welcome clarification because it removes uncertainty that led some to fiddle with suboptimal solutions — like loading state via a separate endpoint, which disrupts user experience and increases Time to Interactive.
What are the technical implications for a production site?
If you're working on a Next.js, Nuxt, or SvelteKit site with SSR, you can continue to use __NEXT_DATA__, __NUXT__, or the equivalent without worrying. These mechanisms are specifically designed for this: transferring server state to the client.
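If you want to see the mechanism at work, you can inspect the tag from the browser console. A hedged sketch for Next.js (pages router), where the __NEXT_DATA__ script tag is a framework convention whose exact contents should be treated as internal:

```typescript
// Read the state Next.js serializes into the page.
const tag = document.getElementById("__NEXT_DATA__");
if (tag?.textContent) {
  const nextData = JSON.parse(tag.textContent);
  console.log("Server-rendered route:", nextData.page);
  console.log("Serialized payload size (chars):", tag.textContent.length);
}
```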
However, that doesn’t mean anything goes. If your serialized JSON contains sensitive data (tokens, private user info), it remains visible in the HTML source. This is a security issue, not an SEO one, but it's good to keep in mind.
Another nuance: Google doesn't use the JSON for indexing, but the crawler still downloads it and may store it. If you're shipping 200 KB of JSON on every page, that's crawl budget consumed for nothing. Optimize the size of your payload even if it doesn't directly impact ranking.
- Google only indexes the rendered DOM, never the JSON serialized in script tags.
- Modern frameworks (Next, Nuxt, Gatsby, etc.) can continue to hydrate the state without fear of penalizing duplication.
- Watch out for sensitive data in the JSON: it remains exposed in the HTML source.
- An oversized JSON payload can impact crawl budget even if it doesn’t affect semantic indexing.
- This rule applies to <script type="application/json"> tags or similar, not to executable scripts that could modify the DOM.
SEO Expert opinion
Does this statement align with what we observe on the ground?
Honestly? Yes, and it’s consistent with what we've known about Google's rendering pipeline for years. Googlebot parses the HTML, builds the DOM, executes JS if necessary, and indexes the final result. Non-executable script tags (like type="application/json") are ignored.
What’s new is that Martin Splitt formalizes this clearly. Previously, we had to deduce this behavior from empirical tests and scattered bits of information. Now, we have an official position — and it changes everything for SPA/SSR projects that were still hesitating.
However, I have seen cases where developers serialized JSON into visible DOM elements (like a <div style="display:none"> with JSON inside). There, Google can indeed treat it as hidden content, and that could pose a problem. The key is that the JSON must live in a non-rendered context (a properly typed script tag), not in a content element of the page.
What limits should you keep in mind?
First, this rule only applies to content serialized in script tags. If you're duplicating your content elsewhere — like in oversized data-* attributes, within hidden HTML comments, or in hidden iframes — that's another story.
Next, be careful not to confuse "not a problem for indexing" with "no performance impact." A 300 KB inline JSON bloats the HTML, delays download and parsing, and can degrade Core Web Vitals. It's not penalized as duplicate content, but it can hurt you on other criteria.
Last point: Google says it doesn’t index JSON, but what about other engines? Do Bing, Yandex, Baidu follow the same logic? [To be verified] — we don't have an equivalent official statement from them. If you're optimizing for a multilingual market or alternative engines, keep a margin of caution.
Are there cases where this rule no longer holds?
Yes. If your serialized JSON contains structured content different from what displays in the DOM, it can create inconsistencies. For example, if your JSON lists 50 products but your DOM only displays 10, Google will index the 10, not the 50.
Another edge case: sites using JSON-LD for Schema.org markup. In this case, it’s intended — Google reads this JSON to extract structured data. But if you mix valid JSON-LD with serialized application JSON, make sure both remain in separate, correctly typed tags.
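A minimal sketch of that separation, with illustrative ids and values:

```typescript
// Structured data and application state in separate, correctly typed tags.
const jsonLd = {
  "@context": "https://schema.org",
  "@type": "Product",
  name: "Example product",
};
const appState = { productId: 42, inStock: true };

const headMarkup = `
  <!-- read by Google for rich results -->
  <script type="application/ld+json">${JSON.stringify(jsonLd)}</script>
  <!-- ignored for indexing, consumed by your hydration code -->
  <script type="application/json" id="__APP_STATE__">${JSON.stringify(appState)}</script>
`;
// (Real code should escape the payloads before inlining; see the
// encoding pitfall further down.)
```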
Practical impact and recommendations
What should you check on your site?
First step: open the source HTML (Ctrl+U or view source) of your SSR pages and locate the <script> tags containing your serialized state. Check that they have a type="application/json" attribute or equivalent — never type="text/javascript" if it’s just passive JSON.
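A quick way to run that check from the browser console (a generic sketch, nothing framework-specific):

```typescript
// List every inline script with its type so you can spot passive JSON
// served without the right attribute.
document.querySelectorAll("script:not([src])").forEach((s) => {
  const type = s.getAttribute("type") ?? "(none: treated as executable JS)";
  console.log(type, "→", (s.textContent ?? "").slice(0, 80));
});
```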
Next, use the URL Inspection tool in Search Console and compare the rendered HTML ("View crawled page", or "View tested page" after a live test) with your source HTML. If you see major differences between the two, your hydration is modifying the content, and that's a potential problem.
Also check the size of your JSON payload. If a page weighs 150 KB with 100 KB of serialized JSON, you have an architectural problem. Optimize by only serializing the data strictly necessary for hydration, not your entire Redux or Vuex state.
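A sketch of that filtering, with assumed store shapes:

```typescript
// Serialize only what hydration actually needs, not the whole store.
interface Store {
  products: Array<{ id: number; title: string; longDescription: string }>;
  ui: { theme: string };
  analyticsCache: Record<string, unknown>; // never needed on the client
}

function pickHydrationState(store: Store) {
  return {
    // IDs and flags, not full records: the DOM already carries the text
    productIds: store.products.map((p) => p.id),
    theme: store.ui.theme,
  };
}
```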
What mistakes should you absolutely avoid?
Never serialize your JSON in a visible DOM element (even hidden with CSS). This includes <div style="display:none">, misused <template>, or oversized data-* attributes. Google can interpret this as cloaking or hidden content, with the penalties that come with it.
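Side by side, the anti-pattern and the safe pattern (illustrative markup shown as strings):

```typescript
// BAD: JSON inside a real DOM element hidden with CSS. Google parses it
// as page content that users can't see.
const bad = `<div style="display:none" id="state">{"title":"Example product"}</div>`;

// GOOD: a non-executable script tag. Present in the source, absent from
// the rendered content tree.
const good = `<script type="application/json" id="state">{"title":"Example product"}</script>`;
```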
Also avoid serializing redundant data. If your JSON contains the exact same text as your DOM, word for word, you're wasting bandwidth and crawl budget. The idea is to serialize the application state (IDs, flags, small props), not to re-duplicate all the textual content.
Last classic pitfall: badly encoded inline scripts. If your JSON contains special characters (quotes, angle brackets, slashes) and you inject it incorrectly without escaping, you risk breaking the HTML or opening XSS vulnerabilities. Use your framework’s escaping functions (serialize-javascript, JSON.stringify + escape, etc.).
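If you roll your own, here is a minimal escaping sketch (the character list follows common practice; adapt it to your stack):

```typescript
// Neutralize the characters that can break out of an inline script tag
// before embedding JSON in HTML. A "</script>" inside a string value is
// the classic breakage; escaped as \u003c/script>, it stays inert.
function safeSerialize(state: unknown): string {
  return JSON.stringify(state)
    .replace(/</g, "\\u003c")
    .replace(/>/g, "\\u003e")
    .replace(/&/g, "\\u0026");
}

// Usage in a template: <script type="application/json">${safeSerialize(state)}</script>
```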
How can you ensure everything is working correctly?
Run an audit with Screaming Frog or Sitebulb with JavaScript mode enabled. Check that the indexable content (titles, text, structured data) is identical between the source HTML and the rendered DOM. If you see differences, dig deeper: it’s either a hydration issue or an SSR bug.
Also use Lighthouse or WebPageTest to measure the impact of serialized JSON on performance. If your Time to Interactive explodes due to a massive JSON payload, optimization is needed — either by lazy-loading certain data or moving state server-side (sessions, cookies).
Finally, test with varied User-Agents. Google says it ignores JSON, but what about Googlebot-Mobile vs Desktop? Third-party bots? A quick test with curl -A "Googlebot" will show you exactly what the crawler receives.
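The same check in code, for those who prefer scripting it. A sketch (Node 18+, global fetch, illustrative URL); note that it shows what your server serves to that user agent, not what Google renders afterwards:

```typescript
// Request a page with a Googlebot user agent and dump the raw HTML the
// crawler would receive. The UA string is Googlebot's documented desktop
// token.
const res = await fetch("https://example.com/some-ssr-page", {
  headers: {
    "User-Agent":
      "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  },
});
console.log(res.status);
console.log((await res.text()).slice(0, 500)); // first 500 chars of the HTML
```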
- Ensure your serialized JSON sits in a correctly typed <script type="application/json"> tag (or equivalent), never in a rendered DOM element.