Official statement
Other statements from this video (19) · Google Search Central, published on 05/03/2021
- 27:21 Why do your Core Web Vitals take 28 days to update in Search Console?
- 36:39 Do you really need to test your Core Web Vitals in the lab to avoid regressions?
- 98:33 Do CSS animations really hurt your Core Web Vitals?
- 121:49 Will the Core Web Vitals change again, and how can you anticipate the next updates?
- 146:15 Are city-specific pages really all doorway pages condemned by Google?
- 185:36 Does crawl budget really depend on your server's speed?
- 203:58 Do you really need to start small to unlock your crawl budget?
- 228:24 Do you really need to regenerate your sitemaps to remove obsolete URLs?
- 259:19 Why does Google refuse to provide Voice Search data in Search Console?
- 295:52 How can you force Google to refresh your JavaScript and CSS files during rendering?
- 317:32 How do you map URLs and check redirects during a migration so you don't lose rankings?
- 353:48 Do you really need to fill in dates in structured data?
- 390:26 Do you really need to change an article's date with every update?
- 432:21 Do you really need to limit the number of H1 tags on a page?
- 450:30 Are headings really as important as Google thinks?
- 555:58 Are LSI keywords really useful for Google SEO?
- 585:16 How many links per page do you need to optimize internal PageRank?
- 674:32 Do JSON requests really eat into your crawl budget?
- 789:13 Can Google guess that a URL is a duplicate without even crawling it?
Blocking JSON files via robots.txt prevents Google from indexing content that relies on those files after rendering. This rule applies both to your own site and to third-party sites using your public APIs. In practice, if your visible content depends on data that JavaScript loads from JSON files, blocking those resources makes your pages invisible to Google.
What you need to understand
Why does blocking JSON cause indexing problems?

Google operates in two stages: an initial crawl, then JavaScript rendering. After Googlebot retrieves your raw HTML, it launches a rendering process to execute the JS and load dynamic resources.

If your JSON files are blocked in robots.txt, the bot can download your HTML but cannot retrieve the data needed for the final render. The result: it indexes an empty or incomplete page, even if everything works visually on the user side.

How does this rule impact sites using modern frameworks?

Applications built with React, Vue, or Angular often load their content via JSON API calls. If you block /api/*.json, for example, Google will never see the content generated after hydration.

This is particularly critical for e-commerce sites where product listings, prices, and availability are loaded dynamically. Without access to the JSON, Google indexes product pages without descriptions or prices, which makes them essentially invisible in the results.

Are third-party sites using your APIs affected as well?

Yes, and it's less intuitive. If you provide a public API that is consumed by other sites, blocking your JSON endpoints prevents the indexing of content displayed on those third-party sites.

Imagine a review aggregator using your API: if you block /reviews.json, the aggregated content will not be indexable by Google, even though it's not on your own site. You indirectly penalize your partners.
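To illustrate the mechanism, here is a minimal client-side sketch (the /api/products.json endpoint, the Product shape, and the product-list element are hypothetical, not taken from any real site): the initial HTML is nearly empty, and everything visible only appears once the JSON can be fetched. If robots.txt disallows that path, Google's renderer never gets the data and indexes the empty shell.

```typescript
// Hypothetical single-page-app entry point: the initial HTML only contains
// <ul id="product-list"></ul>; everything visible comes from this fetch.
interface Product {
  name: string;
  price: number;
}

async function renderProducts(): Promise<void> {
  // If robots.txt contains "Disallow: /api/*.json", Google's renderer is not
  // allowed to make this request, and the indexed page stays empty.
  const response = await fetch("/api/products.json");
  const products: Product[] = await response.json();

  const list = document.getElementById("product-list");
  if (!list) return;

  for (const product of products) {
    const item = document.createElement("li");
    item.textContent = `${product.name}: ${product.price} €`;
    list.appendChild(item);
  }
}

renderProducts();
```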
SEO Expert opinion
Does this statement truly reflect observed behavior in the field?

Yes, absolutely. Technical audits regularly show sites whose misconfigured robots.txt blocks /wp-json/, /api/, or /*.json out of excessive caution.

The problem is that many developers believe they are "protecting" their data by blocking these endpoints, without realizing they are sabotaging their own indexing. I've seen Shopify stores lose 40% of their organic traffic after mistakenly blocking their collection JSONs.

Are there cases where blocking JSON remains legitimate?

Of course. If your JSON contains sensitive data (user info, B2B pricing, internal stock), it should be blocked, but then do not use it to display indexable public content.

The distinction is simple: JSON used for client-side rendering of visible content should never be blocked; purely backend or admin JSON is up to you. [To verify]: Google has never specified whether authentication mechanisms (tokens, headers) are sufficient to get around this issue without blocking in robots.txt.

What is the acceptable margin of error in this configuration?

None. Unlike other SEO signals where you can compensate (weak backlinks but excellent content), blocking a critical JSON is equivalent to making your page invisible. It's binary.

Always test your robots.txt modifications with Search Console > URL Inspection > Test live URL. If the rendered output is empty while your page works normally, you have blocked an essential resource.
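As a complement to the live test, a rough sketch of how a Disallow pattern such as /api/*.json is matched against a URL path can help you check which JSON endpoints a rule actually catches. This is a simplification (it ignores user-agent groups and Allow/Disallow precedence), not Google's actual parser:

```typescript
// Simplified robots.txt path matching: "*" matches any sequence of characters,
// "$" anchors the end of the URL path. Treat this as a rough check only.
function isBlockedBy(disallowPattern: string, urlPath: string): boolean {
  const escaped = disallowPattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  const anchored = escaped.endsWith("\\$")
    ? `^${escaped.slice(0, -2)}$`
    : `^${escaped}`;
  return new RegExp(anchored).test(urlPath);
}

console.log(isBlockedBy("/api/*.json", "/api/products.json")); // true
console.log(isBlockedBy("/api/*.json", "/api/products"));      // false
console.log(isBlockedBy("/wp-json/", "/wp-json/wp/v2/posts")); // true
```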
Practical impact and recommendations
How can you quickly audit your current robots.txt rules?
Download your robots.txt and look for every line containing .json, /api/, /data/, or /content/. For each Disallow rule you find, ask yourself: "Does this file serve content that is visible to users?"

Then use the robots.txt testing tool in Search Console. Paste a JSON URL that you suspect is blocked and check whether Googlebot can access it. If it is blocked while that JSON loads your product listings, you have found your culprit.
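As an illustration of this audit step, here is a minimal Node/TypeScript sketch (assuming Node 18+ for the global fetch; example.com and the keyword list are placeholders to adapt to your site): it downloads robots.txt and flags Disallow lines that touch JSON-like paths.

```typescript
// Minimal robots.txt audit: download the file and flag Disallow rules that
// look like they could block JSON resources used for rendering.
const SUSPICIOUS_PATTERNS = [".json", "/api/", "/data/", "/content/"];

async function auditRobotsTxt(origin: string): Promise<void> {
  const response = await fetch(`${origin}/robots.txt`);
  if (!response.ok) {
    throw new Error(`Could not fetch robots.txt: HTTP ${response.status}`);
  }
  const lines = (await response.text()).split("\n");

  for (const rawLine of lines) {
    const line = rawLine.trim();
    if (!line.toLowerCase().startsWith("disallow:")) continue;

    const path = line.slice("disallow:".length).trim();
    if (SUSPICIOUS_PATTERNS.some((pattern) => path.includes(pattern))) {
      console.warn(`Potentially harmful rule: ${line}`);
    }
  }
}

auditRobotsTxt("https://example.com").catch(console.error);
```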
What should you do if you discover a blocked JSON that is critical for indexing?

Immediately remove the corresponding Disallow rule from robots.txt. Then force a quick reindexing via Search Console by requesting inspection of the affected pages.

Monitor your server logs over the following days: you should see Googlebot crawling the previously blocked JSONs. If nothing happens within 72 hours, the rule may not have been the only cause (also check HTTP headers, X-Robots-Tag, etc.).

What strategy should you adopt to secure your APIs without blocking indexing?

For public data (product listings, articles, reviews), keep the JSONs accessible without restriction. For sensitive data, consider token authentication or serving these JSONs from a non-public subdomain.

You can also implement server-side rendering (SSR) or static site generation (SSG) so that critical content is present in the initial HTML, without relying on JavaScript rendering. Less elegant technically, but much more robust from an SEO perspective.
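One possible way to apply this split, sketched here with Express (the routes, the token check, and the API_TOKEN variable are illustrative assumptions, not a prescribed setup): JSON that renders public content stays crawlable, while sensitive JSON is protected by a token rather than by a robots.txt rule.

```typescript
import express from "express";

const app = express();

// Public JSON used to render visible content: leave it crawlable and do NOT
// disallow it in robots.txt, so Google's renderer can fetch it.
app.get("/api/products.json", (_req, res) => {
  res.json([{ name: "Example product", price: 19.9 }]);
});

// Sensitive JSON (B2B pricing, stock levels, user data): protect it with a
// token instead of a robots.txt rule, and never use it for indexable content.
app.get("/internal/pricing.json", (req, res) => {
  if (req.headers.authorization !== `Bearer ${process.env.API_TOKEN}`) {
    res.status(401).json({ error: "Unauthorized" });
    return;
  }
  res.json({ wholesalePrice: 12.5 });
});

app.listen(3000);
```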
❓ Frequently Asked Questions
Does blocking a JSON file in robots.txt affect only Googlebot, or other search engines as well?
Can you block JSON files partially, for example only for certain crawlers?
If my JSON is accessible but returns a 401 or 403, is that equivalent to a robots.txt block?
Are JSON files loaded via fetch() on the client side covered by this rule?
How can I tell whether my pages are indexed with or without the JSON content?