
Official statement

Google caches resources fetched from APIs in the same way as other resources if they use the GET method. However, POST requests are not cached.
🎥 Source video

Extracted from a Google Search Central video

⏱ 46:02 💬 EN 📅 25/11/2020 ✂ 29 statements
Watch on YouTube (31:36) →
📅 Official statement from 25/11/2020 (5 years ago)
TL;DR

Google caches responses from APIs accessed via GET just like any static resource—images, CSS, JavaScript. However, POST requests are systematically excluded from caching. For SEO, this means that critical content loaded dynamically via GET can become outdated if cache headers are not properly managed. The GET/POST distinction is significant: it directly affects what Googlebot sees and indexes.

What you need to understand

Why is the distinction between GET and POST crucial for crawling?

Googlebot treats GET requests as requests for stable content retrieval. This follows directly from HTTP method semantics: GET is a safe, idempotent read, while POST is a write or an action. When a bot crawls a URL, it expects a GET request to return the same content for the same parameters.

Caching relies on this assumption. If an API responds via GET, Google applies the same rules as for an image or a script: it stores the response based on HTTP headers (Cache-Control, Expires, ETag). A POST request may, by nature, modify server-side state, so it cannot be cached without risking inconsistency.
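The rule described above can be sketched as a minimal decision function. This is a hypothetical helper for illustration, not Google's actual implementation: only GET responses are candidates for caching, and only when their Cache-Control directives allow it.

```python
# Hypothetical sketch of the caching rule described above:
# only GET responses with permissive Cache-Control headers are cacheable.
def is_cacheable(method, cache_control):
    if method.upper() != "GET":
        return False  # POST (and other write methods) are never cached
    directives = {d.strip().lower() for d in cache_control.split(",")}
    # Explicit opt-out directives override everything else
    if "no-store" in directives or "no-cache" in directives:
        return False
    return True

print(is_cacheable("GET", "max-age=300"))   # True
print(is_cacheable("POST", "max-age=300"))  # False
print(is_cacheable("GET", "no-store"))      # False
```

Real caches apply many more rules (Vary, status codes, heuristic freshness), but the GET/POST asymmetry is the part that matters for the statement above.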

What practically changes for a site loading content via API?

An e-commerce site displaying prices or stock via a GET API might find Googlebot reading outdated data if the cache headers are too generous. The bot reads what it has in memory, not the current state of the catalog.

Conversely, a site that uses POST to retrieve critical content—an absurd practice but seen in the field—ensures that Google will never cache this resource. The result: the content may simply not be indexed if the bot cannot retrieve it reliably.

Do HTTP headers become a strategic issue?

Absolutely. Google honors Cache-Control directives such as no-cache and must-revalidate, as well as expiration times. If an API responds with max-age=86400 (24 h), the bot may not refetch the resource before that deadline, even if the content has changed in the meantime.

The difference between server cache and bot cache is subtle but critical. The former optimizes load, while the latter determines what Google indexes. A misconfigured header can freeze an outdated version in the index for days.
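The max-age arithmetic above is simple but worth making explicit. The following sketch (an illustration, not a Googlebot internal) computes how long a cached GET response stays fresh given its Cache-Control header and the cached copy's current age:

```python
# Sketch: how long a cached GET response remains "fresh" under max-age.
# With max-age=86400, a cached copy can be served for up to a full day.
def seconds_until_stale(cache_control, age_seconds=0):
    for directive in cache_control.split(","):
        directive = directive.strip().lower()
        if directive.startswith("max-age="):
            max_age = int(directive.split("=", 1)[1])
            return max(0, max_age - age_seconds)
    return 0  # no max-age: treated as immediately stale in this sketch

print(seconds_until_stale("public, max-age=86400"))                     # 86400
print(seconds_until_stale("public, max-age=86400", age_seconds=86400))  # 0
```

Any content change during that freshness window is invisible to a client that trusts the cached copy, which is exactly the indexing risk described above.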

  • GET = automatic caching if the HTTP headers allow it
  • POST = never cached, even with favorable headers
  • The Cache-Control headers define the lifespan of a GET response in Google's cache
  • Critical content loaded via GET with a max-age that is too long can become outdated in the index
  • Using POST for public content meant for indexing is a major technical error

SEO Expert opinion

Is this statement consistent with field observations?

Yes, and it has been confirmed for years by server logs. When analyzing Googlebot requests, it is clear that the bot does not re-request a GET resource as long as its cached copy is valid. Crawl budget audits show that it systematically skips resources it still considers fresh.

The trap? Many developers configure public APIs with default max-age inherited from frameworks (often 3600s or more). The result: Googlebot reads outdated data without anyone noticing. This is invisible during human browsing, but critical for indexing.

What unclear areas remain in this statement?

Martin Splitt does not specify how Google handles dynamic parameters in GET URLs. An API called with ?timestamp=xxx technically generates a unique URL for each request—thus no cache. But what about sorting parameters, pagination, or filters? [To verify]
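One mechanical consequence is easy to demonstrate: if the cache key includes the full query string (a common assumption for HTTP caches, though Google's exact keying is not documented here), then a `?timestamp=xxx` parameter produces a new cache entry on every request and defeats caching entirely:

```python
# Sketch: if each distinct query string is a distinct cache key,
# a ?timestamp= parameter defeats caching entirely.
from urllib.parse import urlsplit

def cache_key(url):
    parts = urlsplit(url)
    # Hypothetical key: scheme + host + path + raw query string
    return f"{parts.scheme}://{parts.netloc}{parts.path}?{parts.query}"

a = cache_key("https://api.example.com/stock?sku=42&timestamp=1700000000")
b = cache_key("https://api.example.com/stock?sku=42&timestamp=1700000001")
print(a == b)  # False: two cache entries for what is logically one resource
```

The domain and endpoint here are invented for illustration. Whether Google normalizes or collapses sorting, pagination, or filter parameters remains the open question flagged above.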

Another unclear point: the behavior with empty responses or 204 No Content. If a GET API returns a 204, does Google cache this absence of content? On headless sites, this is a common pattern to signal that data no longer exists. There is no communication from Google on this case.

In what scenarios does this rule pose a problem?

On ultra-volatile content sites: real-time media, trading platforms, dynamic inventories. A breaking news article loaded via GET API with a 5-minute cache can be outdated in the index during that window.

Even worse: sites doing A/B testing via API. If the GET API returns variant A and then variant B based on a cookie, but Google caches the first response, it will never see the other versions. The risk is unintentional cloaking.

Warning: Critical content for SEO should NEVER rely on a long-cached GET API. Prefer server-side rendering or a very short cache (max 60s) for indexable data.

Practical impact and recommendations

What should be prioritized in an audit of a site using APIs?

First, map out all GET APIs that serve content visible to the user. Not just the obvious endpoints, but also internal microservices, data CDNs, and third-party APIs (customer reviews, inventory, pricing).

Second, check the actual HTTP headers returned by each endpoint: not those in the documentation, but what you observe with curl or in DevTools. A missing Cache-Control header often means an implicit, heuristic cache of several hours, depending on the server.

How to set up cache headers for indexing?

For content meant to be indexed, the sweet spot is between no-cache (too aggressive, kills crawl budget) and max-age=3600 (too long, risk of obsolescence). A max-age of 60 to 300 seconds is a good compromise for most sites.

Always add must-revalidate or stale-while-revalidate to force Googlebot to re-check freshness. And if the content changes rarely, using an ETag is better—the bot makes a conditional request and conserves bandwidth if nothing has changed.
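The ETag flow recommended above is worth seeing end to end. This is a minimal server-side sketch (invented endpoint logic, not a specific framework's API): the server hashes the body into an ETag, and answers 304 without a body when the client's If-None-Match matches.

```python
# Sketch of the ETag revalidation flow: the bot sends If-None-Match,
# and the server answers 304 Not Modified when nothing has changed.
import hashlib

def handle_get(body, if_none_match=None):
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"ETag": etag, "Cache-Control": "max-age=120, must-revalidate"}
    if if_none_match == etag:
        return 304, headers, b""  # unchanged: no body re-downloaded
    return 200, headers, body     # changed (or first fetch): full body

status, headers, _ = handle_get(b'{"stock": 12}')
status2, _, _ = handle_get(b'{"stock": 12}', headers["ETag"])
print(status, status2)  # 200 304
```

The bot keeps its cached copy, the server spends almost no bandwidth, and freshness is still verified on every crawl: the compromise this section argues for.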

What technical errors threaten headless implementations?

The classic mistake: loading critical content via POST "for security reasons". This is a misunderstanding of the role of POST. If the data is public and needs to be indexed, it must travel via GET, period.

Another frequent fault: not testing Googlebot rendering with the same cache parameters as production. The URL testing tool in Search Console does not always accurately simulate actual cache behavior. It is necessary to cross-reference with server logs to see what the bot actually retrieves.

  • List all GET APIs serving indexable content
  • Check Cache-Control, Expires, ETag headers on each endpoint
  • Set a max-age between 60 and 300 seconds for dynamic content
  • Add must-revalidate or stale-while-revalidate to force revalidation
  • NEVER use POST to retrieve content intended for indexing
  • Test actual rendering with Google Search Console + analyze server logs
Caching of GET APIs by Google is not a technical detail—it is a cornerstone of modern indexing. A poorly configured headless or JAMstack site can see its content frozen in the index for hours or even days. Fine-tuning HTTP headers becomes a prerequisite for SEO, just like crawl budget or internal linking. These optimizations often require close collaboration between SEO and development teams; if this expertise is lacking internally, support from an SEO agency specialized in modern architectures can prevent costly mistakes and accelerate compliance.

❓ Frequently Asked Questions

Is a GET API without a Cache-Control header still cached by Google?
Yes. In the absence of an explicit directive, Google applies a default cache whose duration depends on the HTTP status code returned (generally a few hours for a 200 OK). It is better to specify an explicit max-age to control this behavior.
If I change the content of a GET API, how long before Google indexes the new version?
It depends on the max-age set in the headers. If you set max-age=3600, Googlebot may wait up to 1 hour before rechecking the resource. A short max-age (60-300 s) reduces this delay.
Do URL parameters in a GET request break Google's cache?
Each unique combination of parameters creates a distinct cache entry. ?page=1 and ?page=2 are two different resources for Google, hence two separate caches. Beware of URL proliferation.
Can you force Google to ignore the cache of a GET API?
Yes, by returning Cache-Control: no-cache or no-store. But this can hurt crawl budget: the bot will have to re-download the resource on every visit. Use sparingly.
Is content loaded via POST completely invisible to Google?
Not always. If the content appears in the DOM after the POST request completes and JavaScript is executed, Google can see it at rendering time. But it will never crawl the POST endpoint directly, which limits discovery and indexing.
