
Official statement

Google caches resources fetched from APIs in the same way as other resources if they use the GET method. However, POST requests are not cached.
🎥 Source video

Extracted from a Google Search Central video

⏱ 46:02 💬 EN 📅 25/11/2020 ✂ 29 statements
Watch on YouTube (31:36) →
📅 Official statement from 25/11/2020 (5 years ago)
TL;DR

Google caches responses from APIs accessed via GET just like any static resource—images, CSS, JavaScript. However, POST requests are systematically excluded from caching. For SEO, this means that critical content loaded dynamically via GET can become outdated if cache headers are not properly managed. The GET/POST distinction is significant: it directly affects what Googlebot sees and indexes.

What you need to understand

Why is the distinction between GET and POST crucial for crawling?

Googlebot treats GET requests as requests for stable content retrieval. This follows directly from HTTP method semantics: GET is a safe, idempotent read, while POST is a write or an action. When a bot crawls a URL, it expects a GET request to return the same content for the same parameters.

Caching relies on this assumption. If an API responds via GET, Google applies the same rules as for an image or a script: it stores the response based on HTTP headers (Cache-Control, Expires, ETag). A POST request may, by nature, modify server-side state, so it cannot be cached without risking inconsistency.
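The rule described above can be sketched as a minimal decision function. This is a hypothetical helper for illustration, not Google's actual implementation: only GET responses are candidates for caching, and only when their Cache-Control directives allow it.

```python
# Hypothetical sketch of the caching rule described above:
# only GET responses with permissive Cache-Control headers are cacheable.
def is_cacheable(method, cache_control):
    if method.upper() != "GET":
        return False  # POST (and other write methods) are never cached
    directives = {d.strip().lower() for d in cache_control.split(",")}
    # Explicit opt-out directives override everything else
    if "no-store" in directives or "no-cache" in directives:
        return False
    return True

print(is_cacheable("GET", "max-age=300"))   # True
print(is_cacheable("POST", "max-age=300"))  # False
print(is_cacheable("GET", "no-store"))      # False
```

Real caches apply many more rules (Vary, status codes, heuristic freshness), but the GET/POST asymmetry is the part that matters for the statement above.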

What practically changes for a site loading content via API?

An e-commerce site displaying prices or stock via a GET API might find Googlebot reading outdated data if the cache headers are too generous. The bot reads what it has in memory, not the current state of the catalog.

Conversely, a site that uses POST to retrieve critical content—an absurd practice but seen in the field—ensures that Google will never cache this resource. The result: the content may simply not be indexed if the bot cannot retrieve it reliably.

Do HTTP headers become a strategic issue?

Absolutely. Google honors Cache-Control directives such as no-cache and must-revalidate, as well as expiration times. If an API responds with max-age=86400 (24 h), the bot may not refetch the resource before that deadline, even if the content has changed in the meantime.

The difference between server cache and bot cache is subtle but critical. The former optimizes load, while the latter determines what Google indexes. A misconfigured header can freeze an outdated version in the index for days.
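The max-age arithmetic above is simple but worth making explicit. The following sketch (an illustration, not a Googlebot internal) computes how long a cached GET response stays fresh given its Cache-Control header and the cached copy's current age:

```python
# Sketch: how long a cached GET response remains "fresh" under max-age.
# With max-age=86400, a cached copy can be served for up to a full day.
def seconds_until_stale(cache_control, age_seconds=0):
    for directive in cache_control.split(","):
        directive = directive.strip().lower()
        if directive.startswith("max-age="):
            max_age = int(directive.split("=", 1)[1])
            return max(0, max_age - age_seconds)
    return 0  # no max-age: treated as immediately stale in this sketch

print(seconds_until_stale("public, max-age=86400"))                     # 86400
print(seconds_until_stale("public, max-age=86400", age_seconds=86400))  # 0
```

Any content change during that freshness window is invisible to a client that trusts the cached copy, which is exactly the indexing risk described above.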

  • GET = automatic caching if the HTTP headers allow it
  • POST = never cached, even with favorable headers
  • The Cache-Control headers define the lifespan of a GET response in Google's cache
  • Critical content loaded via GET with a max-age that is too long can become outdated in the index
  • Using POST for public content meant for indexing is a major technical error

SEO Expert opinion

Is this statement consistent with field observations?

Yes, and it has been confirmed for years by server logs. When analyzing Googlebot requests, it is clear that the bot does not re-request a GET resource as long as its cached copy is valid. Crawl budget audits show that it systematically skips resources it still considers fresh.

The trap? Many developers configure public APIs with default max-age inherited from frameworks (often 3600s or more). The result: Googlebot reads outdated data without anyone noticing. This is invisible during human browsing, but critical for indexing.

What unclear areas remain in this statement?

Martin Splitt does not specify how Google handles dynamic parameters in GET URLs. An API called with ?timestamp=xxx technically generates a unique URL for each request—thus no cache. But what about sorting parameters, pagination, or filters? [To verify]
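One mechanical consequence is easy to demonstrate: if the cache key includes the full query string (a common assumption for HTTP caches, though Google's exact keying is not documented here), then a `?timestamp=xxx` parameter produces a new cache entry on every request and defeats caching entirely:

```python
# Sketch: if each distinct query string is a distinct cache key,
# a ?timestamp= parameter defeats caching entirely.
from urllib.parse import urlsplit

def cache_key(url):
    parts = urlsplit(url)
    # Hypothetical key: scheme + host + path + raw query string
    return f"{parts.scheme}://{parts.netloc}{parts.path}?{parts.query}"

a = cache_key("https://api.example.com/stock?sku=42&timestamp=1700000000")
b = cache_key("https://api.example.com/stock?sku=42&timestamp=1700000001")
print(a == b)  # False: two cache entries for what is logically one resource
```

The domain and endpoint here are invented for illustration. Whether Google normalizes or collapses sorting, pagination, or filter parameters remains the open question flagged above.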

Another unclear point: the behavior with empty responses or 204 No Content. If a GET API returns a 204, does Google cache this absence of content? On headless sites, this is a common pattern to signal that data no longer exists. There is no communication from Google on this case.

In what scenarios does this rule pose a problem?

On ultra-volatile content sites: real-time media, trading platforms, dynamic inventories. A breaking news article loaded via GET API with a 5-minute cache can be outdated in the index during that window.

Even worse: sites doing A/B testing via API. If the GET API returns variant A and then variant B based on a cookie, but Google caches the first response, it will never see the other versions. The risk is unintentional cloaking.

Warning: Critical content for SEO should NEVER rely on a long-cached GET API. Prefer server-side rendering or a very short cache (max 60s) for indexable data.

Practical impact and recommendations

What should be prioritized in an audit of a site using APIs?

First, map out all GET APIs that serve content visible to the user. Not just the obvious endpoints, but also internal microservices, data CDNs, and third-party APIs (customer reviews, inventory, pricing).

Second, check the actual HTTP headers returned by each endpoint: not those in the documentation, but what you observe with curl or in DevTools. A missing Cache-Control header often means an implicit, heuristic cache of several hours, depending on the server.

How to set up cache headers for indexing?

For content meant to be indexed, the sweet spot is between no-cache (too aggressive, kills crawl budget) and max-age=3600 (too long, risk of obsolescence). A max-age of 60 to 300 seconds is a good compromise for most sites.

Always add must-revalidate or stale-while-revalidate to force Googlebot to re-check freshness. And if the content changes rarely, using an ETag is better—the bot makes a conditional request and conserves bandwidth if nothing has changed.
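The ETag flow recommended above is worth seeing end to end. This is a minimal server-side sketch (invented endpoint logic, not a specific framework's API): the server hashes the body into an ETag, and answers 304 without a body when the client's If-None-Match matches.

```python
# Sketch of the ETag revalidation flow: the bot sends If-None-Match,
# and the server answers 304 Not Modified when nothing has changed.
import hashlib

def handle_get(body, if_none_match=None):
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"ETag": etag, "Cache-Control": "max-age=120, must-revalidate"}
    if if_none_match == etag:
        return 304, headers, b""  # unchanged: no body re-downloaded
    return 200, headers, body     # changed (or first fetch): full body

status, headers, _ = handle_get(b'{"stock": 12}')
status2, _, _ = handle_get(b'{"stock": 12}', headers["ETag"])
print(status, status2)  # 200 304
```

The bot keeps its cached copy, the server spends almost no bandwidth, and freshness is still verified on every crawl: the compromise this section argues for.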

What technical errors threaten headless implementations?

The classic mistake: loading critical content via POST "for security reasons". This is a misunderstanding of the role of POST. If the data is public and needs to be indexed, it must travel via GET, period.

Another frequent fault: not testing Googlebot rendering with the same cache parameters as production. The URL testing tool in Search Console does not always accurately simulate actual cache behavior. It is necessary to cross-reference with server logs to see what the bot actually retrieves.

  • List all GET APIs serving indexable content
  • Check Cache-Control, Expires, ETag headers on each endpoint
  • Set a max-age between 60 and 300 seconds for dynamic content
  • Add must-revalidate or stale-while-revalidate to force revalidation
  • NEVER use POST to retrieve content intended for indexing
  • Test actual rendering with Google Search Console + analyze server logs
Caching of GET APIs by Google is not a technical detail—it is a cornerstone of modern indexing. A poorly configured headless or JAMstack site can see its content frozen in the index for hours or even days. Fine-tuning HTTP headers becomes a prerequisite for SEO, just like crawl budget or internal linking. These optimizations often require close collaboration between SEO and development teams; if this expertise is lacking internally, support from an SEO agency specialized in modern architectures can prevent costly mistakes and accelerate compliance.

❓ Frequently Asked Questions

Is a GET API without a Cache-Control header still cached by Google?
Yes. In the absence of an explicit directive, Google applies a default cache whose duration depends on the HTTP status code returned (generally a few hours for a 200 OK). It is better to specify an explicit max-age to control this behavior.
If I change the content of a GET API, how long before Google indexes the new version?
It depends on the max-age set in the headers. If you set max-age=3600, Googlebot may wait up to 1 hour before rechecking the resource. A short max-age (60-300 s) reduces this delay.
Do URL parameters in a GET request break Google's cache?
Each unique combination of parameters creates a distinct cache entry. ?page=1 and ?page=2 are two different resources for Google, hence two separate caches. Beware of URL proliferation.
Can you force Google to ignore the cache of a GET API?
Yes, by returning Cache-Control: no-cache or no-store. But this can hurt crawl budget: the bot will have to re-download the resource on every visit. Use sparingly.
Is content loaded via POST completely invisible to Google?
Not always. If the content appears in the DOM after the POST request completes and JavaScript is executed, Google can see it at rendering time. But it will never crawl the POST endpoint directly, which limits discovery and indexing.
