
Official statement

Googlebot attempts to crawl URLs with a clean slate and doesn’t retain any HSTS list. It accesses HTTP URLs directly without applying the browser's HSTS rules.
🎥 Source video

Extracted from a Google Search Central video

⏱ 1:39 💬 EN 📅 28/10/2020 ✂ 5 statements
Watch on YouTube (0:34) →
Other statements from this video (4)
  1. 0:03 Does Googlebot really ignore 307 HSTS redirects, or is there a catch?
  2. 0:34 Are 307 HSTS redirects really invisible to SEO?
  3. 1:05 Does Googlebot really follow HTTP-to-HTTPS redirects like a regular browser?
  4. 1:05 Can 307 HSTS redirects hurt your site's rankings?
TL;DR

Googlebot crawls HTTP URLs without considering HSTS preload lists maintained by browsers. Specifically, it tries to access URLs directly over HTTP even if your site enforces HTTPS via HSTS. For SEO, this means that your 301 HTTP to HTTPS redirects must remain in place and effective — you can't rely on HSTS to guide the bot.

What you need to understand

What is HSTS and why do browsers use it?

HTTP Strict Transport Security (HSTS) is a security mechanism that forces browsers to always access a site via HTTPS. Once a browser visits a site with HSTS enabled, it remembers this rule and refuses any subsequent HTTP connections — even if the user types http:// in the address bar.

Browsers also maintain preloaded HSTS lists that contain thousands of domains that have requested to be forced to HTTPS on first visit. Chrome, Firefox, and Safari embed these lists directly. As a result, for these domains, the browser never sends an initial HTTP request.
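The HSTS rule a browser remembers is carried in a response header. As a minimal sketch of the mechanism Googlebot ignores, here is a parser for that header's directives (the function name is illustrative):

```python
def parse_hsts(header: str) -> dict:
    """Parse a Strict-Transport-Security header value into its directives,
    e.g. max-age, includeSubDomains, preload."""
    directives = {}
    for part in header.split(";"):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            key, _, value = part.partition("=")
            directives[key.strip().lower()] = value.strip()
        else:
            directives[part.lower()] = True  # valueless directive
    return directives

hsts = parse_hsts("max-age=31536000; includeSubDomains; preload")
print(hsts)
# → {'max-age': '31536000', 'includesubdomains': True, 'preload': True}
```

A browser receiving this would refuse plain HTTP for the host for a year (and submit it for preloading); Googlebot, by contrast, discards this information entirely between crawls.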

How does Googlebot behave regarding HSTS?

Googlebot operates differently. It retains no HSTS list between crawls. Each URL is crawled with a clean slate, as if the bot has never visited the site before.

Concretely: if Googlebot finds an http:// URL in its index, in an external link, or in your XML sitemap, it will try to crawl it over HTTP. Unlike modern browsers that apply HSTS, it will not convert the URL to https:// before making the request.

Why did Google make this technical choice?

The reason relates to crawl neutrality. Googlebot must be able to discover and index the web as it actually exists, with its redirects, configuration errors, and accessible HTTP URLs. If the bot applied HSTS, it would obscure significant configuration issues.

For instance, a site that forgot to set up its 301 redirects but relied on HSTS to enforce HTTPS would go unnoticed. Users with compatible browsers would access HTTPS just fine, but Googlebot — and older browsers — would encounter HTTP. Googlebot therefore tests the raw reality of the server, not a version optimized by the client.

  • Googlebot does not store any HSTS rules between its crawl sessions
  • HTTP URLs in links, sitemaps, or indexes are crawled over HTTP
  • Your 301 HTTP → HTTPS redirects remain essential for SEO
  • HSTS protects end users but does not influence bot behavior
  • This approach reveals server configuration errors that HSTS would mask on the client side

SEO Expert opinion

Is this statement consistent with field observations?

Yes, and it can be empirically verified. Server logs regularly show Googlebot requests over HTTP on domains that are on the HSTS preload list. If you analyze your access logs, you will find hits from Googlebot on port 80 even after years of full HTTPS migration.

This reality contrasts with the behavior of monitoring crawlers (Pingdom, GTmetrix), which often do apply HSTS. Googlebot deliberately maintains a "first principles" approach — it wants to know how the server actually responds to a raw HTTP request, without presuppositions.

What are the implications for incomplete HTTPS migrations?

This is where the issue arises. Many sites that migrated to HTTPS neglected their HTTP redirects, counting on HSTS or the assumption that "nobody uses HTTP anymore." False. Googlebot uses HTTP as the default entry point for many URLs.

If your 301 HTTP → HTTPS redirects are missing, slow, or misconfigured (redirect chains, timeouts), Googlebot loses crawl budget. Worse: it may index outdated HTTP versions or encounter 404 errors where the HTTPS version works. I've seen sites lose 20-30% of their indexed pages after an HTTPS migration because the HTTP redirects were faulty — but invisible to users due to HSTS. Verify that your own configuration doesn't have this hidden flaw.

In what cases does this rule have the most impact?

Three critical scenarios. One: sites with a lot of old HTTP backlinks. Googlebot follows these links over HTTP, and if the redirect is broken, the SEO juice is lost. Two: XML sitemaps still containing HTTP URLs by mistake — Googlebot crawls them as they are.

Three: sites with coexisting HTTP and HTTPS versions (yes, this still happens in 2025, especially on intranets or forgotten subdomains). Without proper redirects, Google may index both versions and dilute the ranking. HSTS does not save you from this duplicate content — only server-side 301 redirects do the job.

Attention: If you have migrated to HTTPS and your rankings have dropped for no apparent reason, check first that your HTTP → HTTPS redirects are in place, fast (< 200ms), and without chains. Client-side monitoring tools do not reveal this problem since HSTS hides HTTP from modern browsers.

Practical impact and recommendations

What should you check immediately on your site?

First step: audit your HTTP → HTTPS redirects with a tool that simulates Googlebot (Screaming Frog in Googlebot mode, or curl with Googlebot user-agent). Do not rely on your browser — it applies HSTS and shows you a distorted version of the bot's reality.

Ensure that each HTTP URL returns a 301 Moved Permanently to its HTTPS equivalent, without any intermediate redirect chains. An HTTP → HTTP → HTTPS chain wastes crawl budget and dilutes the PageRank passed. Also test the response speed of these redirects — a slow server on port 80 slows down the entire crawl.
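The audit above can be sketched as a small chain-follower. This is a minimal illustration, not a full crawler: `fetch` is an injectable callable returning `(status, location)`, so the logic can be exercised without network access, and `check_redirect`, `fake_fetch`, and the URL table are hypothetical names chosen for this sketch.

```python
from urllib.parse import urlparse

def check_redirect(url, fetch, max_hops=5):
    """Follow redirects via fetch(url) -> (status, location) and report
    whether the chain is a single 301 straight to an HTTPS URL."""
    hops = []
    current = url
    for _ in range(max_hops):
        status, location = fetch(current)
        if status not in (301, 302, 307, 308) or not location:
            break
        hops.append((status, location))
        current = location
    return {
        "final_url": current,
        "hops": len(hops),
        "single_301_to_https": (
            len(hops) == 1
            and hops[0][0] == 301
            and urlparse(current).scheme == "https"
        ),
    }

# Simulated server: one clean single-hop 301, one wasteful two-hop chain.
TABLE = {
    "http://example.com/a": (301, "https://example.com/a"),
    "http://example.com/b": (301, "http://www.example.com/b"),
    "http://www.example.com/b": (301, "https://www.example.com/b"),
}

def fake_fetch(url):
    # Anything not in the table answers 200 with no Location header.
    return TABLE.get(url, (200, None))

good = check_redirect("http://example.com/a", fake_fetch)
bad = check_redirect("http://example.com/b", fake_fetch)
```

In a real audit you would back `fetch` with an HTTP client sending a Googlebot user-agent, and flag every URL where `single_301_to_https` is false.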

How to audit your server logs for problems?

Analyze your server access logs (Apache/Nginx) by filtering for the Googlebot user-agent and the HTTP protocol. If you see 404s, 500s, or timeouts on HTTP URLs while their HTTPS versions function properly, you have a problem that is invisible on the user side.
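That filtering step can be sketched in a few lines. Log formats vary by server configuration; the pattern below assumes a combined log prefixed with the virtual host and port (Apache's `%v:%p`), which is one common way to record whether a hit arrived on port 80 — adapt the regex to your own format. The function name is illustrative.

```python
import re

# Combined log with a "host:port" prefix (Apache %v:%p) — an assumption;
# adjust the pattern to match your actual LogFormat.
LINE = re.compile(
    r'(?P<host>\S+):(?P<port>\d+) (?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]+" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"'
)

def googlebot_http_errors(lines):
    """Yield (path, status) for failing Googlebot requests on port 80."""
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # line doesn't match the assumed format; skip it
        if "Googlebot" in m.group("ua") and m.group("port") == "80":
            status = int(m.group("status"))
            if status >= 400:
                yield m.group("path"), status
```

Anything this yields is a URL Googlebot tried over plain HTTP and failed on — exactly the class of error HSTS hides from your browser.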

Also, look for crawl patterns. If Googlebot spends a lot of time on HTTP URLs that all redirect, that's wasted crawl budget. Ideally, update your internal links and sitemap to point directly to HTTPS — this reduces server load and speeds up the crawl. But never remove HTTP redirects, even after years: external backlinks still predominantly point to HTTP.
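The sitemap cleanup mentioned above is mechanical enough to script. A minimal sketch, assuming your sitemap URLs are already extracted into a list (the function names are illustrative):

```python
from urllib.parse import urlsplit, urlunsplit

def force_https(url: str) -> str:
    """Rewrite an http:// URL to https://, leaving other schemes untouched."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)

def clean_sitemap_urls(urls):
    """Return the cleaned URL list and how many entries were rewritten."""
    cleaned = [force_https(u) for u in urls]
    fixed = sum(1 for before, after in zip(urls, cleaned) if before != after)
    return cleaned, fixed

cleaned, fixed = clean_sitemap_urls(
    ["http://example.com/a", "https://example.com/b"]
)
# fixed == 1: only the http:// entry needed rewriting
```

A nonzero `fixed` count on a supposedly migrated site is a signal worth investigating — every such entry was being crawled over HTTP.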

What mistakes should you absolutely avoid?

Number one error: deactivating port 80 after an HTTPS migration thinking that HSTS is enough. Googlebot will attempt to crawl over HTTP, fail, and gradually deindex your pages. I've seen an e-commerce site lose 60% of its organic traffic in three months after closing port 80 "for security reasons."

Second error: configuring 302 (temporary) redirects instead of 301. Google eventually treats them as 301, but with a delay. Third error: neglecting subdomains or www/non-www versions in HTTP — Googlebot crawls all variants and can create duplicate content if the redirects do not cover all cases.

  • Test all HTTP URLs with curl or Screaming Frog in Googlebot mode
  • Ensure that HTTP → HTTPS redirects return a 301, not a 302
  • Eliminate redirect chains (HTTP → HTTP → HTTPS)
  • Update the XML sitemap to exclusively point to HTTPS
  • Analyze server logs for HTTP errors that are invisible on the client side
  • Keep port 80 open with redirects, never close it completely
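The host-variant coverage from the checklist above can be enumerated programmatically. The sketch below assumes `https://www.<domain>` is the canonical host — swap in your own canonical choice; the function name is hypothetical:

```python
def redirect_test_matrix(domain: str, paths=("/",)):
    """Enumerate the scheme/host variants that should each 301 to the
    canonical HTTPS URL — the variants Googlebot may crawl independently.

    Assumes https://www.<domain> is canonical; adjust if yours differs.
    """
    canonical = f"https://www.{domain}"
    return [
        (f"{scheme}://{host}{path}", f"{canonical}{path}")
        for scheme in ("http", "https")
        for host in (domain, f"www.{domain}")
        for path in paths
        if f"{scheme}://{host}" != canonical
    ]

# Three non-canonical variants to test for example.com:
# http://example.com/, http://www.example.com/, https://example.com/
matrix = redirect_test_matrix("example.com")
```

Feeding each `(variant, expected_target)` pair into a redirect checker catches the www/non-www and HTTP/HTTPS gaps that create duplicate content.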

Googlebot ignores HSTS and crawls over HTTP by default. Your 301 HTTP → HTTPS redirects must be flawless, fast, and free of chains. Audit your server logs regularly to detect problems that are invisible on the client side.

If you notice crawl inconsistencies or unexplained drops in indexing, a thorough review of your HTTP/HTTPS configuration is essential. These technical diagnostics can be complex to conduct alone — a specialized SEO agency performing a complete server audit ensures that nothing escapes your attention and that your redirects are optimal for Googlebot.

❓ Frequently Asked Questions

Does Googlebot apply HSTS rules between two crawls of the same site?
No. Googlebot crawls each URL with a clean slate, without remembering previously received HSTS headers. Every crawl session ignores HSTS rules, even if the site sent them during earlier crawls.
Should I keep my 301 HTTP-to-HTTPS redirects indefinitely?
Yes, absolutely. Googlebot and external backlinks keep using HTTP URLs. Removing these redirects would cause 404 errors and a gradual loss of indexing.
Does preloaded HSTS protect my site in Google search results?
HSTS protects end users by forcing HTTPS in their browser, but it does not affect Googlebot's crawl. For SEO, only your server-side redirects and your robots.txt/sitemap configuration matter.
What happens if my XML sitemap mistakenly contains HTTP URLs?
Googlebot will try to crawl them over HTTP. If your redirects work, it will follow them to HTTPS, but you waste crawl budget. Fix the sitemap to point directly to HTTPS.
How can I verify that Googlebot reaches my pages over HTTPS?
Analyze your server logs, filtering by the Googlebot user-agent. Check that crawled URLs are in HTTPS and that HTTP requests redirect correctly. Search Console also shows indexed URLs with their protocol.