Official statement

To control which versions of images are indexed, Google recommends using the robots.txt file to block alternative versions. While x-robots-tag is primarily for web pages, blocking via robots.txt works for images.
🎥 Source video

Extracted from a Google Search Central video

⏱ 58:29 💬 EN 📅 21/12/2018 ✂ 13 statements
Watch on YouTube (6:59) →
Other statements from this video (12)
  1. 3:13 Are image sitemaps really necessary for indexing?
  2. 4:47 Which image size does Google really favor in image search?
  3. 10:40 Does Google's cache really show what Googlebot sees on your JavaScript page?
  4. 10:51 Does changing your content necessarily lower your Google rankings?
  5. 24:23 Can switching WordPress themes destroy your SEO?
  6. 35:30 Why are page-by-page 301 redirects crucial when merging sites?
  7. 36:59 Do unlinked brand mentions pass PageRank?
  8. 46:00 Could content personalization be treated as cloaking by Google?
  9. 56:56 Why does Google mistake your regional pages for duplicate content?
  10. 62:00 Is dynamic rendering still essential for Single Page Applications?
  11. 71:39 How do you effectively remove duplicate content that is penalizing you?
  12. 95:40 Are expired domains really in Google's crosshairs?
📅 Official statement from 21/12/2018 (7 years ago)
TL;DR

Google recommends using robots.txt to block the indexing of alternative versions of images, while x-robots-tag is reserved for web pages. This directive clarifies the preferred method for controlling which version of an image appears in image search results. Essentially, this means you need to rethink your strategy if you were relying solely on x-robots-tag headers (or meta robots tags, which images cannot carry anyway, since they are not HTML) to manage your webp, avif, or other resized image formats.

What you need to understand

Why is there a distinction between robots.txt and x-robots-tag for images?

The directive from Mueller is based on a straightforward observation: many websites generate multiple versions of the same image—thumbnails, modern formats (webp, avif), responsive resolutions. Without explicit control, Google can index any of these versions, often not the one you want to highlight.

The x-robots-tag functions through HTTP headers and allows for fine control over indexing. However, Google clarifies that it is “primarily for web pages”. For images, the recommended method becomes the robots.txt file—simpler, more direct, and managed server-side without touching headers.
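To make the contrast concrete, here is what the header mechanism looks like on the wire: a minimal sketch of an image response carrying the directive (the content type and values are illustrative):

    HTTP/1.1 200 OK
    Content-Type: image/webp
    X-Robots-Tag: noindex

A robots.txt file, by contrast, expresses the same intent for whole groups of URLs in one place, as shown further down.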

Which image versions are affected by this recommendation?

This involves all technical variations that your infrastructure automatically generates: different encoding formats, multiple sizes for responsiveness, CDN thumbnails, versions optimized for social media. Each image URL can be crawled and indexed independently.
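For illustration, a single source photo can easily live at half a dozen addresses (all hypothetical):

    /images/originals/product-42.jpg              original, 1200x800
    /images/thumbs/product-42-150x150.jpg         CMS thumbnail
    /images/webp/product-42.webp                  modern format
    /images/webp/product-42-480.webp              responsive size
    https://cdn.example.com/p/product-42.avif     CDN variant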

The problem? If Google indexes your 150x150 thumbnail instead of the original 1200x800, your visibility in Google Images plummets. Or worse: if a non-optimized version (a heavy JPEG instead of a lightweight webp) gets indexed, you lose on two fronts—performance and visual SEO.

How does robots.txt effectively block these alternatives?

The classic syntax applies: you specify the URL patterns to exclude from crawling. For example, if your thumbnails are in /images/thumbs/ and your webp formats are in /images/webp/, you block these folders for Googlebot-Image while allowing /images/originals/.
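A minimal sketch of that setup, assuming the hypothetical folder layout described above:

    User-agent: Googlebot-Image
    Disallow: /images/thumbs/
    Disallow: /images/webp/
    Allow: /images/originals/

Google honors Allow rules, and the most specific (longest) matching rule wins, so the originals stay crawlable even alongside broader Disallow patterns.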

However, beware—blocking via robots.txt prevents crawling, so Google won't even see these files. It's drastic but effective. If you prefer that Google knows about the existence of these versions without indexing them, you enter a gray area that Mueller does not explicitly cover in this statement.

  • Robots.txt remains the preferred method for blocking alternative image versions according to Google
  • The x-robots-tag works, but is not the recommended tool for this specific use case
  • Blocking via robots.txt means total absence of crawling—the concerned images will be neither seen nor indexed
  • This approach mainly applies to sites generating multiple formats automatically (CDN, responsive, image conversion)
  • Without a clear directive, Google can index any version, often not the one you want to promote

SEO Expert opinion

Is this recommendation consistent with practices observed on the ground?

To be honest, most corporate websites I have audited use neither robots.txt nor the x-robots-tag to manage their images. They let Google decide, and it works… until it doesn't. Problems arise especially on e-commerce or editorial sites with a high volume of images and aggressive CDNs.

Mueller's directive aligns with what we observe: robots.txt is more stable on Google's side. HTTP headers can be misinterpreted due to cache layers (Cloudflare, Varnish, etc.), while a well-configured robots.txt is read directly by Googlebot without ambiguity. [To verify]—I have never seen Google publish figures on the compliance rate of x-robots-tag for images versus robots.txt.
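A quick way to check whether your cache layer preserves the header is to inspect what the edge actually serves (the domain and path are placeholders):

    curl -sI https://example.com/images/photo.webp | grep -i x-robots-tag

If the header set by your origin does not appear in this response, Googlebot will not see it either.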

What nuances should be added to this directive?

Mueller states that the x-robots-tag is “primarily” for web pages. This “primarily” leaves a door open—he does not say “exclusively”. In practice, the x-robots-tag works perfectly for images if your technical stack supports it properly. It’s just not the recommended method.
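If your stack does support it, the header is typically set in the server configuration. A hedged Apache sketch, assuming mod_headers is enabled and with an illustrative extension list:

    <FilesMatch "\.(webp|avif)$">
      Header set X-Robots-Tag "noindex"
    </FilesMatch>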

And this is where it gets tricky: why this preference for robots.txt? Because it is more scalable from Google's perspective. A robots.txt file can cover millions of URLs with a few wildcard patterns (robots.txt supports the * and $ wildcards, not full regular expressions). The x-robots-tag, by contrast, can only be discovered by fetching each resource, which is more costly in server resources and crawl time.
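To see the scale difference: one wildcard rule can cover every CMS-generated thumbnail on the site, while the header approach would have to tag each of those responses individually (the suffix pattern is illustrative):

    User-agent: Googlebot-Image
    Disallow: /*-150x150.jpg$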

In what cases does this recommendation not really apply?

If you are manually managing each image and do not generate alternative versions automatically, this directive simply does not concern you. The same goes if you only have one image format per content—no webp, no thumbnails, no CDN generating 12 different sizes.

Another exception: sites that want Google to know all versions (for reverse image search reasons or cross-device compatibility) but want to index a preferred version. In this case, using the canonical for images—which Mueller never mentions here—would be more appropriate. But Google has never officially confirmed support for an image canonical. [To verify].

Warning: blocking images via robots.txt can negatively impact your SEO if you accidentally block the right versions. Always test with a small subset before deploying broadly.

Practical impact and recommendations

What should be done concretely to apply this recommendation?

First step: identify all image versions generated by your infrastructure. Crawl your site with Screaming Frog or Oncrawl with image crawling enabled. You will likely discover 3 to 5 times more image URLs than expected—multiple formats, various sizes, CDN versions.
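Once the crawl is exported, a short script can group image URLs by base name so variants of the same visual stand out. A minimal sketch, assuming a plain text file with one URL per line and WordPress-style size suffixes (both assumptions; adapt to your export format):

    import re
    from collections import defaultdict
    from urllib.parse import urlparse

    groups = defaultdict(list)
    with open("image_urls.txt") as f:  # hypothetical export: one image URL per line
        for url in map(str.strip, f):
            if not url:
                continue
            name = urlparse(url).path.rsplit("/", 1)[-1]
            # drop size suffixes like -150x150 and the extension to get a grouping key
            base = re.sub(r"-\d+x\d+(?=\.)", "", name).rsplit(".", 1)[0]
            groups[base].append(url)

    for base, urls in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        if len(urls) > 1:  # only report images that exist in several versions
            print(f"{base}: {len(urls)} versions")
            for u in urls:
                print("   ", u)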

Next, decide which version you want to see indexed. Generally, it’s the high-resolution original or the most efficient modern format (webp, avif). All other versions should be blocked via robots.txt for Googlebot and Googlebot-Image.
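A robots.txt group can address both crawlers at once by stacking User-agent lines. A sketch reusing the hypothetical folders from earlier:

    User-agent: Googlebot
    User-agent: Googlebot-Image
    Disallow: /images/thumbs/
    Disallow: /images/webp/
    Allow: /images/originals/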

What mistakes should be avoided during implementation?

The classic mistake: blocking /images/ when you only wanted to block /images/thumbs/. As a result, all your images disappear from the index. Always test your directives with the robots.txt tester in Search Console before pushing to production.
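Beyond the Search Console tester, you can sanity-check the rules locally with Python's standard library before deploying (the domain and paths are placeholders; note that urllib.robotparser handles path-prefix rules but not Google's * and $ wildcard extensions):

    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    rp.set_url("https://example.com/robots.txt")  # placeholder domain
    rp.read()  # fetches and parses the live file

    # the originals must stay crawlable, the thumbnails must not
    print(rp.can_fetch("Googlebot-Image", "https://example.com/images/originals/photo.jpg"))       # expect True
    print(rp.can_fetch("Googlebot-Image", "https://example.com/images/thumbs/photo-150x150.jpg"))  # expect False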

Another trap—blocking images that are already indexed. Google will gradually remove them from the index, but it takes time. If you want a quick removal, use the URL removal tool in Search Console alongside this. And that’s where it gets complex: managing this transition without breaking existing SEO requires careful planning.

How can you verify that your configuration works correctly?

In Search Console, check the Image Coverage section: you should see the blocked alternative versions gradually disappear. At the same time, search Google Images manually to confirm that only the right versions appear.

For more precise control, analyze the server logs. A URL blocked via robots.txt is simply never requested, so Googlebot traffic to those paths should drop to zero (robots.txt prevents the fetch itself; it does not produce a 403). If you still see Googlebot requests receiving code 200 on images that are supposed to be blocked, your robots.txt is not being applied correctly.
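A rough sketch of that log check, assuming combined-format access logs and the blocked prefixes used earlier (the file name and paths are placeholders; a serious audit should also verify via reverse DNS that the client really is Googlebot):

    # flag Googlebot hits returning 200 on paths that robots.txt should have blocked
    BLOCKED_PREFIXES = ("/images/thumbs/", "/images/webp/")

    with open("access.log") as f:
        for line in f:
            if "Googlebot" not in line:
                continue
            parts = line.split()
            if len(parts) < 9:
                continue
            path, status = parts[6], parts[8]  # request path and status in combined log format
            if path.startswith(BLOCKED_PREFIXES) and status == "200":
                print("robots.txt not respected:", path)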

  • Audit all image versions automatically generated by your infrastructure
  • Identify which unique version should be indexed (high-resolution original or optimized modern format)
  • Configure robots.txt with precise patterns to block alternative versions (test with Search Console)
  • Monitor the progress in the Image Coverage section of Search Console
  • Manually check in Google Images that only the correct versions appear after a few weeks
  • Analyze your server logs to confirm that Googlebot is adhering to your robots.txt directives
This technical optimization requires a keen understanding of your server architecture, your CDN, and Google's crawling mechanisms. If your infrastructure automatically generates dozens of image versions and you lack internal technical resources, hiring a specialized SEO agency can help you avoid costly mistakes and significantly accelerate results.

❓ Frequently Asked Questions

Does the x-robots-tag really not work for images?
It does work technically, but Google recommends robots.txt as the preferred method. According to Mueller, the x-robots-tag remains primarily intended for web pages.
What happens if I block an already-indexed image via robots.txt?
Google will gradually remove it from the index over subsequent crawls. For a quick removal, also use the URL removal tool in Search Console.
Can I block only certain image formats (webp, avif) and leave the JPEGs indexable?
Yes, as long as your URLs follow distinct patterns (/images/webp/, /images/jpeg/). Robots.txt allows granular blocking by URL pattern.
How do you handle images served by an external CDN whose URLs sit on a different domain?
You have to configure the robots.txt of the CDN domain itself. If you have no access to that file, ask your CDN provider to block the alternative versions server-side.
Does blocking images via robots.txt affect the crawl budget?
Yes, positively. By blocking useless alternative versions, you free up crawl budget for the resources that really matter. Google will no longer waste time on dozens of variants of the same image.