Official statement
Other statements from this video
- 2:00 Does Google really follow the links on your noindex pages?
- 5:37 Should pagination really be left indexable on large sites?
- 8:45 Can internal linking really replace an optimized site architecture?
- 11:00 Do PDFs without internal navigation really harm your indexing?
- 38:48 Why does Google show backlinks in Search Console that you have disavowed?
- 44:46 How does flexible sampling solve the paywall puzzle for indexing?
- 46:13 Does page load speed really influence Google rankings?
- 47:09 Google News and Discover: same indexing or two separate pipelines?
- 50:44 Can links between language versions of a site harm regional targeting?
Google states that no specific robots.txt directive is required for Discover: eligible pages are simply those that are normally indexed. In practical terms, if your page is crawlable and indexable, it can theoretically appear in the feed. The challenge is to understand what criteria differentiate an ordinary indexed page from one that actually generates Discover traffic — and that's where Mueller's assertion falls short.
What you need to understand
John Mueller's statement seems to aim at clarifying a recurring misunderstanding: some publishers believe there is a specific technical setting — like a dedicated meta tag or robots.txt file — to "activate" Google Discover on their site.
In reality, Mueller reminds us that Discover draws from the standard Google index. No opt-in, no secret whitelist. An indexed page can theoretically be served in a user's personalized feed.
Does Discover work exactly like traditional search?
Yes for technical indexing, no for editorial eligibility. A crawled and indexed URL becomes a candidate, but Discover then applies its own filters: content freshness, expected engagement, alignment with user interests.
The fact that no particular robots.txt is needed doesn’t mean that all indexed pages have the same likelihood of appearing. The difference lies in editorial quality and thematic relevance, not in technical configuration.
Why can this statement be confusing?
Because it says nothing about Discover's real selection criteria. Saying "you just need to be indexed" implies simplicity, while field observations show strong selectivity.
Some perfectly indexed sites, with a healthy crawl budget and clean architecture, never generate a single Discover click. Conversely, well-established news media capture massive volumes.
What does “normal pages indexed by Google” mean?
This phrasing aims to demystify Discover: there is no special "Discover-ready" status to activate. Pages just need to meet the basics: crawlable, indexable, free of noindex tags, and not blocked by robots.txt.
However, “normal” does not mean “mundane.” A purely informative technical FAQ page is unlikely to appear, whereas a recent news article with quality visuals and AMP structure has a clear advantage.
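By way of illustration, here is what that baseline looks like in a page's markup (a minimal sketch; the URL is a placeholder). Nothing in it is Discover-specific: the only Discover-related markup recommendation Google documents is the `max-image-preview:large` robots directive (or AMP), which allows large image previews in the feed.

```html
<!-- No "Discover tag" exists: the page just needs to be indexable. -->
<head>
  <title>Example article</title>
  <!-- No noindex here: a noindex directive would remove the page from
       the index, and therefore from the Discover candidate pool. -->
  <meta name="robots" content="max-image-preview:large">
  <link rel="canonical" href="https://example.com/article">
</head>
```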
- No specific robots.txt file is required for Discover — standard indexing rules apply.
- Technical eligibility (being indexed) does not guarantee editorial eligibility (actually appearing in the feed).
- Discover filters the existing index based on criteria of freshness, expected engagement, thematic coherence, and visual quality.
- Field observations show strong selectivity: not all indexed sites generate Discover traffic.
- Quality news media and evergreen content with premium visuals are statistically overrepresented in the feed.
SEO Expert opinion
Is this statement consistent with observed practices?
Technically, yes: no robots.txt signal conditions appearance in Discover. But this is a partial truth that masks the essential — post-indexation sorting criteria.
On the ground, sites that perform well in Discover share recurring patterns: frequently updated content, strong thematic authority (E-E-A-T), high-resolution images in 16:9 or 1:1 ratio, and AMP implementation or strong Core Web Vitals. None of this is explicitly stated by Mueller. [To verify]: Google does not communicate a score or threshold for Discover eligibility, making optimization largely empirical.
What nuances should be added to this statement?
To say that it is enough to be indexed ignores the internal algorithmic segmentation. Discover does not treat the index as a homogeneous set: it prioritizes certain types of content (hot news, evergreen with high engagement), certain formats (rich media, embedded video), and certain structures (recognized editorial brand).
Additionally, the statement does not mention the Discover content guidelines published by Google, which explicitly exclude certain categories (sexually explicit content, graphic violence, fake news). Technically indexable does not mean eligible — and Mueller does not make this distinction.
In what cases does this rule not apply or become insufficient?
If your editorial model is based on low freshness content (product sheets, infrequently updated technical guides), you will remain indexed but invisible in Discover, even with a perfect robots.txt.
Similarly, a classic e-commerce site without regular editorial news will likely never generate Discover traffic, no matter how often it is crawled. Mueller's statement does not account for this sector reality. [To verify]: some observers report that Discover favors domains already present in Google News or having a strong history in Top Stories — which would introduce an undocumented bias.
Practical impact and recommendations
What steps should be taken to maximize chances in Discover?
First step: ensure that your key pages are crawlable and indexable. No accidental noindex, no robots.txt blocking on strategic sections, clean canonicals. This is the strict minimum — and Mueller has said nothing else.
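That minimum can be audited programmatically. The sketch below uses only the Python standard library to check the two blockers this baseline names, a robots.txt rule and a noindex meta tag; the function names are illustrative, not from any official tool, and the HTML and robots.txt contents are passed in as strings rather than fetched over the network.

```python
from urllib import robotparser
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            content = a.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]

def is_indexable(robots_txt: str, html: str, url: str,
                 agent: str = "Googlebot") -> bool:
    """True if `url` is neither blocked by robots.txt nor tagged noindex."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    if not rp.can_fetch(agent, url):
        return False  # crawling is blocked, so the page cannot be indexed
    meta = RobotsMetaParser()
    meta.feed(html)
    return "noindex" not in meta.directives
```

Running this over your strategic URLs flags exactly the "accidental noindex" and robots.txt blocks mentioned above; it says nothing about editorial eligibility.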
Next, focus on editorial levers: producing fresh content regularly, structuring your articles with high-resolution visuals (minimum 1200px wide), optimizing titles to maximize the expected CTR, and clearly displaying the author and publication date. Discover prioritizes content it can anticipate will have a high engagement rate.
What mistakes should be avoided when optimizing for Discover?
Don’t try to create a “special Discover” robots.txt file — it doesn’t exist, and you might block the crawling of important pages. Also, don’t overload your pages with unnecessary technical optimizations (complex schema markup, forced AMP) if your editorial content is weak.
Another classic pitfall: neglecting the Discover content guidelines. A technically perfect article but categorized as clickbait, sensationalist, or violating editorial rules will be filtered — and you will never know through Search Console.
How can I check that my site is well configured for Discover?
Check the “Discover” section in Google Search Console: if it appears and displays impressions, even low ones, you are eligible. If it doesn’t appear at all even with solid organic traffic, it is a signal that your content does not meet Discover's editorial criteria.
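The same check can be automated: the Search Console Search Analytics API exposes the Discover report when the request's `type` field is set to `"discover"`. Below is a minimal sketch of the request body, assuming you already have an authenticated Search Console API client; the helper name is hypothetical.

```python
def discover_performance_query(start_date: str, end_date: str) -> dict:
    """Build a Search Analytics API request body scoped to Discover.

    Setting "type" to "discover" (instead of the default "web") restricts
    the report to Discover traffic. An empty `rows` list in the response
    mirrors an absent or empty Discover section in the Search Console UI.
    """
    return {
        "startDate": start_date,   # inclusive, YYYY-MM-DD
        "endDate": end_date,
        "type": "discover",
        "dimensions": ["date", "page"],
        "rowLimit": 1000,
    }
```

With the google-api-python-client, this body would be passed to something like `service.searchanalytics().query(siteUrl=property_url, body=body)`.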
Additionally, analyze your direct competitors: are they using AMP? Do they publish daily? Do they have an active Google News structure? These elements, while not officially required, correlate strongly with Discover visibility according to field observations.
- Check for the absence of noindex or blocking robots.txt directives on strategic pages
- Optimize visuals: a minimum of 1200px wide, 16:9 or 1:1 ratio, controlled weight
- Regularly publish fresh content with high engagement potential
- Clearly display author, publication date, and E-E-A-T signals
- Consult the Discover section in Search Console to track performance
- Strictly adhere to Discover content guidelines
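The image criterion in this checklist can be encoded as a quick validation helper. The thresholds below come from the checklist above, not from an official Google specification, and the function name is illustrative.

```python
def meets_discover_image_specs(width: int, height: int,
                               tolerance: float = 0.01) -> bool:
    """Check the checklist's image rule: >= 1200px wide, 16:9 or 1:1 ratio.

    `tolerance` is a relative margin that absorbs rounding
    (e.g. 1281x720 is close enough to 16:9).
    """
    if width < 1200 or height <= 0:
        return False
    ratio = width / height
    targets = (16 / 9, 1.0)
    return any(abs(ratio - t) <= tolerance * t for t in targets)
```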
❓ Frequently Asked Questions
Do I need to create a specific robots.txt file for Google Discover?
Why does my indexed site never appear in Discover?
Is AMP mandatory to be eligible for Discover?
Are the Discover content guidelines binding?
How do I know if my site is eligible for Discover?
🎥 From the same video
Other SEO insights extracted from this same Google Search Central video · duration 55 min · published on 02/05/2019
🎥 Watch the full video on YouTube →