Why are 15% of Google queries completely unknown to the algorithm every day?

Official statement

About 10 to 15% of the queries we encounter every day are new to us, meaning that our algorithms must interpret the intentions behind these queries without manual intervention.

64:52

🎥 Source video

Extracted from a Google Search Central video

⏱ 1h13 💬 EN 📅 27/01/2017 ✂ 10 statements

Other statements from this video (9)

17:00 Are accordions and tabs really taken into account by Google under mobile-first indexing?
34:57 How can you tell whether your site is actually penalized by Google?
40:14 Why does Google officially refuse to support noindex in robots.txt?
46:13 Is site speed really a ranking factor or just an SEO myth?
47:44 Do you really need to cross-reference rel='canonical' and rel='alternate' between desktop and mobile versions?
56:03 Should you really fear a massive influx of backlinks when launching a site?
70:06 Should you really return a 404 rather than a redirect for discontinued e-commerce products?
75:09 Do automatic language-based redirects hurt multilingual indexing?
101:09 Do dynamic JavaScript URLs really cause indexing problems?

What you need to understand

What does this figure of 10 to 15% really mean?

Google claims that about 10 to 15% of daily queries have never been entered before. The percentage appears stable over time, but the absolute number keeps rising as overall search volume grows. Among the billions of searches processed every day, hundreds of millions are entirely new queries.
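To make the order of magnitude concrete, here is a minimal sketch of the arithmetic. The daily volume of 5 billion searches is a hypothetical figure chosen for illustration; the statement itself gives only the 10 to 15% share, and the function name is likewise an assumption.

```python
def new_queries_per_day(total_daily_queries, new_share):
    """Absolute number of never-seen queries for a given daily volume and share."""
    if not 0.0 <= new_share <= 1.0:
        raise ValueError("new_share must be between 0 and 1")
    return total_daily_queries * new_share


# Hypothetical daily volume, used only to illustrate the scale.
ASSUMED_DAILY_QUERIES = 5_000_000_000

low = new_queries_per_day(ASSUMED_DAILY_QUERIES, 0.10)
high = new_queries_per_day(ASSUMED_DAILY_QUERIES, 0.15)
print(f"{low:,.0f} to {high:,.0f} new queries per day")
```

Under that assumption the result lands in the hundreds of millions, which is why a stable percentage still represents a growing absolute workload.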

These queries emerge from several factors: recent news, unprecedented linguistic combinations, ultra-specific long-tail questions, or conversationally framed voice searches. Therefore, the algorithm must deduce intent without the benefit of click history, bounce rates, or user satisfaction data for that specific query.

How does Google interpret an unknown query?

When faced with an unseen query, Google uses several layers of semantic analysis. The algorithm breaks the query down into recognized entities, analyzes its grammatical structure, detects synonyms and variations, and then compares it with similar historical queries. Language models such as BERT and MUM, both introduced into Search after this statement was made, now play a central role by capturing contextual meaning beyond the exact words used.

The engine also relies on general behavioral signals: the types of content that usually satisfy structurally similar queries, and the preferred formats for certain intents (video tutorials, lists, definitions). This interpretation occurs in real time without human intervention, which explains why some SERPs for new queries may look approximate before they stabilize.
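The idea of comparing a new query with similar historical ones can be illustrated with a deliberately simple toy. The sketch below uses Jaccard token overlap, which is a stand-in chosen for readability and not Google's method; the function names and sample queries are hypothetical.

```python
def tokenize(query):
    """Lowercase the query and split it into a set of words."""
    return set(query.lower().split())


def jaccard(a, b):
    """Share of tokens the two sets have in common (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def closest_known_query(new_query, known_queries):
    """Return the historical query with the highest token overlap, plus its score."""
    new_tokens = tokenize(new_query)
    best = max(known_queries, key=lambda q: jaccard(new_tokens, tokenize(q)))
    return best, jaccard(new_tokens, tokenize(best))


# Hypothetical query history, for illustration only.
KNOWN = [
    "how to fix a leaking kitchen tap",
    "best running shoes for flat feet",
    "python read csv file",
]

match, score = closest_known_query("fix leaking tap in bathroom", KNOWN)
print(match, round(score, 2))
```

A real system would rely on learned representations rather than shared words, but the principle is the same: an unseen formulation inherits signals from the closest known intent.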

What impact does this have on traditional keyword strategies?

This reality invalidates SEO approaches that focus solely on optimizing exact terms identified through keyword tools. If up to 15% of queries are new every day, no keyword research can anticipate them. The winning strategy is to cover broad semantic fields rather than precise expressions.

Content that performs well responds to families of intents rather than isolated keywords. A well-structured article on a topic naturally addresses multiple angles, variations, and related questions, which increases its likelihood of matching a never-before-seen query whose intent aligns with the topic discussed.

10 to 15% of daily queries have never been formulated before, according to Google
The algorithms interpret intent through semantic analysis and comparison with similar queries
Voice search and conversational phrasing fuel this linguistic diversity
Optimization should target intentions and semantic fields rather than fixed keywords
Models like BERT and MUM enable this contextual understanding in real time

SEO Expert opinion

Does this statement match real-world observations?

On the substance, this figure of 10-15% is consistent with the data observed by practitioners analyzing their internal search logs or Search Console queries. Long-tail queries indeed represent a massive share of organic traffic for most sites, featuring extremely varied formulations for similar intents.

However, Mueller does not specify how Google defines a "new" query. Is it a strict exact match, or does Google consider minor variations (plural, accent, word order) as identical? This nuance dramatically changes the interpretation. [To be verified]: Google has never published a precise methodology for this calculation, which leaves significant room for interpretation.
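How much the definition matters can be shown with a small experiment. The sketch below measures the share of "new" queries twice: once with strict exact matching, once after a naive normalization that ignores case, accents, word order, and a trailing plural "s". Both definitions are assumptions made for illustration, since Google's actual methodology is not published, and the sample data is invented.

```python
import unicodedata


def normalize(query):
    """Collapse case, accents, and word order; naively strip a trailing plural 's'."""
    decomposed = unicodedata.normalize("NFKD", query.lower())
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = [t[:-1] if len(t) > 3 and t.endswith("s") else t for t in no_accents.split()]
    return " ".join(sorted(tokens))


def new_share(history, today, key=lambda q: q):
    """Fraction of today's queries whose key was never seen in the history."""
    if not today:
        return 0.0
    seen = {key(q) for q in history}
    return sum(1 for q in today if key(q) not in seen) / len(today)


# Invented sample data, for illustration only.
HISTORY = ["café near me", "red running shoes", "weather paris"]
TODAY = ["cafe near me", "running shoe red", "weather paris", "quantum toaster repair"]

strict = new_share(HISTORY, TODAY)             # exact string match
loose = new_share(HISTORY, TODAY, normalize)   # minor variations treated as identical
print(strict, loose)
```

On this toy sample the strict definition flags three of four queries as new, while the normalized one flags a single query, which shows how far the headline percentage could move depending on the counting rule.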

What limits are there to the automatic interpretation of intent?

The claim that algorithms interpret intent "without manual intervention" is technically true, but it does not mean that this interpretation is always relevant or stable. SEOs regularly observe inconsistent SERPs for ambiguous or emerging queries, where rankings fluctuate before Google settles on a set of results.

New queries linked to news events pose a particular challenge: the algorithm must quickly decide which content to promote without sufficient historical data. We often see a temporary overrepresentation of general authority sites, even if their content is not optimal, simply because Google prioritizes perceived reliability in the face of uncertainty.

Does this phenomenon change across languages and markets?

Mueller refers to "our algorithms" in a general way, but experience shows significant variations across languages. Languages with complex morphology (German, Slavic languages) inherently generate more query variations. Emerging markets with rapid growth in first-time internet users also see more unprecedented formulations.

This heterogeneity means that the coverage strategy must adapt to the linguistic and cultural context. A multilingual site cannot simply duplicate its content strategy: it must analyze search patterns specific to each market to anticipate local variations in intent.

Practical impact and recommendations

How can you optimize for queries that haven’t been typed yet?

The traditional keyword research approach remains useful for identifying the main volumes, but it must be complemented by an analysis of underlying intents. Identify the questions your audience is asking, even if they do not yet generate measurable volume. Forums, Reddit, Quora, and your internal site-search logs reveal these emerging formulations.

Structure your content around topic clusters rather than isolated pages targeting a single keyword. A comprehensive pillar page that covers a topic in its entirety will naturally capture newly formulated queries that are expressed differently but share the same intent. Use semantic markup (Schema.org) to help Google understand the context of your content.
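As an illustration of Schema.org markup, here is a minimal sketch that builds a JSON-LD `FAQPage` object from question and answer pairs. The helper name `faq_jsonld` and the sample text are hypothetical; the `FAQPage`, `Question`, and `Answer` types and the `mainEntity` and `acceptedAnswer` properties come from the Schema.org vocabulary. Markup describes the content to search engines; it does not guarantee any particular ranking or display.

```python
import json


def faq_jsonld(pairs):
    """Build a Schema.org FAQPage structure from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Sample content, for illustration only.
markup = faq_jsonld([
    ("Should you abandon keyword research?",
     "No, but complement it with an analysis of search intent."),
])

# The resulting JSON goes inside a <script type="application/ld+json"> tag.
print(json.dumps(markup, indent=2, ensure_ascii=False))
```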

What common mistakes exacerbate this issue?

Many sites over-optimize for mechanically repeated exact-match terms, at the expense of semantic richness. This rigid approach limits the content's ability to match query variations. Google now interprets context: natural and comprehensive content performs better than text stuffed with repetitions of a target keyword.

Another frequent mistake is neglecting natural language questions. Voice search and assistants multiply these conversational formulations. Content that explicitly answers "How to do X?" or "Why does Y happen?" will capture these new queries better than dense technical content without a clear interrogative structure.

How can you measure and adjust your strategy in response to this reality?

Regularly analyze your long-tail queries in Search Console, particularly those with low impressions. They reveal unexpected formulations that generate traffic. Identify common patterns in these variations to enrich your existing content or create new targeted resources.
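This kind of analysis can be sketched in a few lines. The example below assumes the query data has already been exported and loaded as a list of dictionaries; the field names (`query`, `impressions`, `clicks`), the thresholds, and the sample rows are all hypothetical and should be adapted to your own export.

```python
from collections import Counter


def long_tail(rows, max_impressions=10, min_words=4):
    """Keep low-impression queries made of several words."""
    return [
        r for r in rows
        if r["impressions"] <= max_impressions and len(r["query"].split()) >= min_words
    ]


def leading_word_counts(rows):
    """Count the first word of each query to surface patterns such as 'how' or 'why'."""
    return Counter(r["query"].split()[0].lower() for r in rows if r["query"].split())


# Invented sample rows, for illustration only.
ROWS = [
    {"query": "seo", "impressions": 5400, "clicks": 120},
    {"query": "how to optimize for unseen queries", "impressions": 6, "clicks": 2},
    {"query": "why does google rewrite my title", "impressions": 3, "clicks": 1},
    {"query": "how to find long tail keywords", "impressions": 9, "clicks": 0},
    {"query": "seo agency pricing", "impressions": 8, "clicks": 1},
]

tail = long_tail(ROWS)
patterns = leading_word_counts(tail)
print(patterns.most_common())
```

Grouping by the leading word is only one possible heuristic; the point is to look for recurring structures in the tail rather than at individual queries.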

Monitor bounce rates and session times on long-tail traffic: a high bounce rate indicates that your content, although ranked, does not fully satisfy intent. This may indicate a need for semantic enrichment or restructuring to better meet the variations of intent that Google attempted to match with your page.

Develop comprehensive pillar content covering a subject from all angles
Incorporate natural language formulations and conversational questions
Use Schema.org markup to clarify context and entities
Analyze long-tail queries in Search Console every month to identify new patterns
Favor semantic richness over mechanically repeated exact-match terms
Test different content structures (FAQs, guides, tutorials) to diversify intent coverage

Given that 10 to 15% of daily queries are new, your SEO strategy must evolve towards a semantic, intent-driven approach rather than a mechanical one. This shift requires expertise in intent analysis, content architecture, and Google's language models. If that exceeds your internal resources, consider working with a specialized SEO agency to structure a strategy adapted to these algorithmic realities.

❓ Frequently Asked Questions

How can Google rank a page for a never-before-seen query without historical data?

Google relies on semantic analysis of the query (entities, structure, context) and compares it with similar historical queries. Language models such as BERT make it possible to understand intent beyond the exact words, and the algorithm then selects content that has performed well for comparable intents.

Does voice search increase the share of new queries?

Yes. Voice search generates more conversational and natural formulations, which are often unique. Users ask complete questions rather than typing keywords, which multiplies linguistic variations and directly feeds this percentage of never-before-seen queries.

Should you abandon traditional keyword research?

No, but it must be complemented by an analysis of intents. Primary keywords remain essential for targeting known volumes, but your content must also cover broad semantic fields to capture the unpredictable variations that make up 10 to 15% of daily queries.

Can keyword research tools anticipate these new queries?

By definition, no. These tools analyze search history, so they cannot predict formulations that have never been used. They remain useful for identifying emerging trends and related questions, but they will never replace a deep understanding of your audience's intents.

Is this 15% figure stable over time?

Google has cited this percentage for several years, which suggests relative stability. However, the absolute volume keeps rising as overall search traffic grows. The stable percentage masks a query landscape that is becoming steadily more complex for the algorithms to handle.
