Can you really tag structured data that doesn’t match the visible content?

Official statement

Structured data markup must match your page's content exactly to avoid disappointing users who check the content.

Source: Google Search Central video (duration 56:46, in English, published 13/12/2016, 10 statements extracted). This statement appears at 4:16.

Official statement from December 13, 2016

A more recent statement exists on this topic: "Should you really align structured data and visible content to avoid penalties?" (John Mueller, March 19, 2019).

TL;DR

Google reminds site owners that structured data markup must accurately reflect the visible content of the page. Any discrepancy between what the user sees and what the schemas describe risks a manual action or loss of eligibility for rich snippets. This rule forces SEOs to balance the desire to optimize SERP display against the obligation to stay true to the actual content.

What you need to understand

What does it really mean to 'match the content'?

Google requires that every property marked up with schema.org actually exist in the content visible to the user. If you declare a price, rating, date, or author in JSON-LD, the user must be able to find it on the page without digging through the source code.

This matching leaves no room for strategic approximations. Marking a product as €49 when the displayed price is €49.90, declaring a collective author without any name appearing, or inventing an invisible aggregate rating: all of these expose you to penalties. The question isn’t whether Google can technically detect the gap, but how long it will take before a manual reviewer or a competitor reports it.
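The €49 versus €49.90 case can be checked mechanically. The sketch below is one possible way to do it, not a Google-documented method: the helper names `parse_displayed_price` and `price_matches` are hypothetical, and the parser assumes a simple displayed amount without thousands separators.

```python
import re
from decimal import Decimal


def parse_displayed_price(text: str) -> Decimal:
    """Extract the first amount from visible text such as '€49.90' or '49,90 €'.

    Assumption: no thousands separators ('1,299.00' is NOT handled).
    """
    match = re.search(r"\d+(?:[.,]\d{1,2})?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    # Normalize a decimal comma to a decimal point before converting.
    return Decimal(match.group().replace(",", "."))


def price_matches(schema_price: str, displayed_text: str) -> bool:
    """True only if the marked-up price equals the displayed one exactly."""
    return Decimal(schema_price) == parse_displayed_price(displayed_text)


print(price_matches("49", "€49.90"))     # False: rounded markup, exact display
print(price_matches("49.90", "€49.90"))  # True
```

Decimal is used instead of float so that 49.9 and 49.90 compare equal while 49 and 49.90 do not, with no rounding surprises.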

Why is Google enforcing this rule now?

Because the abuse of misleading structured data erodes trust in rich results. When a user clicks on a recipe displaying 5 stars and discovers a page without reviews, or when an 'in stock' product turns out to be unavailable, the user blames Google, not just the site.

Structured data is not an alternative marketing channel. It serves to represent existing content accurately so that algorithms can understand it. Google has no obligation to convert your schemas into rich snippets, and history shows that it regularly removes eligibility from sites that push the limits.

What leeway do practitioners have?

Less than we would hope. Google does not specify whether the exact wording matters or if semantic equivalence is enough. If your page says, 'Written by Marie Dupont' and your schema states 'Marie D.', is that compliant? Probably. If you mark 'Marie Dupont' while the page only displays 'The editorial team'? No.

The gray area mainly concerns the aggregation and synthesis of scattered data. Can you tag an average rating calculated from 10 reviews present at the bottom of the page? Yes, if those reviews are visible. Can you declare a price when only a 'See price' button exists without direct display? That's risky.
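For the average-rating case, one way to keep the markup honest is to recompute the average from the reviews actually shown on the page and compare it with the declared value. This is only a sketch: the function name and the 0.05 tolerance are assumptions chosen for illustration, since Google publishes no threshold.

```python
from statistics import mean


def rating_is_consistent(declared_value: float, declared_count: int,
                         visible_ratings: list[float],
                         tolerance: float = 0.05) -> bool:
    """Check a declared aggregateRating against the reviews visible on the page.

    `tolerance` is a hypothetical rounding allowance, not an official threshold.
    """
    if not visible_ratings:
        return False  # a rating with no visible reviews is the invented case
    if declared_count != len(visible_ratings):
        return False  # declared reviewCount must match what the user can see
    return abs(mean(visible_ratings) - declared_value) <= tolerance


visible = [5, 4, 4, 5, 3, 4, 5, 4, 4, 5]        # 10 reviews shown on the page
print(rating_is_consistent(4.3, 10, visible))   # True
print(rating_is_consistent(4.8, 10, visible))   # False: inflated average
print(rating_is_consistent(4.3, 120, visible))  # False: count not backed by the page
```

If reviews are paginated, the count comparison would need to cover all pages the user can reach, which this sketch does not attempt.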

Check for the textual presence of each marked property in the visible DOM (not just in display:none or aria-hidden).
Document your markup choices so you can justify the match if a manual review occurs.
Test with the rich results testing tool AND in real user navigation to spot inconsistencies.
Prioritize caution on optional properties: not marking is better than marking incorrectly.
Monitor Search Console for manual action alerts related to structured data (they are rare but definitive).
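The first check in the list above (presence in the visible DOM, excluding hidden elements) can be approximated with the Python standard library. This is a rough sketch under stated assumptions: it only detects hiding via the `hidden` attribute, `aria-hidden="true"`, or an inline `display:none` style, it assumes well-formed HTML with closed tags, and it cannot see CSS-class-based hiding or JavaScript-rendered content (a headless browser would be needed for that). The names `VisibleTextParser`, `visible_text`, and `missing_from_page` are hypothetical.

```python
from html.parser import HTMLParser

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
NEVER_RENDERED = {"script", "style", "template"}


class VisibleTextParser(HTMLParser):
    """Collects text that is not inside an obviously hidden subtree."""

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0  # > 0 while inside a hidden subtree
        self.chunks = []

    @staticmethod
    def _is_hidden(tag, attrs):
        attributes = dict(attrs)
        style = (attributes.get("style") or "").replace(" ", "").lower()
        return (tag in NEVER_RENDERED
                or "hidden" in attributes
                or attributes.get("aria-hidden") == "true"
                or "display:none" in style)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements have no end tag, so they never change depth
        if self.hidden_depth or self._is_hidden(tag, attrs):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def missing_from_page(html: str, marked_values: dict) -> list:
    """Return the names of marked-up properties whose value is not in the visible text."""
    text = visible_text(html).casefold()
    return [name for name, value in marked_values.items()
            if str(value).casefold() not in text]


page = ('<h1>Blue Mug</h1><p>Price: €49.90</p>'
        '<span style="display: none">Rated 4.8</span>')
marked = {"name": "Blue Mug", "price": "49.90", "ratingValue": "4.8"}
print(missing_from_page(page, marked))  # ['ratingValue']
```

Literal substring matching is deliberately strict. It flags anything a human should look at rather than trying to decide compliance on its own.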

SEO Expert opinion

Does this statement align with observed practices?

Yes, and it aligns with documented manual actions over the years. Known cases of penalties for misleading structured data mostly affect e-commerce sites (false promotions, nonexistent stocks) and review sites (manipulated ratings, nonexistent reviews).

What the statement lacks is any tolerance threshold. Google does not say whether a 5% discrepancy on a price, a minor date divergence, or slightly different wording triggers an action. [To verify]: the exact thresholds remain opaque, and field feedback suggests that detection combines algorithmic checks with manual reports.

In what situations does this rule create practical conflicts?

The classic problem is dynamic and personalized data. If your site displays geolocated prices, stock that varies by user, or A/B tested content, which version should you mark up? Google gives no explicit answer, but the guiding principle remains: mark up what the average user sees by default.

Another tension: client-side generated rich content. If your reviews, prices, or availability are loaded in JavaScript after the initial render, the crawler may see a markup without corresponding visible content. Technically compliant if the JS executes well for Googlebot, but fragile.

Warning: multilingual sites that centralize their schemas in JSON-LD must adapt the content of each property to the page's language. Marking up an author in English on a French page violates the rule even if the information is equivalent.

What are the gray areas that Google does not clarify?

First gray area: implicit or calculated data. Can you mark up an estimated reading time that is absent from the content? A difficulty level that is inferred but not displayed? Google has not ruled on this, but experience shows that the more central the property is to SERP display (price, rating, availability), the stricter the visibility requirement.

Second gray area: temporal granularity. If your event shows 'Saturday, March 15' and your schema specifies '2025-03-15T20:00:00', is that compliant? Probably yes. If the page mentions no date and you invent a plausible one? No. Where the line falls between these two cases remains unclear.
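The date example can be turned into a coarse check: the schema may be more precise than the page, but the calendar day it encodes should at least be displayed. The sketch below assumes English month names and a handful of common display formats. `date_is_displayed` is a hypothetical helper, and the formats it accepts are illustrative choices, not rules stated by Google.

```python
import re
from datetime import datetime

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def date_is_displayed(start_date_iso: str, page_text: str) -> bool:
    """True if the calendar day of the schema date appears in the visible text.

    The time of day is ignored on purpose: the schema may be more precise
    than the page, as long as the day itself is shown.
    """
    start = datetime.fromisoformat(start_date_iso)
    month = MONTHS[start.month - 1]
    candidates = (
        f"{month} {start.day}",      # "March 15"
        f"{start.day} {month}",      # "15 March"
        start.strftime("%Y-%m-%d"),  # "2025-03-15"
    )
    text = page_text.casefold()
    # Word boundaries stop "March 1" from matching inside "March 15".
    return any(re.search(rf"\b{re.escape(c.casefold())}\b", text)
               for c in candidates)


print(date_is_displayed("2025-03-15T20:00:00", "Saturday, March 15"))  # True
print(date_is_displayed("2025-03-15T20:00:00", "Doors open at 8 pm"))  # False
```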

Practical impact and recommendations

How to audit the compliance of your current structured data?

Start by extracting all your JSON-LD, Microdata, or RDFa schemas with a crawler (Screaming Frog, Oncrawl, or even a Python BeautifulSoup script). Compile a comprehensive list of the properties used: prices, ratings, authors, dates, availability, descriptions.
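As a starting point for the extraction step, here is a minimal sketch that pulls JSON-LD blocks out of a page and flattens them into a property list. It swaps BeautifulSoup for the standard library's `html.parser` so it runs without dependencies, it covers JSON-LD only (Microdata and RDFa would need separate handling), and the names `JsonLdExtractor`, `flatten`, and `extract_properties` are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed content of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self.in_jsonld = True
            self.buffer = []

    def handle_data(self, data):
        if self.in_jsonld:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_jsonld:
            self.in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self.buffer)))
            except json.JSONDecodeError:
                pass  # malformed block: skipped here, worth logging in a real audit


def flatten(node, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf value of a JSON-LD object."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from flatten(item, f"{prefix}[{index}]")
    else:
        yield prefix, node


def extract_properties(html: str) -> list:
    """Return one flat {path: value} dict per JSON-LD block found in the page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [dict(flatten(block)) for block in parser.blocks]


page = """<script type="application/ld+json">
{"@type": "Product", "name": "Blue Mug",
 "offers": {"@type": "Offer", "price": "49.90", "priceCurrency": "EUR"}}
</script>"""
for path, value in extract_properties(page)[0].items():
    print(path, "=", value)
```

The flat paths (for example `offers.price`) make it easy to aggregate which properties are used per page type across a crawl.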

For each property, manually check its presence in the visible content on a representative sample of pages. Do not rely solely on the code: open the pages in private browsing, disable JavaScript if necessary, and ensure that the average user can find the marked information.

What corrections should be prioritized?

Focus first on the properties that directly feed into rich snippets: Product (price, availability, aggregateRating), Recipe (recipeYield, cookTime, aggregateRating), Event (startDate, location), Article (author, datePublished). These are the ones that trigger manual actions in cases of abuse.

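Next, correct inconsistencies in wording. If your page says 'From €99' and your schema declares price: 99, align the two: either change the display to an exact price or mark up a range, for example with lowPrice on an AggregateOffer (schema.org defines priceRange for LocalBusiness, not for Offer). If an author is collective ('Team X'), do not mark up a fictitious individual.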

How to prevent future deviations?

Build structured data validation into your editorial and technical workflow. Every new template, every schema change, and every new content type must pass a compliance checklist before deployment. Automate what you can: scripts that compare markup against the visible DOM, and alerts on detected discrepancies.
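A pre-deployment gate along these lines could look like the sketch below. It assumes you already have, for each template, a flat dictionary of marked-up values and the page's visible text. The `Finding` class, the `CRITICAL` set, and the error/warning split are all hypothetical choices for illustration.

```python
from dataclasses import dataclass

# Hypothetical set of SERP-critical leaf properties that should block a deploy.
CRITICAL = {"price", "ratingValue", "reviewCount"}


@dataclass
class Finding:
    url: str
    prop: str
    value: str
    severity: str  # "error" blocks deployment, "warning" is only reported


def audit_page(url: str, marked: dict, page_text: str) -> list:
    """List every marked-up value that cannot be found in the visible text."""
    text = page_text.casefold()
    findings = []
    for prop, value in marked.items():
        if str(value).casefold() in text:
            continue  # value is displayed as-is: nothing to report
        leaf = prop.rsplit(".", 1)[-1]
        severity = "error" if leaf in CRITICAL else "warning"
        findings.append(Finding(url, prop, str(value), severity))
    return findings


def should_block_deploy(findings: list) -> bool:
    return any(f.severity == "error" for f in findings)


findings = audit_page(
    "https://example.com/mug",
    {"name": "Blue Mug", "offers.price": "49.90",
     "aggregateRating.ratingValue": "4.8", "brand": "Acme"},
    "Blue Mug Price: €49.90",
)
for f in findings:
    print(f.severity, f.prop, f.value)
print(should_block_deploy(findings))  # True: the rating is marked up but not shown
```

Enumerated values such as availability URLs never appear verbatim on a page, so a real check would need a mapping from schema values to display labels before comparing.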

Train your editorial and technical teams on the philosophy of structured data: reflect, do not invent. A writer who forgets to display a price should not compensate by tucking it into the markup alone. A developer implementing a schema must systematically check the source of each property.

These optimizations require close coordination between editorial, technical, and SEO teams. If your organization lacks the internal resources to audit, correct, and monitor structured data compliance at scale, support from a specialized SEO agency can speed up compliance and help protect your rich snippets.

Crawl all present schemas and list properties by page type
Manually check the visibility of each critical property (price, ratings, authors, dates)
Remove or correct properties that are invisible, approximate, or calculated but not displayed
Test modified pages in Google’s rich results testing tool
Set up Search Console monitoring to detect manual action alerts
Document internal markup rules to harmonize practices between teams

Google does not negotiate on the match between markup and visible content. Any discrepancy exposes you to losing eligibility for rich snippets or facing manual action. Audit and correction must prioritize SERP-critical properties, and continuous monitoring becomes essential to prevent regressions.

❓ Frequently Asked Questions

Can you mark up information that appears only in a downloadable PDF linked from the page?

No. Structured data must match the content directly visible on the HTML page. An external PDF does not count as content accessible to the user in the context of the marked-up page.

If a price appears only after clicking a 'See price' button, can you mark it up in the schema?

It's risky. If the price is not visible when the page first loads, Google may consider that it is not directly accessible. Either display the price by default or leave it unmarked.

Are manual actions for misleading structured data reversible?

Yes, but they require a reconsideration request in Search Console once everything has been corrected. Google reviews the request manually, and the process can take from a few days to several weeks. Reinstatement is not guaranteed.

Can you mark up a calculated average rating if the individual reviews are visible but the average itself is not displayed?

Technically delicate. If the individual reviews are present and the average can be calculated, many practitioners mark it up. But Google may require the average to be displayed explicitly to avoid any ambiguity.

How does Google detect inconsistencies between schemas and visible content?

Through a combination of DOM/schema parsing algorithms and manual reviews triggered by reports or random audits. Blatant gaps (missing price, invented ratings) are detected automatically. Finer nuances often go unnoticed unless reported.
