Official statement
Google requires the noindex attribute to be written without spaces to be recognized. A syntax error as minor as a space turns this attribute into an invalid string, rendering the directive completely ineffective. The engine will index pages you thought were explicitly excluded, with all the implications this has for your crawl budget and your rankings.
What you need to understand
What exact syntax does Google expect for noindex?
The noindex directive must be written as a single word, without spaces, whether in a meta tag (<meta name="robots" content="noindex">) or in an HTTP header (X-Robots-Tag: noindex). Google parses this value strictly: any typographical variation makes it unrecognized.
This syntactical requirement also applies to directive combinations. If you want to combine noindex and nofollow, write noindex, nofollow with a comma and a space between the two directives, but never include a space within the word itself. A no index with a space will be ignored, and the page will be crawled and indexed normally.
Why does this error go unnoticed in production?
HTML remains technically valid even with a space in the attribute's value. No validator will flag the anomaly, as the HTML syntax itself is correct. The issue lies with the semantics of the directive, not the structure of the code.
Search Console sometimes shows the generic status "Excluded by the noindex tag," but it does not tell you when a malformed directive has been ignored. If a page still appears in the index despite a noindex tag you believe is working, a syntax error is often what prevents the directive from being interpreted.
What other robot attributes are sensitive to this type of error?
All attributes in the robots family follow the same logic: nofollow, noarchive, nosnippet, noimageindex, unavailable_after. Each must be written as a single word, without spaces or intermediate hyphens, for Googlebot to recognize them.
Combinations are still possible: noindex, nofollow, noarchive works perfectly. But as soon as one of the words contains a space (no follow), that directive becomes void. The rest of the string remains valid; only the corrupted element is ignored.
- The noindex syntax must be written without spaces as a single word.
- A space turns the directive into a string not recognized by Google.
- The error remains invisible in HTML but causes unwanted indexing.
- All robot attributes (nofollow, noarchive, nosnippet) follow the same strict rule.
- HTML validators do not detect this semantic error.
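The rule summarized above can be modeled with a short Python sketch. It is an illustration only, not Google's actual parser: the function name parse_robots_content and the KNOWN_DIRECTIVES set are assumptions made for this example, and the set covers only the simple directives named in this article (value-carrying rules such as unavailable_after are left out).

```python
# Simple robots directives named in this article (illustrative, not exhaustive).
KNOWN_DIRECTIVES = {"noindex", "nofollow", "noarchive", "nosnippet", "noimageindex"}


def parse_robots_content(content: str) -> tuple[list[str], list[str]]:
    """Split a robots content value on commas and sort the tokens into
    (recognized, ignored). Comparison is case-insensitive."""
    recognized, ignored = [], []
    for raw in content.split(","):
        token = raw.strip().lower()
        if not token:
            continue  # skip empty pieces such as a trailing comma
        if token in KNOWN_DIRECTIVES:
            recognized.append(token)
        else:
            ignored.append(token)
    return recognized, ignored


print(parse_robots_content("noindex, nofollow"))
# (['noindex', 'nofollow'], [])
print(parse_robots_content("noindex, no follow, noarchive"))
# (['noindex', 'noarchive'], ['no follow'])
print(parse_robots_content("No Index"))
# ([], ['no index'])
```

In this model the malformed token is dropped while its valid neighbors survive, and a page whose only directive is "no index" ends up with no directive at all.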
SEO Expert opinion
Is this statement really aligned with field practices?
Absolutely. Site audits regularly reveal incorrectly indexed pages caused by micro-syntax variations in robots directives. The "no index" with a space is among the most common mistakes, often introduced by a misconfigured CMS or by templates copied from unreliable sources.
What is surprising is that Google does not send any explicit error signal in the Search Console for this type of problem. You won't see "misformatted directive"; just the absence of effect. The page remains indexed, and you have to diagnose the root cause by inspecting the source code.
What nuances should be added to this rule?
Case is not significant. noindex, NOINDEX, and NoIndex are all accepted by Google. However, a space, hyphen, or underscore inserted into the word (no index, no-index, no_index) breaks recognition.
Some SEOs test variations like no-index with a hyphen, thinking it might work. It does not work. Google strictly expects noindex, without any separators. [To be verified] on alternative engines (Bing, Yandex): syntactical tolerance may vary, but it's better to apply the strictest standard to avoid any ambiguity.
In what cases can this rule cause problems in production?
Template systems (Twig, Jinja, Blade) or dynamic meta tag generators can introduce stray spaces if the directive is assembled from several variables or whitespace around them is not controlled. A junior developer concatenating "no" + " " + "index" thinking they are building the directive will create a silent bug.
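One way to guard against this kind of concatenation bug is to build the tag from whole tokens checked against an allowlist. The Python sketch below is a minimal illustration; build_robots_meta and ALLOWED are hypothetical names, not part of any CMS or template engine.

```python
# Whole-word directives this helper accepts (assumed list for the example).
ALLOWED = ("noindex", "nofollow", "noarchive", "nosnippet", "noimageindex")


def build_robots_meta(directives):
    """Return a <meta name="robots"> tag built from whole directive tokens.

    Raises ValueError for anything that is not a known single-word token,
    so a value like "no index" fails loudly instead of shipping silently.
    """
    cleaned = []
    for directive in directives:
        token = directive.strip().lower()
        if token not in ALLOWED:
            raise ValueError(f"unrecognized robots directive: {directive!r}")
        if token not in cleaned:  # drop duplicates, keep order
            cleaned.append(token)
    content = ", ".join(cleaned)
    return f'<meta name="robots" content="{content}">'


print(build_robots_meta(["noindex", "nofollow"]))
# <meta name="robots" content="noindex, nofollow">
```

Calling build_robots_meta(["no" + " " + "index"]) raises ValueError, which turns the silent bug into a visible failure at build or test time.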
WordPress SEO plugins (Yoast, Rank Math, All in One SEO) normally handle the correct syntax, but third-party extensions or custom themes can inject sloppy HTML code. Always verify the final rendering in the DOM, not just in the admin interface.
Practical impact and recommendations
What should you do concretely to avoid this error?
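Integrate a systematic quality control of the HTML source code into your deployment pipeline. A simple regex such as /content="[^"]*\bno[\s_-]+(index|follow|archive)\b[^"]*"/i can flag directives split by a space, hyphen, or underscore before they reach production, without flagging the legitimate space that follows a comma in noindex, nofollow.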
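As a sketch of such a pipeline check in Python, the pattern below targets a separator inserted inside the directive word itself. The names BROKEN_DIRECTIVE and has_broken_directive are assumptions for this example, and the pattern inspects any content="..." attribute rather than only robots meta tags, so expect occasional false positives on other tags.

```python
import re

# Matches a content value where whitespace, a hyphen, or an underscore was
# inserted inside a directive word, e.g. "no index" or "no-follow".
# The space after a comma in "noindex, nofollow" does not match.
BROKEN_DIRECTIVE = re.compile(
    r'content="[^"]*\bno[\s_-]+(?:index|follow|archive|snippet|imageindex)\b[^"]*"',
    re.IGNORECASE,
)


def has_broken_directive(html: str) -> bool:
    """Return True if the HTML contains a directive split by a separator."""
    return BROKEN_DIRECTIVE.search(html) is not None


print(has_broken_directive('<meta name="robots" content="no index">'))
# True
print(has_broken_directive('<meta name="robots" content="noindex, nofollow">'))
# False
```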
For dynamic sites, create unit tests that verify your meta robots generation functions produce strings without spaces. A Jest or PHPUnit test takes 5 minutes to write and saves you hours of debugging in production.
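The same idea in Python's built-in unittest module might look like the sketch below. The robots_content function is a hypothetical stand-in for your own generator; replace it with an import of the real function.

```python
import unittest


def robots_content(noindex: bool, nofollow: bool) -> str:
    """Hypothetical stand-in for a site's own meta robots generator."""
    parts = []
    if noindex:
        parts.append("noindex")
    if nofollow:
        parts.append("nofollow")
    return ", ".join(parts) if parts else "all"


class RobotsContentTest(unittest.TestCase):
    def test_tokens_contain_no_inner_separator(self):
        # Every comma-separated token must be one lowercase word.
        for noindex in (True, False):
            for nofollow in (True, False):
                value = robots_content(noindex, nofollow)
                for token in value.split(","):
                    self.assertRegex(token.strip(), r"^[a-z]+$")

    def test_noindex_is_emitted_as_one_word(self):
        self.assertEqual(robots_content(True, False), "noindex")
```

Saved in a test file, this can be run with python -m unittest. The first test fails as soon as a generator starts emitting "no index" or "no-index".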
How can I verify that my site is compliant?
Use Google Search Console to list all pages marked "Excluded by the noindex tag." Compare this list with your sitemap or database: if pages you intended to exclude are missing from the report or still appear in the index, inspect their source code.
A crawl with Screaming Frog in "Custom Extraction" mode can extract the exact content of the meta robots tags. Export the column, filter for values containing a space, and correct in bulk. For large sites, a Python script with BeautifulSoup does the job in a few lines.
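A dependency-free variant of such a script is sketched below. It uses Python's standard html.parser module instead of BeautifulSoup so that it runs without installing anything; the class and function names are assumptions for this example. Tokens containing a colon (value-carrying rules such as max-snippet:-1) are deliberately skipped, since they are outside the scope of this check.

```python
from html.parser import HTMLParser


class RobotsMetaExtractor(HTMLParser):
    """Collect the content value of every robots or googlebot meta tag."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            self.values.append(attr.get("content") or "")


def suspicious_tokens(html: str) -> list[str]:
    """Return directive tokens that contain whitespace, a hyphen, or an underscore."""
    parser = RobotsMetaExtractor()
    parser.feed(html)
    flagged = []
    for value in parser.values:
        for token in value.split(","):
            token = token.strip()
            if ":" in token:
                continue  # value-carrying rule, not checked here
            if any(ch.isspace() or ch in "-_" for ch in token):
                flagged.append(token)
    return flagged


page = '<head><meta name="robots" content="noindex, no follow"></head>'
print(suspicious_tokens(page))
# ['no follow']
```

Fed with the HTML of each crawled URL, the function returns an empty list for clean pages, so any non-empty result is a page to fix.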
What errors should I absolutely avoid?
Do not trust the WYSIWYG interface of a CMS for managing robots directives. These editors often hide the actual code and can introduce invisible characters (non-breaking spaces, tabs) that you will never see on screen.
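Invisible characters can be made visible by naming every character that is not a plain ASCII letter, comma, or regular space. The Python sketch below does this with the standard unicodedata module; reveal_hidden_characters is a hypothetical helper name. It targets invisible or unexpected characters only, so a regular space inside a word is not reported here.

```python
import unicodedata


def reveal_hidden_characters(value: str) -> list[tuple[int, str]]:
    """List (position, Unicode name) for every character that is not a
    plain ASCII letter, a comma, or a regular space."""
    findings = []
    for position, ch in enumerate(value):
        if ch.isascii() and (ch.isalpha() or ch in ", "):
            continue
        # Control characters such as tab have no Unicode name; fall back
        # to the code point.
        findings.append((position, unicodedata.name(ch, f"U+{ord(ch):04X}")))
    return findings


print(reveal_hidden_characters("noindex, nofollow"))
# []
print(reveal_hidden_characters("no\u00a0index"))
# [(2, 'NO-BREAK SPACE')]
```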
Avoid copying and pasting code blocks from forums or unofficial documentation. Code examples found on Stack Overflow or in tutorials may contain subtle syntax errors. Always refer to the official Google documentation or write your tags manually.
- Always write noindex as a single word, without spaces or hyphens.
- Test the final HTML rendering in the browser (DOM inspection) after deployment.
- Create automated tests to validate the syntax of robots directives.
- Regularly compare Google's index with your list of pages to exclude.
- Use a crawl with custom extraction to audit meta robots values in bulk.
- Train your developers on the exact specifications of robots attributes.