Official statement
Google acknowledges that links on noindex pages can be crawled temporarily, especially if those URLs receive many internal links. Noindex therefore does not guarantee that a page is fully isolated from the crawl. For SEO, this means you cannot rely on noindex alone to prevent crawl budget waste: you need to combine multiple signals and monitor your server logs.
What you need to understand
Does noindex really stop Google from crawling links on a page?
Mueller's answer is clear: no, noindex does not block crawling of links. A noindex page will not be indexed, but Googlebot can still explore the URLs it contains.
This behavior is particularly pronounced when the noindex page receives a lot of internal links. Google interprets these linking signals as an indicator of importance: even if you tell it not to index the page, it considers that the links it contains might be worth discovering and following.
Why does Google still explore these links?
Google seeks to discover new URLs and keep its index updated. If a noindex page points to potentially indexable content, Googlebot will follow these links out of caution.
The fact that a page is noindexed does not mean it is without value for the link graph. Google will temporarily crawl these URLs to check if they lead to indexable content, then adjust its behavior based on what it finds.
What does 'temporarily' mean in this context?
Mueller does not provide a precise definition. We can interpret 'temporarily' as an initial or sporadic crawl phase, not a recurrent and intensive crawl.
In practice, this means that Google may discover and visit these URLs during the initial phases of crawling, then reduce their frequency if it finds that they do not lead to anything indexable. But there is no formal guarantee that these URLs will never be crawled.
- Noindex blocks indexing, not the crawling of outgoing links.
- Noindex pages with many internal links may trigger temporary crawling of these links.
- Google never guarantees that a link will not be followed, even on a noindex page.
- 'Temporarily' remains vague: no precise duration is communicated.
- To truly isolate a section, you must combine noindex, robots.txt, and possibly nofollow.
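As a sketch of how these signals can be combined (the /internal-search/ path is a hypothetical example; adapt it to your own site), a section you want kept out of both the index and the crawl could be configured like this:

```text
# robots.txt — Disallow stops Googlebot from fetching the section at all.
# Caveat (see below): a page blocked here can never serve its noindex tag
# to Google, so use Disallow only when you want no fetch whatsoever.
User-agent: *
Disallow: /internal-search/

# On pages Google may fetch but must not index, use the meta robots tag:
#   <meta name="robots" content="noindex">
# and, optionally, also discourage link following:
#   <meta name="robots" content="noindex, nofollow">
```

The two mechanisms are complementary, not interchangeable: robots.txt controls fetching, while the meta tag controls indexing and link handling.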
SEO Expert opinion
Is this statement consistent with field observations?
Yes, totally. Server logs regularly show that Googlebot visits URLs pointed to from noindex pages, especially when these pages are heavily linked.
We even observe that some noindex pages consume crawl budget if they have many internal links. Google visits them, notes the noindex, then still explores some of the outgoing links. This is not a bug; it is documented behavior — but rarely explained this clearly by Google.
What nuances should be added to this statement?
Mueller says 'temporarily,' but specifies neither the frequency nor the duration. Verify this on your own sites via logs: some noindex pages keep being crawled for weeks, others for just a few days.
Another critical point: Mueller mentions 'a high number of internal references.' What threshold triggers this behavior? No figure is given; the boundary remains unclear. In practice, a page with 50+ internal links seems to be systematically affected, but below that, the behavior is less predictable.
In what cases does this rule not apply?
If a noindex page is also blocked by robots.txt, Googlebot will not even be able to load it to extract the links. In this case, the links will never be followed.
Similarly, if you add rel='nofollow' to all links on a noindex page, Google should in theory ignore those links, but again, this is not 100% guaranteed. Nofollow is treated as a hint, not an absolute directive.
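For reference, nofollow can be declared page-wide or per link; both forms below are standard markup (the /faceted-filter/ URL is a hypothetical example), though, as noted above, Google treats nofollow as a hint rather than a binding rule:

```html
<!-- Page-wide: keep the page out of the index and discourage
     following any of its links -->
<meta name="robots" content="noindex, nofollow">

<!-- Per-link alternative, useful when only some links should be ignored -->
<a href="/faceted-filter/" rel="nofollow">Filter results</a>
```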
Practical impact and recommendations
What practical steps should be taken to control the crawling of noindex pages?
First action: audit your server logs to identify which noindex pages are still being visited by Googlebot and how often. If you see entire sections being crawled despite the noindex, that's a red flag.
Next, adjust your strategy: if a noindex page should never be crawled, add it to robots.txt in Disallow. If you just want it to not be indexed but its links to be occasionally followed, the noindex alone is sufficient.
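As a sketch of the log audit described above, a minimal Python pass over an access log could flag Googlebot hits on known noindex URLs. The URL list and sample lines are hypothetical, and the regex assumes the common Apache/Nginx combined log format; real logs vary by server configuration:

```python
import re
from collections import Counter

# Hypothetical list of URLs you have marked noindex on your site.
NOINDEX_URLS = {"/filters/", "/search-results/", "/print/"}

# Loose pattern for the combined log format: we only need the request
# path and the user agent string for this check.
LOG_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+".*"(?P<ua>[^"]*)"$')

def googlebot_hits_on_noindex(log_lines):
    """Count Googlebot requests that land on known noindex URLs."""
    hits = Counter()
    for line in log_lines:
        m = LOG_RE.search(line)
        if not m:
            continue
        if "Googlebot" in m.group("ua") and m.group("path") in NOINDEX_URLS:
            hits[m.group("path")] += 1
    return hits

sample = [
    '66.249.66.1 - - [02/May/2019:10:00:00 +0000] "GET /filters/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [02/May/2019:10:00:05 +0000] "GET /blog/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [02/May/2019:10:00:07 +0000] "GET /filters/ HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
# Only the first line is a Googlebot hit on a noindex URL.
print(googlebot_hits_on_noindex(sample))
```

Run over real logs, any noindex URL with a persistently high count is a candidate for a robots.txt Disallow or for pruning internal links.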
What mistakes should be absolutely avoided?
Classic mistake: putting noindex on pages that are critical to your internal linking (categories, thematic hubs) in the hope of saving crawl budget. Result: Google crawls them anyway, and you lose the visibility of these pages.
Another trap: combining noindex and nofollow on pages that should pass PageRank to target pages. You block indexing AND the passing of link juice, which breaks your SEO architecture.
How to check if your site complies with this logic?
Use Google Search Console to spot URLs that are crawled but not indexed. If you see noindex pages with many internal links, verify in the logs that they are not consuming too much crawl budget.
Compare the crawl volume on these pages before and after optimizing the internal linking. If you reduce internal links to a noindex page, the crawl of its outgoing links should also decrease — but this is not instantaneous.
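The before/after comparison above can be sketched in Python by splitting Googlebot hits on a URL around the date of the internal-linking change. The cutoff date, path, and log lines are hypothetical; the timestamp format is the standard Apache/Nginx one:

```python
import re
from datetime import datetime, timezone

# Loose pattern for the combined log format: we need the timestamp, the
# request path, and whether the user agent mentions Googlebot.
LOG_RE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+".*Googlebot'
)

def crawl_volume_around(log_lines, path, cutoff):
    """Count Googlebot hits on `path` before and after a cutoff datetime."""
    before = after = 0
    for line in log_lines:
        m = LOG_RE.search(line)
        if not m or m.group("path") != path:
            continue
        ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        if ts < cutoff:
            before += 1
        else:
            after += 1
    return before, after

sample = [
    '66.249.66.1 - - [01/May/2019:09:00:00 +0000] "GET /filters/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '66.249.66.1 - - [03/May/2019:09:00:00 +0000] "GET /filters/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '203.0.113.9 - - [03/May/2019:09:00:00 +0000] "GET /filters/ HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
# Hypothetical date of the internal-linking change.
cutoff = datetime(2019, 5, 2, tzinfo=timezone.utc)
print(crawl_volume_around(sample, "/filters/", cutoff))  # → (1, 1)
```

As the article notes, expect a lag: crawl frequency adjusts gradually, so compare windows of several weeks rather than individual days.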
- Audit server logs to detect noindex pages still being crawled.
- Combine noindex + robots.txt for sections truly forbidden from crawling.
- Do not put critical pages for internal linking in noindex.
- Check in Search Console the URLs that are crawled but not indexed.
- Test the impact of internal linking on the crawl frequency of links.
- Avoid combining noindex + nofollow on strategic intermediate pages.
❓ Frequently Asked Questions
Does noindex block the crawling of links on a page?
How long does Google crawl the links of a noindex page?
How do you completely block the crawling of a page and its links?
Should you add nofollow to the links of a noindex page?
Does a noindex page consume crawl budget?
SEO insights extracted from a Google Search Central video · duration 55 min · published on 02/05/2019