Official statement
Google emphasizes the importance of using semantic HTML tags for links, with an href attribute pointing to a valid URL. Pseudo-protocols like javascript: are not followed by crawlers, hindering page discovery. Essentially, any link without a valid href means a direct loss of crawl budget and a barrier for your internal linking.
What you need to understand
What Is a Pseudo-Protocol and Why Does It Block Crawling?
A pseudo-protocol like javascript: is not a URL in the strict sense. It’s a client-side instruction that triggers a JavaScript action. When Googlebot encounters a link formatted as <a href='javascript:openPage()'>, it cannot interpret it as a crawlable destination.
The reason? Standard HTTP crawlers only read href attributes containing valid URLs (http://, https://, relative paths). Anything that requires JavaScript execution — even if a modern browser can handle it — represents a layer of abstraction that Googlebot ignores on the first pass. The result: the target page is never discovered via this link, it receives no internal PageRank, and it remains orphaned in the eyes of the engine.
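To make this concrete, here is a minimal sketch (an illustration of the principle, not Googlebot's actual logic) of a crawler-style filter that keeps only hrefs resolving to http or https URLs and drops pseudo-protocols such as javascript: or mailto:

```typescript
// Minimal sketch of a crawler-style href filter (an illustration, not Googlebot's code).
// Only hrefs that resolve to http(s) URLs (absolute or relative) are kept for crawling.
function extractCrawlableHrefs(html: string, baseUrl: string): string[] {
  const hrefs = [...html.matchAll(/<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi)]
    .map((match) => match[1]);

  return hrefs.filter((href) => {
    try {
      const url = new URL(href, baseUrl); // resolves relative paths against the page URL
      return url.protocol === "http:" || url.protocol === "https:"; // drops javascript:, mailto:, tel:, etc.
    } catch {
      return false; // malformed href: not a crawlable destination either
    }
  });
}

// Example: only the relative link survives the filter.
const sample = `<a href="javascript:openPage()">Open</a><a href="/blog/article">Blog</a>`;
console.log(extractCrawlableHrefs(sample, "https://example.com"));
// -> [ "https://example.com/blog/article" ]
```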
Does This Limitation Only Apply to Googlebot or All Crawlers?
All major crawlers (Bing, Yandex, even specialized bots like Semrush or Ahrefs) share this constraint. They traverse the HTML link graph — not the JavaScript event graph. Even though Google occasionally executes JS via its rendering engine, it does so in two stages: HTML crawl, then delayed rendering.
If your link is not present in the initial HTML, it will not be discovered during the crawl phase. It might surface after rendering, but that is unpredictable and resource-heavy. Put simply, you are betting on an uncertain second pass rather than on guaranteed discovery.
Why Do Some Sites Still Use JavaScript Links?
Technical legacy, primarily. Older libraries and frameworks (jQuery, AngularJS 1.x) often generated links via onclick event handlers instead of hrefs. Some developers also used the pattern deliberately to keep crawlers away from premium content or pagination pages they did not want indexed.
However, this approach creates more problems than it solves. You lose control over your internal linking, you fragment your architecture, and you force Google to guess your intentions. In short, it’s a technical debt that costs dearly in organic visibility.
- Crawlers only follow valid HTML hrefs — not pseudo-protocols like javascript:
- Links without href break internal linking and prevent PageRank transmission
- Delayed JavaScript rendering does not compensate for a faulty HTML architecture
- All bots (Google, Bing, Ahrefs, Semrush) share this limitation
- Using pseudo-protocols is an SEO technical debt inherited from outdated practices
SEO Expert opinion
Is This Recommendation Consistent with Field Observations?
Absolutely. We still see sites — particularly poorly configured SPAs (Single Page Applications) — where hundreds of internal links are not crawlable. Screaming Frog or Oncrawl audits regularly reveal incomplete link graphs, with orphan pages existing in the XML sitemap but not connected by any internal links.
The issue is that many front-end frameworks (React, Vue) automatically generate JavaScript routers without HTML fallback. If you don't configure server-side rendering (SSR) or proper hydration, your links remain invisible to Googlebot at the time of the initial crawl. And waiting for a second pass is hoping that Google will allocate extra crawl budget — which doesn’t always happen.
What Nuances Should Be Added to This Statement?
Google can execute JavaScript and discover dynamically injected links. But it remains a costly, slow, and unguaranteed process. JS rendering occurs in a separate queue, often days after the initial HTML crawl. If your crawl budget is tight (large site, deep pages), these JS links might simply never be processed.
Another point: even when Google renders the JS, it does not guarantee that all events will trigger. A link hidden behind an onclick without an href can easily go unnoticed if the event isn’t simulated. In short, betting on JS rendering to compensate for shaky HTML is like playing Russian roulette with your indexing.
[To Be Verified] Google has never published clear data on the success rate of JS rendering or the average time between HTML crawl and rendering. We only know that it exists, but not how reliable it is for a typical site.
When Does This Rule Not Apply?
If you are managing a pure web application (for example a SaaS product behind a login) where public indexing is not a goal, you can afford purely JavaScript links. However, as soon as there is an SEO stake — blog, product pages, landing pages — it is non-negotiable: every link must have a valid HTML href.
Even for SSR or pre-rendered sites, it is essential to ensure that the final HTML contains <a href='...'>. Never assume that "the framework takes care of it" — test with curl or a headless crawler to see what Googlebot actually receives.
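As an example, here is a minimal sketch of that raw-HTML check, assuming Node 18+ for the built-in fetch; the URL and path used below are placeholders:

```typescript
// Minimal sketch: fetch the raw HTML (no JS execution, like curl) and check
// whether an expected link is present in the initial markup.
// Assumes Node 18+ for the built-in fetch; the URL and path are hypothetical.
async function isLinkInRawHtml(pageUrl: string, expectedHref: string): Promise<boolean> {
  const response = await fetch(pageUrl, {
    headers: { "User-Agent": "Mozilla/5.0 (compatible; crawl-check/1.0)" },
  });
  const html = await response.text();

  // Look for an <a> tag whose href matches the expected path.
  // Note: paths containing regex metacharacters would need escaping.
  const pattern = new RegExp(`<a\\b[^>]*\\bhref\\s*=\\s*["']${expectedHref}["']`, "i");
  return pattern.test(html);
}

isLinkInRawHtml("https://example.com/", "/pricing").then((found) => {
  console.log(found
    ? "Link present in initial HTML"
    : "Link only injected by JS (not crawlable on first pass)");
});
```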
Practical impact and recommendations
What Should You Audit First on Your Site?
Start with a complete crawl using Screaming Frog or Oncrawl in "Spider" mode. Export all links and filter those whose href starts with javascript:, # (empty anchor), or that have no href attribute at all. These links break your internal linking and block the PageRank flow.
Next, compare the number of pages discovered by the crawler with the total number of pages in your XML sitemap or CMS. A significant gap signals orphan pages — often caused by non-crawlable JavaScript links. Identify them and create clean HTML paths to connect them to your structure.
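A minimal sketch of that comparison, assuming you have exported both URL lists as plain text files with one URL per line (the file names below are hypothetical):

```typescript
// Minimal sketch: diff the URLs declared in the XML sitemap against the URLs the
// crawler actually discovered. URLs present only in the sitemap are orphan candidates.
// File names are hypothetical exports (one URL per line).
import { readFileSync } from "node:fs";

function loadUrlList(path: string): Set<string> {
  return new Set(
    readFileSync(path, "utf8")
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
  );
}

const sitemapUrls = loadUrlList("sitemap-urls.txt");  // exported from the XML sitemap
const crawledUrls = loadUrlList("crawled-urls.txt");  // exported from Screaming Frog / Oncrawl

const orphanCandidates = [...sitemapUrls].filter((url) => !crawledUrls.has(url));

console.log(`${orphanCandidates.length} page(s) in the sitemap but never reached by the crawl:`);
orphanCandidates.forEach((url) => console.log(url));
```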
How to Fix Existing JavaScript Links?
The simplest solution: replace your JavaScript links with standard HTML hrefs. If you need to maintain a JS interaction (animation, tracking), keep the href and add an event handler that prevents the default behavior (event.preventDefault()) while executing your logic. This way, the link remains crawlable even if the JS doesn’t load.
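A minimal sketch of that pattern, assuming the markup already contains a crawlable anchor such as <a id="pricing-link" href="/pricing">Pricing</a>; the handler names below are placeholders for your own logic:

```typescript
// Minimal sketch of the pattern described above: the markup keeps a real href
// (e.g. <a id="pricing-link" href="/pricing">Pricing</a>), so the link stays
// crawlable, and JavaScript only layers behavior on top of it.
// trackClick and openInApp are placeholder names for your own logic.
function trackClick(label: string): void {
  console.log(`tracking: ${label}`); // stand-in for an analytics call
}

function openInApp(path: string): void {
  history.pushState({}, "", path);   // stand-in for SPA-style navigation
}

const link = document.querySelector<HTMLAnchorElement>("#pricing-link");

link?.addEventListener("click", (event) => {
  event.preventDefault();            // skip the default full page load
  trackClick("pricing-link");        // run the custom logic
  openInApp("/pricing");             // the href stays in the HTML for crawlers
});
```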
For SPAs, configure server-side rendering (SSR) or static pre-rendering. Next.js, Nuxt.js, or services like Prerender.io generate static HTML that Googlebot can crawl directly. Always test with the URL Inspection tool in Search Console to ensure Google sees your links correctly.
What Tools to Use to Validate Link Compliance?
The URL Inspection tool in the Search Console shows you exactly what Googlebot sees. Compare the rendered version with the raw HTML: if your links only appear in the rendered version, it means they rely on JS and that you have a crawlability issue.
Additionally, use headless browsers driven by Puppeteer or Playwright to simulate Googlebot's behavior with and without JS. This lets you detect discrepancies and prioritize fixes. Automate these tests in CI/CD to avoid regressions after each deployment.
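For instance, here is a hedged sketch using Puppeteer (installed with npm install puppeteer; the URL below is a placeholder) that counts the crawlable links found with JavaScript disabled versus enabled:

```typescript
// Minimal sketch with Puppeteer: count the <a href> links present in the raw HTML
// versus after JavaScript execution. A large gap means part of your internal linking
// only exists after rendering. The URL is hypothetical.
import puppeteer from "puppeteer";

async function countLinks(url: string, jsEnabled: boolean): Promise<number> {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setJavaScriptEnabled(jsEnabled);
  await page.goto(url, { waitUntil: "networkidle0" });

  // Count anchors whose href is a real URL, not a javascript: pseudo-protocol.
  const count = await page.$$eval("a[href]", (anchors) =>
    anchors.filter((a) => !a.getAttribute("href")!.startsWith("javascript:")).length
  );

  await browser.close();
  return count;
}

(async () => {
  const url = "https://example.com/";
  const withoutJs = await countLinks(url, false);
  const withJs = await countLinks(url, true);
  console.log(`Links without JS: ${withoutJs} / with JS: ${withJs}`);
  if (withJs > withoutJs) {
    console.log("Some links only appear after rendering: fix the initial HTML.");
  }
})();
```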
- Crawl your site in Spider mode and identify all links without valid hrefs or with pseudo-protocols
- Compare the number of crawled pages with the number of pages in your XML sitemap
- Replace href='javascript:' with valid URLs and keep the JS as a complement if necessary
- Set up SSR or pre-rendering for SPAs (Next.js, Nuxt, Prerender.io)
- Always test using the URL Inspection tool in the Search Console
- Automate crawlability tests in CI/CD to prevent regressions
❓ Frequently Asked Questions
Can Googlebot follow a link with href="#"?
Do onclick links without an href pass PageRank?
Does Google's JavaScript rendering make up for a missing HTML href?
Can JavaScript links be used for private or non-indexable content?
How can I quickly check whether my links are crawlable?
🎥 From the same video
Other SEO insights extracted from this same Google Search Central video · duration 4 min · published on 29/04/2020