Official statement
Google emphasizes the importance of using semantic HTML tags for links, with an href attribute pointing to a valid URL. Pseudo-protocols like javascript: are not followed by crawlers, hindering page discovery. Essentially, any link without a valid href means a direct loss of crawl budget and a barrier for your internal linking.
What you need to understand
What Is a Pseudo-Protocol and Why Does It Block Crawling?
A pseudo-protocol like javascript: is not a URL in the strict sense. It’s a client-side instruction that triggers a JavaScript action. When Googlebot encounters a link formatted as <a href='javascript:openPage()'>, it cannot interpret it as a crawlable destination.
The reason? Standard HTTP crawlers only read href attributes containing valid URLs (http://, https://, relative paths). Anything that requires JavaScript execution — even if a modern browser can handle it — represents a layer of abstraction that Googlebot ignores on the first pass. The result: the target page is never discovered via this link, it receives no internal PageRank, and it remains orphaned in the eyes of the engine.
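To make this concrete, here is a minimal sketch (an illustration of the principle, not Googlebot's actual logic) of a crawler-style filter that keeps only hrefs resolving to http or https URLs and drops pseudo-protocols such as javascript: or mailto:

```typescript
// Minimal sketch of a crawler-style href filter (an illustration, not Googlebot's code).
// Only hrefs that resolve to http(s) URLs (absolute or relative) are kept for crawling.
function extractCrawlableHrefs(html: string, baseUrl: string): string[] {
  const hrefs = [...html.matchAll(/<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi)]
    .map((match) => match[1]);

  return hrefs.filter((href) => {
    try {
      const url = new URL(href, baseUrl); // resolves relative paths against the page URL
      return url.protocol === "http:" || url.protocol === "https:"; // drops javascript:, mailto:, tel:, etc.
    } catch {
      return false; // malformed href: not a crawlable destination either
    }
  });
}

// Example: only the relative link survives the filter.
const sample = `<a href="javascript:openPage()">Open</a><a href="/blog/article">Blog</a>`;
console.log(extractCrawlableHrefs(sample, "https://example.com"));
// -> [ "https://example.com/blog/article" ]
```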
Does This Limitation Only Apply to Googlebot or All Crawlers?
All major crawlers (Bing, Yandex, even specialized bots like Semrush or Ahrefs) share this constraint. They traverse the HTML link graph — not the JavaScript event graph. Even though Google occasionally executes JS via its rendering engine, it does so in two stages: HTML crawl, then delayed rendering.
If your link is not present in the initial HTML, it will not be discovered during the crawl phase. It might surface after rendering, but that is unpredictable and resource-heavy. Put simply, you are betting on an uncertain second pass rather than on guaranteed discovery.
Why Do Some Sites Still Use JavaScript Links?
Technical legacy, primarily. Older libraries and frameworks (jQuery, AngularJS 1.x) often generated links via onclick event handlers instead of hrefs. Some developers also used the pattern deliberately to keep crawlers away from premium content or pagination pages they did not want indexed.
However, this approach creates more problems than it solves. You lose control over your internal linking, you fragment your architecture, and you force Google to guess your intentions. In short, it’s a technical debt that costs dearly in organic visibility.
- Crawlers only follow valid HTML hrefs — not pseudo-protocols like javascript:
- Links without href break internal linking and prevent PageRank transmission
- Delayed JavaScript rendering does not compensate for a faulty HTML architecture
- All bots (Google, Bing, Ahrefs, Semrush) share this limitation
- Using pseudo-protocols is an SEO technical debt inherited from outdated practices
SEO Expert opinion
Is This Recommendation Consistent with Field Observations?
Absolutely. We still see sites — particularly poorly configured SPAs (Single Page Applications) — where hundreds of internal links are not crawlable. Screaming Frog or Oncrawl audits regularly reveal incomplete link graphs, with orphan pages existing in the XML sitemap but not connected by any internal links.
The issue is that many front-end frameworks (React, Vue) automatically generate JavaScript routers without HTML fallback. If you don't configure server-side rendering (SSR) or proper hydration, your links remain invisible to Googlebot at the time of the initial crawl. And waiting for a second pass is hoping that Google will allocate extra crawl budget — which doesn’t always happen.
What Nuances Should Be Added to This Statement?
Google can execute JavaScript and discover dynamically injected links. But it remains a costly, slow, and unguaranteed process. JS rendering occurs in a separate queue, often days after the initial HTML crawl. If your crawl budget is tight (large site, deep pages), these JS links might simply never be processed.
Another point: even when Google renders the JS, it does not guarantee that all events will trigger. A link hidden behind an onclick without an href can easily go unnoticed if the event isn’t simulated. In short, betting on JS rendering to compensate for shaky HTML is like playing Russian roulette with your indexing.
[To Be Verified] Google has never published clear data on the success rate of JS rendering or the average time between HTML crawl and rendering. We only know that it exists, but not how reliable it is for a typical site.
When Does This Rule Not Apply?
If you are managing a pure web application (for example a SaaS product behind a login) where public indexing is not a goal, you can afford purely JavaScript links. However, as soon as there is an SEO stake — blog, product pages, landing pages — it is non-negotiable: every link must have a valid HTML href.
Even for SSR or pre-rendered sites, it is essential to ensure that the final HTML contains <a href='...'>. Never assume that "the framework takes care of it" — test with curl or a headless crawler to see what Googlebot actually receives.
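As an example, here is a minimal sketch of that raw-HTML check, assuming Node 18+ for the built-in fetch; the URL and path used below are placeholders:

```typescript
// Minimal sketch: fetch the raw HTML (no JS execution, like curl) and check
// whether an expected link is present in the initial markup.
// Assumes Node 18+ for the built-in fetch; the URL and path are hypothetical.
async function isLinkInRawHtml(pageUrl: string, expectedHref: string): Promise<boolean> {
  const response = await fetch(pageUrl, {
    headers: { "User-Agent": "Mozilla/5.0 (compatible; crawl-check/1.0)" },
  });
  const html = await response.text();

  // Look for an <a> tag whose href matches the expected path.
  // Note: paths containing regex metacharacters would need escaping.
  const pattern = new RegExp(`<a\\b[^>]*\\bhref\\s*=\\s*["']${expectedHref}["']`, "i");
  return pattern.test(html);
}

isLinkInRawHtml("https://example.com/", "/pricing").then((found) => {
  console.log(found
    ? "Link present in initial HTML"
    : "Link only injected by JS (not crawlable on first pass)");
});
```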
Practical impact and recommendations
What Should You Audit First on Your Site?
Start with a complete crawl using Screaming Frog or Oncrawl in "Spider" mode. Export all links and filter those whose href starts with javascript:, # (empty anchor), or that have no href attribute at all. These links break your internal linking and block the PageRank flow.
Next, compare the number of pages discovered by the crawler with the total number of pages in your XML sitemap or CMS. A significant gap signals orphan pages — often caused by non-crawlable JavaScript links. Identify them and create clean HTML paths to connect them to your structure.
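A minimal sketch of that comparison, assuming you have exported both URL lists as plain text files with one URL per line (the file names below are hypothetical):

```typescript
// Minimal sketch: diff the URLs declared in the XML sitemap against the URLs the
// crawler actually discovered. URLs present only in the sitemap are orphan candidates.
// File names are hypothetical exports (one URL per line).
import { readFileSync } from "node:fs";

function loadUrlList(path: string): Set<string> {
  return new Set(
    readFileSync(path, "utf8")
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
  );
}

const sitemapUrls = loadUrlList("sitemap-urls.txt");  // exported from the XML sitemap
const crawledUrls = loadUrlList("crawled-urls.txt");  // exported from Screaming Frog / Oncrawl

const orphanCandidates = [...sitemapUrls].filter((url) => !crawledUrls.has(url));

console.log(`${orphanCandidates.length} page(s) in the sitemap but never reached by the crawl:`);
orphanCandidates.forEach((url) => console.log(url));
```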
How to Fix Existing JavaScript Links?
The simplest solution: replace your JavaScript links with standard HTML hrefs. If you need to maintain a JS interaction (animation, tracking), keep the href and add an event handler that prevents the default behavior (event.preventDefault()) while executing your logic. This way, the link remains crawlable even if the JS doesn’t load.
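A minimal sketch of that pattern, assuming the markup already contains a crawlable anchor such as <a id="pricing-link" href="/pricing">Pricing</a>; the handler names below are placeholders for your own logic:

```typescript
// Minimal sketch of the pattern described above: the markup keeps a real href
// (e.g. <a id="pricing-link" href="/pricing">Pricing</a>), so the link stays
// crawlable, and JavaScript only layers behavior on top of it.
// trackClick and openInApp are placeholder names for your own logic.
function trackClick(label: string): void {
  console.log(`tracking: ${label}`); // stand-in for an analytics call
}

function openInApp(path: string): void {
  history.pushState({}, "", path);   // stand-in for SPA-style navigation
}

const link = document.querySelector<HTMLAnchorElement>("#pricing-link");

link?.addEventListener("click", (event) => {
  event.preventDefault();            // skip the default full page load
  trackClick("pricing-link");        // run the custom logic
  openInApp("/pricing");             // the href stays in the HTML for crawlers
});
```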
For SPAs, configure server-side rendering (SSR) or static pre-rendering. Next.js, Nuxt.js, or services like Prerender.io generate static HTML that Googlebot can crawl directly. Always test with the URL Inspection tool in Search Console to ensure Google sees your links correctly.
What Tools to Use to Validate Link Compliance?
The URL Inspection tool in the Search Console shows you exactly what Googlebot sees. Compare the rendered version with the raw HTML: if your links only appear in the rendered version, it means they rely on JS and that you have a crawlability issue.
Additionally, use headless browsers driven by Puppeteer or Playwright to simulate Googlebot's behavior with and without JS. This lets you detect discrepancies and prioritize fixes. Automate these tests in CI/CD to avoid regressions after each deployment.
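For instance, here is a hedged sketch using Puppeteer (installed with npm install puppeteer; the URL below is a placeholder) that counts the crawlable links found with JavaScript disabled versus enabled:

```typescript
// Minimal sketch with Puppeteer: count the <a href> links present in the raw HTML
// versus after JavaScript execution. A large gap means part of your internal linking
// only exists after rendering. The URL is hypothetical.
import puppeteer from "puppeteer";

async function countLinks(url: string, jsEnabled: boolean): Promise<number> {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setJavaScriptEnabled(jsEnabled);
  await page.goto(url, { waitUntil: "networkidle0" });

  // Count anchors whose href is a real URL, not a javascript: pseudo-protocol.
  const count = await page.$$eval("a[href]", (anchors) =>
    anchors.filter((a) => !a.getAttribute("href")!.startsWith("javascript:")).length
  );

  await browser.close();
  return count;
}

(async () => {
  const url = "https://example.com/";
  const withoutJs = await countLinks(url, false);
  const withJs = await countLinks(url, true);
  console.log(`Links without JS: ${withoutJs} / with JS: ${withJs}`);
  if (withJs > withoutJs) {
    console.log("Some links only appear after rendering: fix the initial HTML.");
  }
})();
```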
- Crawl your site in Spider mode and identify all links without valid hrefs or with pseudo-protocols
- Compare the number of crawled pages with the number of pages in your XML sitemap
- Replace href='javascript:' with valid URLs and keep the JS as a complement if necessary
- Set up SSR or pre-rendering for SPAs (Next.js, Nuxt, Prerender.io)
- Always test using the URL Inspection tool in the Search Console
- Automate crawlability tests in CI/CD to prevent regressions
❓ Frequently Asked Questions
Can Googlebot follow a link with href="#"?
Do onclick links without an href pass PageRank?
Does Google's JavaScript rendering make up for a missing HTML href?
Can JavaScript links be used for private or non-indexable content?
How can I quickly check whether my links are crawlable?
🎥 From the same video
Other SEO insights extracted from this same Google Search Central video · duration 4 min · published on 29/04/2020