Official statement
Google suggests using Wget and cURL to identify malicious redirects set up by hackers. These command-line tools send HTTP requests and expose hidden redirect chains. However, they only capture server-level redirects; JavaScript-based redirects and advanced cloaking require other detection methods.
What you need to understand
Why does Google recommend these tools over other methods?
Wget and cURL are widely available command-line utilities that send raw HTTP requests without a browser. Their main advantage is that they show the full response headers, including 301, 302, and 307 status codes and the destination URL of each redirect.
Hackers often set up conditional redirects that trigger only for certain user agents, IP addresses, or referrers. A standard browser or Search Console may not show these behaviors, whereas Wget and cURL log each successive redirect along with the headers returned at every step.
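A referrer-conditional redirect can be probed by requesting the same URL twice, once directly and once with a search-engine Referer header. The sketch below is one possible way to do that with curl's -e (referer) and -w (write-out) options; the function names final_url and referrer_differs, and the choice of https://www.google.com/ as the referrer, are assumptions made for illustration.

```shell
# final_url: print "<status> <final URL>" for URL $1 after following redirects.
# $2 is the Referer to send (empty string = send no Referer header).
final_url() {
  if [ -n "$2" ]; then
    curl -s -o /dev/null -L --max-redirs 10 -e "$2" \
      -w '%{http_code} %{url_effective}\n' "$1"
  else
    curl -s -o /dev/null -L --max-redirs 10 \
      -w '%{http_code} %{url_effective}\n' "$1"
  fi
}

# referrer_differs: exit 0 when a visit that appears to come from a search
# result ends up somewhere different from a direct visit, exit 1 otherwise.
# The referrer value is an illustrative assumption.
referrer_differs() {
  direct=$(final_url "$1" "")
  from_search=$(final_url "$1" "https://www.google.com/")
  [ "$direct" != "$from_search" ]
}
```

A call such as `referrer_differs https://example.com/page && echo "referrer-dependent behavior"` would flag a URL worth inspecting by hand. A difference is only a signal, since some legitimate sites also vary responses by referrer.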
What types of hacks can these tools help detect?
Malicious redirect injections typically aim to divert organic traffic to phishing, pharmaceutical spam, or malware sites. The classic pattern: a compromised .htaccess file redirects Googlebot or visitors to an external destination, often with a temporary 302 code so that the change draws less attention.
Wget with the options --spider --max-redirect=10 follows the chain for up to 10 hops without downloading content and reports each hop. cURL with -I -L displays the headers for each step. Both expose chained redirects (page A to B, then B to C) that obscure where the hack originates.
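The raw output of curl -I -L can be long, so it may help to reduce it to one line per hop. The following is a minimal sketch, assuming the header dump is piped in on standard input; summarize_hops is a hypothetical helper name, not a curl feature.

```shell
# summarize_hops: read the header output of `curl -sIL <url>` on stdin and
# print one line per response: "<status> <Location value, or - if none>".
summarize_hops() {
  tr -d '\r' | awk '
    # A status line starts a new response; flush the previous one first.
    /^HTTP\// {
      if (status != "") print status, (loc == "" ? "-" : loc)
      status = $2; loc = ""
    }
    # Header names are case-insensitive (HTTP/2 sends them in lowercase).
    tolower($1) == "location:" { loc = $2 }
    END { if (status != "") print status, (loc == "" ? "-" : loc) }
  '
}
```

Typical use would be `curl -sIL https://example.com/page | summarize_hops`, which needs network access. Each output line then corresponds to one hop of the chain, ending with the final response.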
Are these methods effective against all hack scenarios?
No, and this is where the limits show. Wget and cURL follow only HTTP redirects issued at the server level. They do not execute JavaScript (window.location) or act on HTML meta refresh tags, so they miss those client-side redirects, along with hidden iframes and sophisticated cloaking that fingerprints the client.
A skilled hacker can serve clean content to a curl or wget user agent and redirect only real browsers. For such cases, combine these tools with a headless browser driven by a tool like Puppeteer, or with a rendering service that crawls with full JavaScript execution.
- Wget and cURL expose raw server-side HTTP redirects and redirect chains.
- They reveal status codes (301, 302, 307) often used in hacks.
- These tools do not detect JavaScript redirects or fingerprinting-based cloaking.
- The command curl -I -L https://example.com displays all redirect headers.
- The option wget --max-redirect=10 --spider follows up to 10 jumps without downloading content.
SEO Expert opinion
Is this recommendation sufficient for a complete audit?
Let's be honest: Wget and cURL are a necessary starting point, but far from exhaustive. The malicious redirects I see in audits fall into two categories: amateur hacks (modified .htaccess files, blunt 302 redirects) and sophisticated attacks that target Googlebot with JavaScript cloaking.
CLI tools easily detect the first category. For the second, you need to test with a Googlebot user agent and a JavaScript rendering engine. I've seen sites that looked clean in curl yet served pharma spam to Google's crawlers. Check this on your own URLs with several user agents.
What field limitations should you know before using these tools?
The first limitation is that hackers know SEOs test with curl. Some scripts detect the curl or wget user agent and serve clean responses, reserving the redirect for standard browsers. You therefore need to change the user agent with curl -A "Mozilla/5.0..." to simulate a real visitor.
The second pitfall is conditional redirects based on IP address. A hacker can redirect only non-datacenter IPs, which makes testing from an AWS server ineffective. Test from multiple sources: your local connection, a residential VPN, a mobile proxy. And always check paginated pages and URLs with parameters, which are rarely protected.
In what scenarios can this method lead to false negatives?
Asynchronous JavaScript redirects can go completely unnoticed. For example, a script waits 3 seconds and then redirects via setTimeout(() => location.href = ...). Wget and cURL download the HTML and exit without executing any script, so they never trigger this redirect.
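Although curl cannot execute scripts, the HTML it downloads can still be searched for common client-side redirect markers. The sketch below is a rough heuristic only: it assumes the redirect code sits inline in the page in readable form, and it will miss obfuscated or externally loaded scripts. The function name find_client_redirects is hypothetical.

```shell
# find_client_redirects: read HTML on stdin and print, with line numbers,
# every line that contains a meta refresh tag or a JavaScript assignment
# or call involving the page location. Heuristic: expect false positives
# (legitimate scripts) and false negatives (obfuscated code).
find_client_redirects() {
  grep -inE 'http-equiv=.?refresh|(window|document)\.location|location\.(href|replace|assign)'
}
```

It could be combined with a fetch, for example `curl -s https://example.com/page | find_client_redirects`. Any hit still needs manual review or a rendering tool to confirm what the script actually does.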
Another classic case is hidden iframes loaded from an external domain. The server responds with 200 OK, the HTML seems clean, but a 1x1 pixel iframe loads malicious content. For such scenarios, you need to parse the DOM with a tool that executes JavaScript and analyzes loaded resources.
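As a first pass before full DOM analysis, the static HTML can be checked for iframes that point to other hosts. This is a sketch under stated assumptions: it only sees iframes present in the raw HTML with a double-quoted src attribute, not those injected by JavaScript, and external_iframes is an illustrative name.

```shell
# external_iframes: read HTML on stdin and print the src of every iframe
# whose host is neither $1 nor www.$1. Relative src values are treated as
# same-site. Only double-quoted src attributes are recognized.
external_iframes() {
  grep -oiE '<iframe[^>]*src="[^"]+' | sed 's/.*="//' |
    awk -v own="$1" '
      {
        host = $0
        if (host !~ "^([a-zA-Z]+:)?//") next    # relative URL: same site
        sub("^([a-zA-Z]+:)?//", "", host)       # drop scheme or leading //
        sub("[/:?#].*$", "", host)              # drop port, path, query
        if (host != own && host != ("www." own)) print $0
      }'
}
```

For example, `curl -s https://example.com/page | external_iframes example.com` would list candidate iframes to review. Legitimate embeds (video players, maps) will also appear, so the output is a list to inspect rather than a verdict.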
Practical impact and recommendations
How can you effectively use Wget and cURL to audit your redirects?
Start by listing your strategic URLs: main pages, category pages, old redirected content. For each URL, run curl -I -L https://example.com/page. The -I option retrieves only the headers, while -L follows redirects. Note each status code and destination URL.
With Wget, use wget --spider --max-redirect=10 --server-response https://example.com/page. The --spider option checks each page without downloading it, and --server-response shows all HTTP headers. Compare results with various user agents: curl by default, desktop Googlebot, mobile Googlebot, standard browsers.
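The user-agent comparison can be scripted so that each identity produces one comparable line. The sketch below is illustrative: probe_ua and ua_matrix are hypothetical names, the Googlebot string is the classic desktop Googlebot token, and the browser string is only an example value to replace with a current one. Mobile Googlebot is omitted and can be added the same way.

```shell
# User-agent strings. BROWSER_UA is an illustrative example, not a
# recommendation; substitute the string of a browser you actually use.
GOOGLEBOT_UA='Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
BROWSER_UA='Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'

# probe_ua <url> <label> [user-agent]
# Print "<label> <status> <final URL>"; an empty user-agent keeps curl's default.
probe_ua() {
  if [ -n "$3" ]; then
    res=$(curl -s -o /dev/null -L --max-redirs 10 -A "$3" \
      -w '%{http_code} %{url_effective}' "$1")
  else
    res=$(curl -s -o /dev/null -L --max-redirs 10 \
      -w '%{http_code} %{url_effective}' "$1")
  fi
  printf '%s %s\n' "$2" "$res"
}

# ua_matrix <url>: probe the URL once per identity.
ua_matrix() {
  probe_ua "$1" curl-default ""
  probe_ua "$1" googlebot "$GOOGLEBOT_UA"
  probe_ua "$1" browser "$BROWSER_UA"
}
```

Running `ua_matrix https://example.com/page` prints three lines; if the status or final URL differs between them, the page treats visitors differently by user agent. Note that spoofing the Googlebot string does not change your IP address, so cloaking that verifies Google's IP ranges will not be triggered this way.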
What mistakes should you absolutely avoid during diagnosis?
The number one mistake is testing only the homepage. Hackers often inject their redirects on deeper pages, URLs with parameters, or old indexed URLs. Export your sitemap and test a random sample of 50-100 URLs, not just the main pages.
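Drawing the random sample can be done directly from the sitemap XML. This is a minimal sketch that assumes a plain sitemap with loc elements (sitemap index files that point to other sitemaps are not followed); sample_sitemap is a hypothetical helper name.

```shell
# sample_sitemap <n>: read sitemap XML on stdin and print n randomly chosen
# URLs taken from its <loc> elements (fewer if the sitemap is smaller).
sample_sitemap() {
  grep -oE '<loc>[^<]+</loc>' |
    sed -e 's/<loc>//' -e 's/<\/loc>//' |
    # Prefix each URL with a random key, sort on it, then drop the key.
    awk 'BEGIN { srand() } { printf "%.8f\t%s\n", rand(), $0 }' |
    sort -n | cut -f2- | head -n "$1"
}
```

For instance, `curl -s https://example.com/sitemap.xml | sample_sitemap 50 > sample.txt` would produce a list to feed into the redirect checks. Because srand() is seeded from the clock, two runs within the same second may return the same sample.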
The second mistake is ignoring 302 codes. Many SEOs focus solely on 301s, while temporary 302 redirects are a classic signal of hacking. A 302 redirect to an unknown external domain should raise an immediate alert. Also check redirect chains: A to B, then B to C. Any cascade with more than 2 hops deserves scrutiny.
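These two checks (redirects to foreign hosts, chains longer than 2 hops) can be applied mechanically to a list of hops. The sketch below assumes its input is one "<status> <location>" line per response, a format chosen here for illustration; flag_suspicious is a hypothetical name, and it flags any 3xx redirect to another host, not only 302s.

```shell
# flag_suspicious <own-host>: stdin holds one "<status> <location>" line per
# response. Print an EXTERNAL line for each redirect whose absolute target
# is on a host other than <own-host> or www.<own-host>, and a LONG_CHAIN
# line when the chain contains more than 2 redirects.
flag_suspicious() {
  awk -v own="$1" '
    $1 ~ /^30[12378]$/ {
      hops++
      host = $2
      sub("^[a-zA-Z]+://", "", host)     # drop scheme
      sub("[/:?#].*$", "", host)         # drop port, path, query
      if ($2 ~ "^[a-zA-Z]+://" && host != own && host != ("www." own))
        print "EXTERNAL " $1 " -> " $2
    }
    END { if (hops > 2) print "LONG_CHAIN " hops " hops" }
  '
}
```

Relative Location values are treated as same-site. Redirects to your own subdomains other than www will be reported as external under this simple rule, so adjust the comparison if you use them.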
What monitoring strategy should you implement to detect future intrusions?
Schedule a cron job that tests your 20-30 key URLs every night with curl and alerts you by email if a status code changes or a new redirect appears. Log the results in a CSV file to keep a history. Combine this with monitoring of your sensitive files: .htaccess, wp-config.php, functions.php.
Add Search Console monitoring as well: sudden spikes in 404 errors, drops in impressions on specific pages, or unknown newly indexed URLs often signal a hack. Cross-check these alerts with your curl tests to confirm quickly.
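One way such a nightly check might look is sketched below. It is an illustration under several assumptions: the function names check_url and check_all, the CSV layout (date, URL, final status, final URL), and the file paths in the usage note are all hypothetical, and URLs containing commas would break this simple CSV format.

```shell
# check_url <url> <csv-log>
# Append "<date>,<url>,<final status>,<final URL>" to the log and print an
# ALERT line when the result differs from the last logged entry for that URL.
check_url() {
  url=$1; log=$2
  now=$(curl -s -o /dev/null -L --max-redirs 10 \
    -w '%{http_code},%{url_effective}' "$url")
  # Most recent "<status>,<final URL>" previously recorded for this URL.
  prev=$(awk -F, -v u="$url" '$2 == u { last = $3 "," $4 } END { print last }' \
    "$log" 2>/dev/null)
  printf '%s,%s,%s\n' "$(date +%Y-%m-%d)" "$url" "$now" >> "$log"
  if [ -n "$prev" ] && [ "$prev" != "$now" ]; then
    printf 'ALERT %s: %s -> %s\n' "$url" "$prev" "$now"
  fi
}

# check_all <url-list-file> <csv-log>: run check_url for each non-empty line.
check_all() {
  while IFS= read -r u; do
    if [ -n "$u" ]; then check_url "$u" "$2"; fi
  done < "$1"
}
```

Saved in a script that ends with a call like `check_all /path/to/urls.txt /path/to/redirects.csv`, this could run from a crontab entry such as `0 3 * * * /path/to/check-redirects.sh`. Cron mails any output a job prints to the address in MAILTO when mail delivery is configured, so the script stays silent unless an ALERT line is produced.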
- Test a random sample of at least 50 URLs, not just the homepage.
- Vary user agents: curl by default, Googlebot, standard browsers.
- Monitor all 302 codes to unknown external domains.
- Automate daily checks of your strategic URLs with curl in cron.
- Cross-check results with Search Console: spikes in 404 errors, drops in impressions.
- Check the integrity of your sensitive files (.htaccess, wp-config.php).
Source: Google Search Central video · duration 7 min · published on 30/10/2013