The easy access that Google's Web crawlers have to sites is increasingly being exploited by cybercriminals to launch distributed denial-of-service (DDoS) attacks, a security vendor says.
Fake Web crawlers posing as Google's crawlers, called Googlebots, equaled 4 percent of the number of legitimate ones analyzed by Incapsula.
In investigating more than 50 million fake Googlebot sessions, Incapsula found about 34 percent were clearly malicious, with roughly 24 percent of those used in DDoS attacks against a website's application layer.
A Googlebot is the search software Google uses to collect documents from the Web in order to build its searchable index. Googlebot requests to Web servers are identifiable through a user-agent, which is the online equivalent of an ID card.
Cybercriminals are creating imposter user-agents to trick Web servers, Incapsula said. While careful inspection would reveal the fakes, website administrators tend to be lax when it comes to Googlebots in order to get the highest possible rankings on the search engine's results.
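To illustrate why a user-agent alone proves little, here is a minimal Python sketch. The helper name claims_to_be_googlebot is hypothetical and is not taken from Incapsula's report; the user-agent string follows the form Google publishes for its crawler. The sketch assumes a server that trusts the header at face value, which is the lax setup described above.

```python
# Hypothetical helper: a naive check that trusts the User-Agent header alone.
GOOGLEBOT_TOKEN = "googlebot"


def claims_to_be_googlebot(user_agent: str) -> bool:
    """Return True if the User-Agent header merely claims to be Googlebot."""
    return GOOGLEBOT_TOKEN in user_agent.lower()


# A User-Agent string of the form Google publishes for its crawler.
genuine_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# An impostor can send exactly the same string; the header carries no proof of origin.
spoofed_ua = genuine_ua

print(claims_to_be_googlebot(genuine_ua))  # True
print(claims_to_be_googlebot(spoofed_ua))  # True
```

Both requests pass the check, so any security exception keyed only to this header would apply to the impostor as well.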
"Most website operators know that to block Googlebot is to disappear from Google," Igal Zeifman, product evangelist for Incapsula, said in the company's blog. "Consequently, to preserve their SEO (search engine optimization) rankings, these website owners will go out of their way to ensure unhindered Googlebot access to their site, at all times.
"In practical terms, this may translate into exceptions to security rules and lenient rate limiting practices."
Incapsula has rated fake Googlebots the third most commonly used technology in DDoS attacks. The U.S. is the top source, followed by China and Turkey.
Identifying and blocking malicious Web crawlers requires tools that can distinguish fake crawlers from legitimate ones by their point of origin.
However, such technology carries an additional cost, due to the need for more processing power and software capabilities.
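One way to check a crawler's point of origin is the reverse-then-forward DNS lookup that Google documents for verifying Googlebot. The article does not say which method Incapsula's tools use, so the Python sketch below is only an assumption about how such a check could look. The function name is_verified_googlebot and the injectable reverse and forward parameters are hypothetical, added so the logic can be exercised without live DNS.

```python
import socket
from typing import Callable, List

# Domain suffixes Google documents for genuine Googlebot hostnames.
GOOGLE_CRAWLER_DOMAINS = (".googlebot.com", ".google.com")


def _reverse_lookup(ip: str) -> str:
    """Reverse DNS (PTR) lookup; returns the primary hostname for the IP."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname: str) -> List[str]:
    """Forward DNS lookup; returns the IPv4 addresses for the hostname."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = _reverse_lookup,
    forward: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Hypothetical check: True only if the IP maps to a Google crawler
    hostname and that hostname maps back to the same IP."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
        # Step 1: the PTR record must sit under a Google crawler domain.
        if not hostname.endswith(GOOGLE_CRAWLER_DOMAINS):
            return False
        # Step 2: the forward lookup must confirm the original IP, since
        # anyone who controls an IP block can publish a misleading PTR record.
        return ip in forward(hostname)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses;
        # treat any failed lookup as unverified.
        return False
```

Each check costs two DNS queries, so a real deployment would likely cache the verdict per IP address instead of repeating the lookups on every request. The default forward lookup handles IPv4 only.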
The findings were based on an analysis of 400 million search engine visits to 10,000 sites, which resulted in 2.2 billion page crawls over a 30-day period.