Update: Spammers abusing Google Rich Snippets to boost Scam Sites

Editor’s Note: Updated to add official comment from Google.
Spammers prove the rule that says criminals will always stay one step ahead of the law. That’s why – despite predictions from some of the technology industry’s best and brightest (*ahem* Bill Gates) that spamming would be eradicated – it survives (and thrives) even today.

One way that spammers continue to stay in business is by latching on to new technology – any new technology – that might give them an edge in reaching more potential victims and luring them in. Spammers were among the first to recognize the importance of technologies like Search Engine Optimization (SEO) in driving traffic to web sites. They’re willing to try any new social media platform – no matter how nascent. And they don’t cling to technology or methods that don’t work. When the Internet community got hip to how loosely monitored infrastructure like open proxies (PDF) contributed to the spam problem, the spammers shifted to using criminal botnets to keep things humming.

Google Results Rich Snippet
Spammers are using microdata snippets to make their phony pages look legitimate, a researcher warns.

So what’s hot in spamming circles today? According to this post on the Unmask Parasites blog, it’s Google’s “rich snippets” microdata and microformat technology, which spammers are using to make hollow redirector web sites look like popular, socially vetted storefronts.

Writing on the blog, Denis Sinegubko, the Russian malware researcher who writes Unmask Parasites, said that spammers are exploiting the Google search engine’s ability to parse so-called “structured data” in what he describes as a “massive SEO” campaign. The campaign involves compromised WordPress and Joomla web sites that funnel legitimate traffic to unrelated web sites controlled by the spammers, or by third parties who are paying the spammers to drive traffic to their doorstep.

Google’s “rich snippets” ratings microdata figures prominently in the scam. After compromising the legitimate “doorway” web sites, the hackers install PHP code that is used to “cloak” the site: detecting search engine crawlers and replacing keywords and site content with SEO-optimized spam content. The added content includes special microformats, such as site rating data, that Google treats as legitimate and converts into ratings that appear in its search results list.

For unsuspecting users, the result is that compromised sites display what appear to be legitimate user reviews that make the link in question look legitimate and popular.
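
For site owners who want to check their own pages for this kind of cloaking, one rough approach is to request the same URL twice, once with a browser User-Agent and once with a crawler User-Agent, and compare what comes back. The Python sketch below is illustrative only and is not taken from the Unmask Parasites post: the User-Agent strings, the function names and the 0.8 similarity threshold are all assumptions. It will also miss cloaking scripts that identify crawlers by IP address rather than by User-Agent, and pages with a lot of dynamic content may need a different threshold.

```python
import difflib
import urllib.request

# Illustrative User-Agent strings (assumptions, not taken from the article).
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def fetch(url, user_agent, timeout=10):
    """Download a page while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def similarity(page_a, page_b):
    """Return a 0..1 score of how alike two page bodies are."""
    return difflib.SequenceMatcher(None, page_a, page_b).ratio()


def looks_cloaked(browser_html, crawler_html, threshold=0.8):
    """Flag a page when the crawler sees something very different.

    The threshold is an arbitrary starting point, not a tested value.
    """
    return similarity(browser_html, crawler_html) < threshold
```

In use, a site owner would call `fetch(url, BROWSER_UA)` and `fetch(url, CRAWLER_UA)` on one of their own pages and pass both results to `looks_cloaked`. A flagged page is only a prompt for a manual look, not proof of compromise.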

It is unclear whether Google is aware of the misuse of the rich snippets microdata feature. According to information posted online, rich snippets are intended to give webmasters the ability to “provide users with more information on the content on a page so that they can better decide which result is relevant for their query.” The company wasn’t able to offer comment prior to publication. (I’ll update the story once they do comment.)

In an e-mail statement, a Google spokeswoman said that the company has published guidelines for proper use of the rich snippets.

“It is an unfortunate reality that with every feature we launch there are spammers who will try to abuse the feature. We’ve published guidelines around what we consider appropriate use of rich snippets and a form to report abuses. We are also constantly improving our algorithms to make sure that our snippets are as relevant as possible,” the spokeswoman said.

Ratings are just one kind of rich snippet, along with reviews, events, recipes and people. They can be implemented by adding simple markup text to the underlying HTML for the page. Google said that it is important that rich snippet markup code “accurately represents the primary content of your page,” but it doesn’t appear that the search giant is monitoring or enforcing that.
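
To give a sense of how little markup is involved, the Python sketch below pulls rating claims out of a page so an administrator can see what their HTML is telling search engines. It is an illustration under stated assumptions: the sample markup uses the schema.org `AggregateRating` type with its `ratingValue` and `reviewCount` properties, which may not be the vocabulary the spammers used, and the class and function names are invented for this example. It reads only microdata `itemprop` attributes and ignores microformats and RDFa.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect the values of microdata itemprop attributes on a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop whose text is being read

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("itemprop")
        # Skip tags without itemprop and container items (itemscope).
        if name is None or "itemscope" in attributes:
            return
        if attributes.get("content") is not None:
            # e.g. <meta itemprop="ratingValue" content="4.8">
            self.properties[name] = attributes["content"]
        else:
            self._current = name
            self.properties[name] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


def extract_rating_claims(html):
    """Return any rating value and review count the page advertises."""
    collector = ItempropCollector()
    collector.feed(html)
    collector.close()
    wanted = ("ratingValue", "reviewCount")
    return {
        key: collector.properties[key].strip()
        for key in wanted
        if key in collector.properties
    }


# Hypothetical markup of the kind a doorway page might carry.
SAMPLE = """
<div itemscope itemtype="http://schema.org/AggregateRating">
  Rated <span itemprop="ratingValue">4.8</span> out of 5
  by <span itemprop="reviewCount">1250</span> users
</div>
"""

print(extract_rating_claims(SAMPLE))
```

Rating claims turning up on a page whose owner never added a review feature would be a reason to inspect the site more closely.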

Sinegubko said that he has detected black hat SEO campaigns using the ratings rich snippet technique to push phony online pharmacies, payday loans and pornography. The compromised pages that have the microdata are used as doorways to the spammers’ web sites, or as feeders intended to boost the ranking of other sites.

The use of rich snippets is window dressing. The underlying cloaking attacks are more dangerous. To spot those, web site administrators are advised to pay attention to what search terms visitors are using to reach their site. The terms should be germane to the content of the site. If they’re not – especially if visitors are coming by way of searches for “Viagra,” pornography or other hot topics – you should review your site content to make sure it hasn’t been compromised, Unmask Parasites advises.
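
That check can be partly automated. The Python sketch below assumes the administrator has already exported a list of the search terms that brought visitors to the site, for example from an analytics or webmaster tool. The keyword list and the function name are illustrative assumptions, not something supplied by Unmask Parasites, and a real list would need tuning to the site’s actual subject matter.

```python
# Illustrative keywords drawn from the spam categories mentioned above.
SPAM_KEYWORDS = ("viagra", "pharmacy", "payday loan", "porn")


def suspicious_terms(search_terms, spam_keywords=SPAM_KEYWORDS):
    """Return the search terms that contain any known spam keyword."""
    flagged = []
    for term in search_terms:
        lowered = term.lower()
        if any(keyword in lowered for keyword in spam_keywords):
            flagged.append(term)
    return flagged


# Hypothetical export for a gardening site.
terms = ["tomato pruning guide", "Buy Viagra online", "fast payday loans"]
print(suspicious_terms(terms))
```

Simple substring matching will produce false positives on sites that legitimately cover these topics, so any hits are best treated as a cue to review the site’s content by hand.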