trenchant.org

by adam mathes

Referrer Log Spam

Although getting spam via email is annoying, since installing SpamAssassin it’s become mostly manageable. Unfortunately some of it still gets through, but since I’ve had an email address posted on my website since 1996, I’ve decided to accept the tradeoff of a few spam emails in return for making it easy to contact me. (The longer-term solution is probably a whitelist combined with a challenge-response system, but that’s another issue.)

What I’m not so accepting of, however, is referrer spam.

Most of the time, when you visit a website, your browser sends certain information to the web server you are getting pages from. This usually includes what kind of web browser you’re using, what operating system, and the referring web page you clicked on to arrive at this page. Most servers keep logs of this, and many people perform analysis so they can see things like what browser versions people are using.
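
To make that concrete, here is a rough sketch of pulling the referrer and browser string out of a single log line. It assumes the server writes the common Apache "combined" log format, which is my assumption and may not match your setup; the sample line, its IP address, and its date are invented for illustration.

```python
import re

# Pattern for one line of the Apache "combined" log format (an assumption;
# other servers or custom formats will need a different pattern).
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of a combined-format log line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


# Invented sample line, for illustration only.
sample = (
    '203.0.113.7 - - [12/Jan/2004:10:15:32 -0600] "GET /daily/ HTTP/1.1" '
    '200 5120 "http://www.google.com/search?q=referrer+spam" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"'
)

fields = parse_line(sample)
print(fields["referrer"])  # the page the browser claims it came from
print(fields["agent"])     # the browser and operating system string
```

Everything in those last two fields is whatever the client chose to send, which matters for what comes next.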

But the most interesting things to look at are the referrers, since you can see what search engine results people found your site from, or who’s linking to and talking about your site. Webloggers, in particular, pay close attention to referrer logs.

Some people were so fascinated by these referrer logs that they built methods that allowed referrers to show up on their web pages, for all to see.

And this has led to referrer log spam: people writing programs that lie to web sites, claiming that visitors arrived from some spammer’s site.
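
The reason this is so easy is that the referrer is just a header the client fills in (HTTP spells it "Referer"), and the server has no way to verify it. A minimal sketch using Python's standard library shows this; the URLs are placeholders I made up, and the request is only built here, never sent.

```python
from urllib.request import Request


def build_request(url, claimed_referrer):
    """Build (but do not send) a request that claims an arbitrary referring page."""
    # The Referer value is an ordinary string chosen by the client.
    # Nothing ties it to a page that actually exists or links anywhere.
    return Request(url, headers={"Referer": claimed_referrer})


# Placeholder URLs, for illustration only.
request = build_request("http://example.org/", "http://example.com/never-linked-here")
print(request.get_header("Referer"))
```

A server receiving this would log the claimed page exactly as if a real person had clicked a real link there.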

Since today’s search engines, following Google’s lead, pay the most attention to inbound links, a way to automatically add a link to somebody’s pages without their knowledge is a spammer’s dream come true. It’s like free PageRank, leading to better search engine placement, leading to more traffic, leading to more “free money,” as the spammers’ logic goes.

Two of the sites that referrer spammed me this weekend had over 90 inbound links on Sunday according to Google, all from referrer log pages.

And it really fucking annoys me.

My site does not publish referrer logs, or allow trackbacks, or allow any random bot or user to post moronic spam to my discussion board. And yet I still have to deal with the utterly asinine and irresponsible design decisions of self-important webloggers and weblog software developers who wanted to better foster “discussion” by automatically creating links.

Because this would have never become a problem if software developers acknowledged the rather obvious fact that a way for anonymous users to automatically add content to a site that isn’t theirs is a really bad idea that is ripe for abuse.

The only way to really stop referrer spam would be if all the people who ran websites locked down their referrer logs and made them private instead of publicly accessible. But even if a large portion did that, the small percentage that didn’t would just encourage the spammers to hit everybody in the hopes it would show up in the unsecured ones.

So the reality is I have to accept that my referrer logs are tainted, and probably eventually write a script that wastes bandwidth and checks if the referring page actually contained the link it purported to.
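
I haven't written that script, but a rough sketch of the idea might look like the following. The function names, the exact-hostname comparison, and the timeout are all my own assumptions for illustration. A real version would want to treat "www." variants as the same site, cache results, and be polite about how often it fetches.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def page_links_to(html, page_url, my_host):
    """Return True if any link in the HTML points at my_host (lowercase hostname)."""
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        # Resolve relative links against the referring page's own URL.
        if urlparse(urljoin(page_url, href)).hostname == my_host:
            return True
    return False


def referrer_is_genuine(referrer_url, my_host, timeout=10):
    """Fetch the claimed referrer and check that it really links to my_host."""
    try:
        with urlopen(referrer_url, timeout=timeout) as response:
            html = response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # Unreachable or malformed referrers are treated as not genuine.
        return False
    return page_links_to(html, referrer_url, my_host)
```

Calling `referrer_is_genuine(url, "trenchant.org")` on each new referrer in the log would filter out the pages that never linked here, at the cost of one fetch per referrer.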

Once again, webloggers take something that used to be fun (looking at referrer logs) and RUIN IT FOREVER.