Spam, Spam And More Spam

Posted: 03/08/10

Fending Off Spam With Greylisting

Anyone who's had an email address for more than a week has most likely started to attract spam. Our hosting platform took on a new domain this week, and it seems to be a spam magnet.

The domain was registered a while ago, and has suffered from having various email addresses shown in its web pages. This is one of the more egregious errors one can make as a web designer, since spammers use address-harvesting robots that pick out such addresses easily. Obfuscation with JavaScript is my preferred method of protecting addresses and is something that the Opus CMS does by default.

To cut a long story short, it became necessary to erect a few defences to ward off the deluge of rubbish that was coming our way. The obvious solution is to run SpamAssassin. However, SpamAssassin can consume a fair amount of system resources, so it seems like a better plan to reject as much mail as possible before it has to be scanned.

The greylisting technique is well documented and has been a favourite method of reducing spam for a few years now. The theory is that if the machine sending you the mail is a legitimate mail server, it will try several times before giving up and bouncing the message back to the sender, whereas much spam-sending software never retries at all. Therefore, if the first attempt to deliver is met with a "service temporarily unavailable" message, a legitimate server will attempt delivery again at a later time.

The Postfix software makes it really easy to implement greylisting, in conjunction with a simple greylisting mechanism written in Perl. It works by storing every unique combination of sender IP address, sender address and recipient address in a database, together with the date and time that the last connection was made. If the combination is new, then the message is temporarily rejected. If it's not new, and first occurred more than 90 seconds ago, the message is accepted for further scrutiny by SpamAssassin.
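To make the decision logic concrete, here is a minimal sketch in Python. It is not the Perl mechanism mentioned above. The class name Greylist, the in-memory dictionary standing in for the database, and the choice to record only the time a combination was first seen are all assumptions made for illustration.

```python
import time

# The delay a combination must age before it is accepted (90 seconds in the post).
GREYLIST_DELAY_SECONDS = 90


class Greylist:
    """Hypothetical in-memory stand-in for the database of seen combinations."""

    def __init__(self, delay=GREYLIST_DELAY_SECONDS, clock=time.time):
        self.delay = delay
        self.clock = clock  # injectable so the logic can be tested without waiting
        # Maps (client_ip, sender, recipient) -> time the combination was first seen.
        self.first_seen = {}

    def check(self, client_ip, sender, recipient):
        """Return "defer" for a temporary rejection, or "accept"."""
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        first = self.first_seen.get(key)
        if first is None:
            # New combination: remember it and ask the sender to retry later.
            self.first_seen[key] = now
            return "defer"
        if now - first > self.delay:
            # Seen before, and long enough ago: pass it on for content scanning.
            return "accept"
        # Retried too quickly: keep deferring.
        return "defer"
```

A real deployment would also need persistent storage and some way of expiring old entries, both of which are left out here.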

The only fly in the ointment is a site like Facebook, whose mail infrastructure means that the same message can be delivered from several different hosts. Each host gets greylisted in turn, so the message can take quite a while to get through. The solution here is to avoid greylisting Facebook emails, and rely on SpamAssassin alone to filter out the spam.
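One way such an exemption might be expressed is as a check on the sending host's name, run before the greylist is consulted. The sketch below is an assumption about how the exemption could work, not a description of our actual setup: the function name is_exempt, the EXEMPT_DOMAINS set and the idea of matching on the client's verified reverse-DNS hostname are all hypothetical.

```python
# Domains whose mail servers skip greylisting (illustrative assumption).
EXEMPT_DOMAINS = {"facebook.com"}


def is_exempt(client_hostname, exempt_domains=EXEMPT_DOMAINS):
    """Return True if the client hostname is an exempt domain or a subdomain of one."""
    host = client_hostname.lower().rstrip(".")
    return any(
        host == domain or host.endswith("." + domain)
        for domain in exempt_domains
    )
```

Matching on a leading dot matters: it stops a host such as notfacebook.com from slipping through on a simple suffix comparison. The hostname itself should come from a verified reverse lookup of the connecting IP address, since anything the client merely claims can be forged.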

Another useful technique is to decline to accept mail from other servers when they break the recommended SMTP dialogue rules. A legitimate and properly configured mail server run by a reputable company won't have a problem with that. A spam-spewing, zombie-infected PC will. One rule that is particularly useful concerns the "HELO" or "EHLO" command. Insisting that the host name specified is a valid and resolvable fully qualified domain name catches a surprising amount of rubbish.
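As a rough illustration of what "valid and resolvable" might mean in practice, the Python sketch below checks the syntax of a HELO name and then tries to resolve it. The function names is_fqdn and helo_is_acceptable are hypothetical, and the lookup only considers address records, so it should be read as an approximation of the idea rather than a copy of what any particular mail server does.

```python
import re
import socket

# One DNS label: letters, digits and hyphens, 1 to 63 characters,
# with no hyphen at either end.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_fqdn(name):
    """Syntactic check: at least two valid labels, and not an IP address."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing dot
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) < 2:
        return False  # rejects bare names such as "localhost"
    if labels[-1].isdigit():
        return False  # rejects dotted IPv4 addresses such as 192.0.2.1
    return all(_LABEL.match(label) for label in labels)


def helo_is_acceptable(name, resolver=socket.getaddrinfo):
    """Return True if the HELO name is a well-formed FQDN that resolves."""
    if not is_fqdn(name):
        return False
    try:
        resolver(name, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True
```

The resolver is passed in as a parameter so that the check can be exercised without touching the network.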

With these changes in place, the volume of spam that actually makes it as far as SpamAssassin is now about 20% of what it was.