Thursday, May 13, 2010

Fighting SPAM

A survey carried out by Microsoft in 2009 suggested that 97% of all emails sent over the internet are unwanted. While definitive spam statistics are hard to confirm, the total certainly amounts to billions of messages a day - some sources put the upper estimate at around 100 billion spam messages per day. Even using much more conservative estimates, it all adds up to a huge amount of unwanted email worldwide. And some days it can feel like it's all ending up in your mailbox...

While spam may seem like little more than a nuisance, even its simplest effects can have much more dramatic consequences when considered globally. It may only take an individual user a few minutes each day to delete unwanted mail but, when multiplied by all of the users in the world, the scale of the problem quickly becomes apparent. A recent study suggested that $130 billion was lost by businesses in 2009 due to spam - from the time spent by employees reading and deleting the messages, to recovering from virus infection, financial fraud from phishing sites, and the cost of the extra bandwidth incurred handling the incoming spam.

Much has been done to try and educate users about the dangers of spam - e.g. not opening attachments from unknown/untrusted senders, not confirming bank details via email, etc. - but Microsoft's survey found that around 46% of users, even when warned of the risks, would still open executable files and enter personal details on unvalidated websites. So, clearly, it's best to try and ensure as much spam as possible is stopped before it gets anywhere near a user's personal email account.

Being saddled with a name derived from a Monty Python sketch probably hasn't helped spam to be taken very seriously, but if you would like to start getting serious about the problem, and help reduce your users' burden of responsibility a little bit, here are some suggestions:

Spamassassin:

The most popular open-source, server-side anti-spam application is Spamassassin. This checks all email against a set of rules and assigns a score to each. If the result exceeds a pre-set limit, the email is tagged as spam and then processed (deleted/quarantined/etc) by a final set of rules.
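The scoring idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how Spamassassin is actually implemented: the rule names, the scores and the threshold of 5.0 are all made up for the example, and the sketch assumes a message is tagged once its total meets or exceeds the limit.

```python
# Hypothetical rules and scores for illustration; not Spamassassin's real ruleset.
RULES = [
    ("SUBJECT_ALL_CAPS", 1.5, lambda msg: msg["subject"].isupper()),
    ("MENTIONS_LOTTERY", 2.5, lambda msg: "lottery" in msg["body"].lower()),
    ("HAS_EXE_ATTACHMENT", 3.0,
     lambda msg: any(name.lower().endswith(".exe") for name in msg["attachments"])),
]

REQUIRED_SCORE = 5.0  # assumed pre-set limit


def score_message(msg):
    """Return (total score, names of the rules that matched)."""
    hits = [(name, score) for name, score, test in RULES if test(msg)]
    total = sum(score for _, score in hits)
    return total, [name for name, _ in hits]


def is_spam(msg):
    """Tag the message as spam once its total reaches the limit."""
    total, _ = score_message(msg)
    return total >= REQUIRED_SCORE
```

No single rule here is enough to tag a message on its own; it is the combination of several suspicious traits that pushes the total over the limit.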

Installing Spamassassin with a default ruleset can result in a dramatic reduction in spam. It's probably best not to try writing your own rules unless you're familiar with how Spamassassin works - however, you might want to whitelist and blacklist certain known good/bad email addresses or whole domains. For changes to a specific user's email account, you would edit the ~/.spamassassin/user_prefs file; for global changes, you would edit the /etc/mail/spamassassin/local.cf file.

The basic format for whitelisting a specific email address is:

whitelist_from my.test@example.co.uk

To whitelist everything from the domain example.co.uk, we would use a wildcard instead:

whitelist_from *@example.co.uk

To blacklist an address or domain, use the directive blacklist_from instead.
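To get a feel for how the wildcard entries behave, here is a rough Python sketch of glob-style matching against a whitelist and a blacklist. It is not Spamassassin's own code: the function names and the blacklist entry are hypothetical, and checking the whitelist first is simply a choice made for this example.

```python
from fnmatch import fnmatchcase

# Entries mirroring the directives above; the blacklist entry is hypothetical.
WHITELIST = ["my.test@example.co.uk", "*@example.co.uk"]
BLACKLIST = ["*@spam.example.com"]


def matches_any(address, patterns):
    """True if the address matches any glob-style pattern, ignoring case."""
    address = address.lower()
    return any(fnmatchcase(address, pattern.lower()) for pattern in patterns)


def classify_sender(address):
    """Return which list, if any, the sender address falls under."""
    if matches_any(address, WHITELIST):
        return "whitelisted"
    if matches_any(address, BLACKLIST):
        return "blacklisted"
    return "unlisted"
```

Note that a pattern such as *@example.co.uk only covers addresses directly at that domain; an address at a subdomain would need its own entry.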

To get the best results from Spamassassin, it's a good idea to install MailScanner too. MailScanner is an open-source email security system, which provides an easy-to-use configuration file (MailScanner.conf) where spam scores and rules can be configured. It also has support for other plugins - the two most useful being:

* Anti-virus - Usually this means using the free Clam Anti-virus package, as there are several options specific to ClamAV available, but around 25 other packages are supported. The latest virus definitions are updated daily (you can configure it to check more frequently, if required), and positive detections can either be delivered, quarantined or automatically deleted.

* Phishing emails - Checks all URLs against a list of known phishing sites and marks the email's subject line with the warning: {Fraud?} Note: On busy mail servers this option can increase the load considerably so, once enabled, it's best to monitor the server for any sign that performance is being adversely affected.
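The phishing check described above boils down to pulling the URLs out of a message and comparing their hosts against a list. The Python below is a minimal sketch of that idea, not MailScanner's implementation: the host list is invented, and placing the {Fraud?} warning at the start of the subject is an assumption made for the example.

```python
import re
from urllib.parse import urlsplit

# Hypothetical blocklist; a real one would come from a regularly updated feed.
KNOWN_PHISHING_HOSTS = {"login.bank-secure.example", "paypa1.example"}

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)
FRAUD_TAG = "{Fraud?}"


def find_phishing_urls(body):
    """Return the URLs in the body whose host is on the blocklist."""
    flagged = []
    for url in URL_PATTERN.findall(body):
        try:
            host = (urlsplit(url).hostname or "").lower()
        except ValueError:
            continue  # malformed URL, nothing to look up
        if host in KNOWN_PHISHING_HOSTS:
            flagged.append(url)
    return flagged


def tag_subject(subject, body):
    """Prefix the subject with the warning tag if any URL is flagged."""
    if find_phishing_urls(body) and not subject.startswith(FRAUD_TAG):
        return f"{FRAUD_TAG} {subject}"
    return subject
```

Every message body has to be scanned and every URL looked up, which gives some idea of why this option can add noticeable load on a busy server.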

To make monitoring MailScanner much easier, you can install MailWatch - a web-based front-end, written in PHP. This can be used to display mail queue statistics (full support for this function is only available if you're using Sendmail or Exim), as well as various spam/virus statistics, e.g. total messages, spam/viruses blocked. It also includes Quarantine management, allowing you to release, delete or run sa-learn (a Spamassassin tool for training its Bayesian filter) across any quarantined messages.

Catchalls:

Normally, you would set up individual email addresses for each user, and any mail sent to a non-existent email address would be rejected. This is the best method to use.

If you use a catchall, it will "catch all" of the emails for the domain that are sent to non-existent addresses, and accept them for delivery. This makes catchalls a nice, easy target for spammers, as they don't have to use valid email addresses to get their spam through to your server. This then forces the server to accept and process a much higher volume of mail, pushing the load up and eating up resources, as it churns through all this excess mail. The result is a hugely bloated mailbox, full of spam, which usually ends up being ignored as no one can be bothered to go through all the messages to locate any legitimate ones.

The simple solution to this problem is: catchalls are evil, so DON'T use them!

Dictionary Harvest Attack:

This is a form of brute force attack used by spammers to collect valid email addresses, which will then be used as sender addresses for spam. It involves running through various permutations of common addresses, sending a tiny message - usually little more than a HELO/EHLO greeting and a recipient address - to avoid triggering anti-spam software. Any addresses that are accepted for delivery are assumed to be valid and are added to a list of addresses. You can usually tell this is happening by checking your mail logs, where you will see a large volume of alphabetically sequential email addresses logged and rejected, such as:

jack@mydomain.com
jill@mydomain.com
john@mydomain.com
john.a@mydomain.com
john.b@mydomain.com

The easiest way to spot this sort of behaviour is to install and configure an application such as Logwatch, which will email you a summary of the logs each day. If the connections come from the same IP address(es), or particular IP ranges, you can then block them with your firewall.
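If you'd rather check the logs yourself, the pattern to look for is one IP address being rejected for lots of different recipients. Here's a minimal Python sketch of that check. The log line format, the function name and the threshold of five are all assumptions for the example, so the regular expression would need adapting to whatever your mail server really writes.

```python
import re
from collections import defaultdict

# Hypothetical log format; adapt the pattern to your mail server's real logs.
REJECT_LINE = re.compile(r"reject ip=(?P<ip>\S+) rcpt=(?P<rcpt>\S+)")


def suspected_harvesters(log_lines, min_distinct_recipients=5):
    """Return {ip: count} for IPs rejected on many distinct recipients."""
    rejected = defaultdict(set)
    for line in log_lines:
        match = REJECT_LINE.search(line)
        if match:
            rejected[match["ip"]].add(match["rcpt"].lower())
    return {
        ip: len(recipients)
        for ip, recipients in rejected.items()
        if len(recipients) >= min_distinct_recipients
    }
```

Counting distinct recipients, rather than total rejections, helps avoid flagging a legitimate sender who simply keeps retrying one mistyped address.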

SPF Records:

This won't reduce the amount of spam received by your server, but it will reduce the amount of spoofed mail received by other servers.

Why should this matter to you? Well, the 'From' address of an email can easily be modified by the sender, allowing a spammer to make the email appear to have been sent from a different address. If they decide to fake, or 'spoof', addresses using domains hosted on your server, it could cause you problems as many mail servers/services will assume you are the source of the spam. This could result in you being blacklisted or, in extreme circumstances, even subjected to a retaliatory attack by disgruntled server admins or hackers, such as a DDoS attack. So using SPF records is equal parts kindness to your fellow internet users and self-interest in maintaining a trustworthy reputation on the net.

SPF (or Sender Policy Framework) uses DNS entries to publish the IP addresses of the server(s) that are allowed to send mail for a domain. If a mail server then receives a message claiming to be from the domain that doesn't come from one of these valid addresses, it's fairly safe to assume the message is spam, and it can be deleted or quarantined.
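As a rough illustration of the check a receiving server performs, the Python below compares a connecting IP address against the ip4/ip6 entries of an SPF record. It's a deliberately simplified sketch: the function name and the example record are assumptions, the record is passed in as a string rather than fetched from DNS, and mechanisms that need further DNS lookups (such as a, mx and include) are simply skipped.

```python
import ipaddress

# Standard SPF qualifiers and the result each one produces.
RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_check(record, sender_ip):
    """Evaluate only the ip4, ip6 and all mechanisms of an SPF record."""
    ip = ipaddress.ip_address(sender_ip)
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record
    for term in terms[1:]:
        qualifier = "+"  # the default when no qualifier is given
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6") and value:
            if ip in ipaddress.ip_network(value, strict=False):
                return RESULTS[qualifier]
        elif name == "all":
            return RESULTS[qualifier]
    return "neutral"  # nothing matched


# Example record (documentation address range): only 192.0.2.0/24 may send.
EXAMPLE_RECORD = "v=spf1 ip4:192.0.2.0/24 -all"
```

With the example record, a message arriving from 192.0.2.10 would pass, while one arriving from any other address would hit the closing -all and fail.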

The ultimate goal would be for every domain in the world to start validating their mail using SPF, or similar methods. If it's ever achieved, this would effectively mean the end for spam (at least in its current form), but we're a long way from that at the moment, with only around 5% of .com and .net domains currently implementing SPF records.

HELO/EHLO:

Spam is often sent from poorly configured mail servers, so you can set your server to reject any connections that issue an invalid HELO or EHLO message. This is a basic greeting all email servers issue when connecting, so it can allow you to block spammers before they've sent any mail to your server. However, it can also result in valid email being lost, as there's no way to guarantee that all legitimate mail servers will be properly configured. This option is probably only worth using if you only want to handle mail sent from a set group of known senders, where you can ensure their servers' greetings are valid.
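What counts as an "invalid" greeting depends on your mail server and how strict you want to be. The Python below is one possible set of sanity checks, offered as an assumption-laden sketch rather than any particular server's rules: the function name and the default hostname are hypothetical, and this version is strict enough to reject bare IP addresses and bracketed address literals too.

```python
import re

# One DNS label: letters, digits and hyphens, not starting or ending with a hyphen.
LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def helo_looks_valid(helo_name, own_hostname="mail.mydomain.com"):
    """Apply a few assumed sanity checks to a HELO/EHLO argument."""
    name = helo_name.strip().rstrip(".")
    if not name or len(name) > 253:
        return False
    if name.lower() in ("localhost", own_hostname.lower()):
        return False  # a remote client claiming to be us is suspicious
    labels = name.split(".")
    if len(labels) < 2:
        return False  # not a fully qualified host name
    if labels[-1].isdigit():
        return False  # bare IP address rather than a host name
    return all(LABEL.match(label) for label in labels)
```

Checks like these are cheap to run, but every extra rule is another way for a legitimate but sloppily configured server to be turned away.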

Email Address Management:

Use multiple email addresses. When mailing clients or trusted business contacts, use your real email address. When signing up to anything online, or posting on forums, use a different email address. This will allow you to protect your real email address from spam, as there's often no way of knowing if your address details will be treated with respect and not passed on to spammers.

Even unwanted mail from legitimate businesses is spam, so it's not just adult or other 'dodgy' sites that can result in an increase in spam. Many people notice a dramatic increase in spam after signing up to social networking sites, for example. If you suspect signing up to a certain website might result in a lot of spam, create a temporary email address just for that particular sign-up. Spamassassin/MailScanner can then be set to apply much more stringent checks to these addresses, and they can be removed if they prove to be nothing but a spam trap.

Conclusion:

While these points are far from definitive, they should be enough to start you thinking about how to tackle the problem of spam. Even a few simple changes could make a huge difference, without too much investment of time and effort being required. For further reading, see the links below:

ClamAV - http://www.clamav.net
MailScanner - http://www.mailscanner.info
MailWatch - http://mailwatch.sourceforge.net
Spamassassin - http://spamassassin.apache.org
SPF - http://www.openspf.org
