Anti-Spam Products Are More Than the Sum of Their Parts

Written by Paul Cunningham on April 24, 2009

When you boil the spam problem down, it becomes quite simple: someone is sending you emails that you don’t want to receive.  This makes the anti-spam solution a simple one too: stop unwanted emails from arriving in someone’s email account.  However, actually achieving this is a very complex task.

Any anti-spam system that is worth using will contain a range of preventative measures and features that are used to determine whether an email is likely to be spam or not.  As a complete solution they can be very effective, but taken individually their weaknesses become more apparent.  Here are some examples.

Source IP Filtering

Also known as Connection Filtering, DNSBL, or RBL, this technique compares the source IP of an incoming SMTP connection to a list of suspected spam sources.  The list can be either one that the email administrator maintains manually, or one published by a third-party provider (such as Spamhaus) that the server subscribes to.  If the IP address is on the list then the email is considered likely to be spam and the server will drop or reject it.
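As a rough illustration of how a DNS-based list is usually consulted, here is a minimal Python sketch.  It assumes the common DNSBL convention of reversing the IPv4 octets and appending the list's zone name; the zone `dnsbl.example.org` and both function names are made up for this example, and real providers have their own zones and usage terms.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets plus the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers with an address record for this IP.

    Simplification: any lookup failure (including a DNS outage) is treated
    as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# Hypothetical zone name, used only to show the shape of the query.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Only the name-building step runs here; calling `is_listed` would perform a live DNS lookup against whatever zone you pass in.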

The weakness of this technique shows when IP addresses are mistakenly included in the list.  A legitimate email server may find itself blocked by other systems that are subscribed to a particular IP list, which prevents important business email from being sent to those systems.  Similarly, some regular sources of spam emails such as free web-based email services cannot be blocked by IP address because that would certainly block a lot of legitimate email as well.

Content Filtering

Early anti-spam products made decisions about spam emails using single word matches such as “Viagra” or foul language.  This quickly proved fruitless because spammers would simply vary the word slightly in each email, for example “v1agra” and “via.gra”.  Content filtering then improved to include databases of spam phrases and patterns and would assess more of the content in an email to determine if it was spam.
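To make the word-variation problem concrete, here is a small Python sketch of a filter that normalizes obfuscated words before matching them.  The substitution table, the one-word block list, and the function names are all invented for this example; real products use far larger pattern databases.

```python
import re

# Hypothetical table of look-alike characters that spammers swap in.
LOOKALIKES = str.maketrans(
    {"1": "i", "!": "i", "0": "o", "3": "e", "@": "a", "$": "s"}
)

# Illustrative block list only.
BLOCKED_WORDS = {"viagra"}


def normalize(token: str) -> str:
    """Lower-case a token, undo look-alike swaps, and drop non-letters."""
    token = token.lower().translate(LOOKALIKES)
    return re.sub(r"[^a-z]", "", token)


def contains_blocked_word(text: str) -> bool:
    """Return True if any whitespace-separated token normalizes to a blocked word."""
    return any(normalize(tok) in BLOCKED_WORDS for tok in text.split())


print(contains_blocked_word("Cheap v1agra here"))   # True
print(contains_blocked_word("Order via.gra today")) # True
print(contains_blocked_word("Meeting at 10am"))     # False
```

Even this toy version shows the trade-off: every normalization rule catches more variants, but also raises the chance of matching an innocent word.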

The weakness of this technique is the constant game of “catch up” that is being played as spammers adopt new strategies to sneak their content past anti-spam systems.  For example, just as content filtering was becoming very effective, spammers switched to putting all of the email text into an image file that the anti-spam system could not read.

Sender Verification

There are several “sender verification” standards such as Sender Policy Framework (SPF) and SenderID, each varying slightly but based on the same principle of using DNS records to verify that the sender of an email is authorized to send email for that domain name.
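The core of an SPF-style check can be sketched in a few lines of Python.  This is a heavily simplified illustration, not a complete SPF evaluator: it assumes the domain's SPF record has already been fetched from DNS and is passed in as a string, it only understands unqualified `ip4:` and `ip6:` terms, and it ignores everything else (`include`, `a`, `mx`, `redirect`, qualifiers, and the final `all` term).  The function name and the sample record are made up for the example.

```python
import ipaddress


def ip_matches_spf(record: str, sender_ip: str) -> bool:
    """Return True if sender_ip falls inside an ip4:/ip6: term of the record."""
    ip = ipaddress.ip_address(sender_ip)
    for term in record.split()[1:]:  # skip the leading "v=spf1"
        for prefix in ("ip4:", "ip6:"):
            if term.startswith(prefix):
                network = ipaddress.ip_network(term[len(prefix):], strict=False)
                if ip in network:
                    return True
    return False


# Hypothetical record for an example domain.
record = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.25 -all"

print(ip_matches_spf(record, "192.0.2.10"))   # True
print(ip_matches_spf(record, "203.0.113.5"))  # False
```

A message from `203.0.113.5` claiming to be from this domain would fail the check; what the receiving system does with that failure is a separate policy decision.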

There are a few reasons why this technique does not perform well on its own.  Firstly, uptake of the systems among email administrators is minimal.  Without everyone participating in such a scheme its effectiveness is diminished.  Secondly, it only verifies that the source of the email is authorized to send for a given domain name.  Spam sent through email systems that are inherently insecure and often exploited by spammers (such as the web-based email services mentioned earlier) comes from an authorized source, which makes performing sender verification on it nearly pointless.

Likely Spam vs Definitely Spam

As you can see above, no single anti-spam technique performs very well on its own.  However, when you combine a number of different techniques into a single system, with each technique applying a “likelihood” score to each email that is checked, the system can be quite effective.

For example, if an email is from an IP address that is not considered a likely spam source (no score increase), but contains spam-like content (score increased according to severity), and fails sender verification (increases score again), the combined “likelihood” score may reach the configured threshold for the system and cause the email to be treated as spam.
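The example above can be sketched as a simple additive score in Python.  The individual scores, the threshold of 5.0, and all of the names are invented for illustration; real products choose and tune their own weights.

```python
from dataclasses import dataclass

SPAM_THRESHOLD = 5.0  # hypothetical configured threshold


@dataclass
class CheckResult:
    name: str
    score: float  # how much this check adds to the "likelihood" score


def classify(results, threshold=SPAM_THRESHOLD):
    """Sum the scores from every check and compare the total to the threshold."""
    total = sum(r.score for r in results)
    return total, total >= threshold


results = [
    CheckResult("source_ip", 0.0),            # not a known spam source
    CheckResult("content", 3.5),              # spam-like content
    CheckResult("sender_verification", 2.0),  # failed verification
]

total, is_spam = classify(results)
print(total, is_spam)  # 5.5 True
```

Neither the content score nor the verification failure reaches the threshold alone; the email is only treated as spam once the two are added together.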

Choosing an Anti-Spam Solution

Keep all of the above in mind when you are considering an anti-spam solution for your organization.  It can be tempting to look at a “home brew” solution made up of individual systems dedicated to each technique, as these associates of mine did recently.  Aside from the administrative overhead, the overall effectiveness of such a system is going to be far lower than that of a proper multi-featured anti-spam solution.

4 Responses to “Anti-Spam Products Are More Than the Sum of Their Parts”

  1. Gregg Oldring Says:

    Very well written Paul. I often have to explain to senders that spam filters usually assess multiple factors. My descriptions are not nearly as clear or concise.

    Thanks!

  2. Paul Says:

    Hi Gregg, thanks for your comment, I’m glad you enjoyed the article.

  3. Barry Leiba Says:

    On sender verification, something you surely know, Paul, but that readers might misunderstand:
    This technique, whether done by IP address (as with SPF and Sender-ID) or by digital signature (as with DKIM), isn’t meant to be an anti-spam mechanism at all. No spam filter should give a message a better score because it passed an SPF or DKIM check, nor a worse one just because it failed it.

    These mechanisms validate that the sending domain is what it purports to be, are meant to be combined with some sort of reputation system, and are only useful in that regard. They can be very powerful white-listing mechanisms, when combined with a list of “known good” domains, preventing legitimate messages from, say, PayPal or Citibank from being mistakenly classified as spam. Conversely, given the knowledge that PayPal fully complies with SPF, or signs all its mail with DKIM, mail that purports to come from PayPal could have its spam score increased a great deal if it fails the test.

    The press often gets this wrong, reporting that, for example, SPF is supported by more spam domains than legitimate ones, implying that that’s a criticism of the technique. Quite the opposite: we’re very happy when spammers make a point of proving who they are. It ultimately makes it much easier to filter their mail.

    – Barry Leiba
    DKIM working group chair

  4. Dealing With New Spam Threats to Business Says:

    [...] the need to deploy serious protection for email spam.  A “bits and bobs” solution cobbled together from separate free components will not have the effectiveness of a comprehensive, integrated anti-spam product from a vendor [...]
