Why you need to keep false-positives to a minimum

Written by Brett Callow on February 5, 2009

Fact: spam cannot be blocked with 100% accuracy. No matter which anti-spam product you use and no matter how many hours you spend configuring it, it’s inevitable that some spam will slip through the net and that some valid email will end up being blocked. While a few missed spam emails may not be too much of a problem, a few blocked valid emails can be extremely damaging. I mean, who wants to deal with a company that doesn’t reply to emails? But the consequences of wrongly trashed emails can be worse than simply lost business and damaged customer relationships.

In 2007, the Washington Post reported the case of Franklin D. Azar & Associates. The basics of the story are as follows. The law firm ramped up the settings on its anti-spam appliance in order to block pornographic emails which had been reaching users’ desktops. The appliance started blocking the unwanted messages, but it also started blocking emails from the United States District Court for the District of Colorado, which caused Azar & Associates to miss a court hearing. The judge subsequently ordered that Azar & Associates pay the costs of the opposing counsel, who did appear. The judge commented that “It is incumbent upon attorneys to adopt internal office procedures that ensure the court’s notices and orders are brought to their attention once they have been received,” and “That it would have been a very simple task to whitelist the United States District Court for the District of Colorado’s domain name of “cod.uscourts.gov” to ensure that such emails with this domain name would always be received.”

Yup, the judge was right when he said that it would have been a “very simple task to whitelist the United States District Court for the District of Colorado’s domain name.” What wouldn’t have been so simple, however, is whitelisting each and every one of the law firm’s contacts. Creating and managing whitelists can be an extremely time-consuming process. Exchange Server 2007 eases the burden somewhat with a feature known as Safelist Aggregation. From Microsoft:

          In Microsoft Exchange Server 2007, the term safelist aggregation refers to a set of anti-spam functionality that is shared across Microsoft Office Outlook and Exchange. This functionality collects data from the anti-spam Safe Recipients Lists or Safe Senders Lists and contact data that Outlook users configure and makes this data available to the anti-spam agents on the computer that has the Edge Transport server role installed. Safelist aggregation can help reduce the instances of false-positives in anti-spam filtering that is performed by the Edge Transport server.

But while Safelist Aggregation does indeed make things somewhat easier, it’s certainly not a perfect solution as it is reliant upon users having whitelisted their contacts – and that’s something they often do not do. Furthermore, even if users have whitelisted their contacts, they may have whitelisted domains rather than addresses and you’ll probably not want to aggregate that data (you don’t want emails from Hotmail addresses to be unfiltered, do you?).
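To illustrate the point about domains versus addresses, here is a minimal sketch of filtering user safelists before aggregating them, so that a bare domain such as hotmail.com never reaches the shared whitelist. It is not how Exchange's Safelist Aggregation works internally; the function names and the input format (a dictionary of per-user lists) are assumptions made for the example, and the address check is deliberately simple rather than a full RFC-compliant validator.

```python
def is_full_address(entry: str) -> bool:
    """Return True only for entries that look like local-part@domain."""
    entry = entry.strip()
    if entry.count("@") != 1:
        return False  # bare domains such as "hotmail.com" have no "@"
    local, domain = entry.split("@")
    # "@hotmail.com" has an empty local part and is really a domain entry.
    if not local:
        return False
    return "." in domain and not domain.startswith(".") and not domain.endswith(".")


def aggregate_safelists(user_lists: dict) -> set:
    """Merge per-user safe sender lists, keeping only full addresses.

    `user_lists` is a hypothetical structure: user name -> list of entries.
    """
    merged = set()
    for entries in user_lists.values():
        for entry in entries:
            if is_full_address(entry):
                merged.add(entry.strip().lower())
    return merged


if __name__ == "__main__":
    lists = {
        "alice": ["bob@example.com", "hotmail.com"],
        "carol": ["@hotmail.com", "dave@example.org"],
    }
    print(sorted(aggregate_safelists(lists)))
```

Dropping domain entries at aggregation time means a single user's overly broad whitelist entry cannot open the filter for everyone.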

There is, however, an alternative. Some anti-spam products include a feature which, when turned on, automatically whitelists the address of anyone whose email a user has replied to. This will not eliminate the possibility that valid emails will be blocked, but it will make it substantially less likely. When the time comes for you to go shopping for an anti-spam solution, this is certainly a feature which you should add to your “must-have” list.
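As a rough sketch of how such a feature might work (not a description of any particular product), the example below watches outbound mail and whitelists the recipients of any message that carries an In-Reply-To header, which mail clients normally set on replies. The class name, its methods and the in-memory set are assumptions for illustration; a real product would persist the list and probably expire old entries.

```python
from email.message import EmailMessage
from email.utils import getaddresses


class ReplyWhitelist:
    """Hypothetical auto-whitelist built from the replies users send."""

    def __init__(self):
        self._addresses = set()

    def record_outbound(self, msg: EmailMessage) -> None:
        """Whitelist the recipients of an outbound message if it is a reply."""
        # Assumption: a message is treated as a reply only when it
        # carries an In-Reply-To header.
        if msg.get("In-Reply-To") is None:
            return
        recipients = msg.get_all("To", []) + msg.get_all("Cc", [])
        for _name, address in getaddresses(recipients):
            if "@" in address:
                self._addresses.add(address.lower())

    def is_whitelisted(self, sender: str) -> bool:
        """Check an inbound sender address against the whitelist."""
        return sender.strip().lower() in self._addresses


if __name__ == "__main__":
    reply = EmailMessage()
    reply["From"] = "lawyer@firm.example"
    reply["To"] = "Clerk <clerk@court.example>"
    reply["In-Reply-To"] = "<notice-1@court.example>"

    whitelist = ReplyWhitelist()
    whitelist.record_outbound(reply)
    print(whitelist.is_whitelisted("clerk@court.example"))
```

Because only full addresses are recorded, this approach avoids the domain-wide whitelisting problem described above.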

Comments

Steve Freegard February 6, 2009

Interesting article. The root problem is either software that discards spam that exceeds a certain threshold, or a quarantine that isn’t checked or has so much mail in it that the user can’t see the false-positive amongst all the spam due to the low signal-to-noise ratio common to a lot of spam filters.

I’ve long come to the conclusion that with spam volume exceeding 80% of input at many sites now – the far better way to deal with this is to do anti-spam at the SMTP phase rather than post-queue; if it scores above one threshold then tag the message (e.g. add [Spam] to the subject and an X-Spam-Status: YES header) and deliver it to the user and if it exceeds another higher threshold then reject it outright.

This is far better as it uses SMTP like it was designed – the sending server has the burden of responsibility to deliver the message; if you reject the message at the SMTP level the sending server has to generate a non-delivery receipt (NDR) to the sender informing them the message wasn’t delivered. If the message was from a ‘real’ user then they know it wasn’t delivered and can take action (the NDR will contain the rejection message that you sent – this could include a URL for whitelisting or an unfiltered address etc.), if the message was spam then the spammer won’t care and will move on. No backscatter is caused by this method, false-positives are no longer a big disaster and it avoids the issues referred to in your article.
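The two-threshold approach described above can be sketched roughly as follows. This is only an illustration of the decision logic, not a working SMTP filter: the threshold values, the function name, the returned dictionary and the whitelisting URL are all assumptions, and the spam score is taken as given from whatever scoring engine is in use.

```python
# Hypothetical thresholds; real values depend on the scoring engine.
TAG_THRESHOLD = 5.0
REJECT_THRESHOLD = 15.0

# Hypothetical rejection text; the URL is a placeholder for a whitelisting page.
REJECT_REPLY = (
    "550 5.7.1 Message rejected as spam; "
    "to request whitelisting see https://example.com/whitelist"
)


def smtp_time_decision(score: float, subject: str) -> dict:
    """Decide what to do with a message while the SMTP session is still open."""
    if score > REJECT_THRESHOLD:
        # Rejecting in-session leaves the sending server responsible
        # for telling the real sender about the failure.
        return {"action": "reject", "smtp_reply": REJECT_REPLY}
    if score > TAG_THRESHOLD:
        # Borderline mail is still delivered, but marked for the user.
        return {
            "action": "deliver",
            "subject": "[Spam] " + subject,
            "headers": {"X-Spam-Status": "YES"},
        }
    return {"action": "deliver", "subject": subject, "headers": {}}


if __name__ == "__main__":
    for score in (1.0, 7.5, 20.0):
        print(score, smtp_time_decision(score, "Hearing notice")["action"])
```

Nothing is silently discarded in this scheme: a message is either delivered (possibly tagged) or refused with an explanation that reaches a legitimate sender through their own server's NDR.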

My only wishes are that Exchange could be configured to automatically move messages to the Junk Folder in Outlook based on incoming headers without having to write custom event sinks, and that it didn’t default to mangling incoming NDRs into its own format, losing a lot of the relevant data in the process (much like the ‘Friendly HTTP error messages’ setting in IE).

I agree with your sentiments about aggregated whitelists – I’ve seen many users add ‘hotmail.com’ to them and then complain about the onslaught of spam that then follows. When aggregating lists now, I only accept full e-mail addresses (e.g. user@domain.com format); not sure if the Safelist Aggregation in Exchange can do that, but if it can, it’s definitely the way to go.

Zerolove February 6, 2009

So true, nothing worse than B2B emails being marked as spam. This is a big no-no at my job. But on a side note, I deal with some very large law offices. One is a bankruptcy office, and when the bankruptcy courts went digital and started sending auto replies back to the clients, they used xxx@something.uscourts.gov. This has now changed, for obvious reasons.

