Spam Filter Blacklists and Whitelists
Written by Giselle Borg Olivier on September 11, 2008
Whitelists and their counterpart, blacklists, are features commonly used to filter spam. Whitelisting is a method used to classify users’ email addresses as legitimate. Often, email addresses that are saved within one’s address book are automatically considered to be ‘whitelisted’. Spam filtering can be configured to sort messages based on a variety of criteria, including the sender’s email address, specific words in the subject or message body, or the type of attachment that accompanies the message.
Spam filters that come with email clients have both whitelists and blacklists of senders, as well as keywords to look for in emails. Mail from whitelisted email addresses, domains, and/or IP addresses will always be allowed. If a whitelist is exclusive, only email from those on the whitelist will get through. If it is not exclusive, the whitelist simply prevents email from listed senders from being deleted or sent to the junk mail folder by the spam filter. Usually it is only end-users, not internet service providers (ISPs) or email services, who would set a spam filter to delete all emails from sources not on the whitelist.
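The difference between an exclusive and a non-exclusive whitelist can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the SenderLists class, the classify function and the three outcome labels are hypothetical names, not taken from any particular email client, and list entries are assumed to be stored in lowercase.

```python
from dataclasses import dataclass, field


@dataclass
class SenderLists:
    """Hypothetical container for one user's whitelist and blacklist."""
    whitelist: set = field(default_factory=set)  # lowercase addresses or domains
    blacklist: set = field(default_factory=set)  # lowercase addresses or domains
    exclusive: bool = False  # True: only whitelisted senders get through


def _matches(sender: str, entries: set) -> bool:
    """True if the full address or its domain appears in entries."""
    sender = sender.lower()
    domain = sender.rpartition("@")[2]
    return sender in entries or domain in entries


def classify(sender: str, lists: SenderLists) -> str:
    """Return 'inbox', 'reject', or 'filter' (hand over to content filtering)."""
    if _matches(sender, lists.whitelist):
        return "inbox"  # whitelisted mail is always allowed
    if lists.exclusive or _matches(sender, lists.blacklist):
        return "reject"  # exclusive whitelist blocks everyone else
    return "filter"  # unknown sender: let the spam filter decide
```

With the exclusive flag left off, an unknown sender falls through to the normal spam filter; switching it on rejects every sender that is not on the whitelist.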
Email servers can also query commercial blacklists or whitelists, such as the MAPS Realtime Blackhole List (RBL), Spamhaus, Bonded Sender, or Habeas, to determine whether the sending server’s IP address is present on the list. Once the IP address has been cleared, the email message can be transferred to the recipient’s inbox. If the IP address is found on a blacklist, the receiving server can reject the email by refusing the transmission and terminating the connection. In short, email from servers with blacklisted IP addresses tends to be rejected, while email from servers with whitelisted IP addresses is usually accepted and delivered.
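Many such lists are published over DNS: the receiving server reverses the octets of the sender’s IPv4 address, appends the list’s zone name, and looks the result up. An answer means the address is listed, and a "no such name" reply means it is not. The sketch below assumes that convention. The zone dnsbl.example and the function names are placeholders rather than a real service or API, and a real deployment would need to follow the usage terms of whichever list it queries.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: IPv4 octets reversed, then the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # also validates the address
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """True if the lookup returns an address, False if the name does not resolve.

    Simplification: any resolution failure is treated as "not listed".
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

The resolve parameter is there so the lookup can be swapped for a stub when testing without network access. A server would typically call something like is_listed(client_ip, "dnsbl.example") when a connection arrives and refuse the transmission if it returns True.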
Using whitelists and blacklists can assist in blocking unwanted messages and allowing wanted messages to get through, but they are not always accurate. Email whitelists are used to reduce the incidence of false positives, often based on the assumption that most legitimate mail will be from a relatively small and fixed set of senders. To block a high percentage of spam, email filters have to be continuously updated, because spammers create new addresses to send from or use new keywords that can allow their messages to slip through. Address lists of habitual spammers, known as ‘blocklists’, are continuously updated by various organizations and ISPs. Mail from blocklist addresses is rejected at the mail server.
Some Internet service providers have whitelists that they use to filter email to be delivered to their customers. ISPs receive requests from legitimate companies to add them to the ISP whitelist of companies. In order to be approved for whitelisting, companies are either required to pay or else they must pass a series of tests to prove that they are not sending out spam emails.
Non-commercial whitelists are operated by various non-profit organizations, ISPs and other entities interested in blocking spam. Rather than paying fees, the sender must pass a series of tests; for example, the sender’s email server must not be an open relay and must have a static IP address. The operator of the whitelist may remove a server from the list if complaints are received from users.
Commercial whitelists are a system by which an ISP allows someone to bypass spam filters when sending email messages to its subscribers, in return for a pre-paid fee (either an annual fee or a per-message fee). A sender can then be more confident that his messages have reached their recipients without being blocked, or having links or images stripped out of them, by spam filters. The purpose of commercial whitelists is to allow companies to reliably reach their customers by email.
Commercial whitelists offered today are Bonded Sender and the Habeas Users List (HUL). ISIPP’s Accreditation Database (IADB) may also be used as a whitelist. On the other hand, being listed on the MAPS RBL blacklist can mean lost revenue and severe problems for a company, because its emails would be classified as spam, which could damage its reputation. As spammers became more militant in their quest, blacklists became less reliable, and ISPs found that blacklists started leading to false positives (legitimate mail that is marked as spam).
Meanwhile, email marketers realized that the best way to avoid being wrongly blacklisted was to use an opt-in system, so that well-maintained blacklists would recognize their messages as ethical and not list them. By engaging in responsible email practices, email marketers help to ensure that their emails are delivered to the intended users. Because blacklists proved somewhat unreliable, ISPs stopped basing their blocking decisions solely on them, which in turn came as a relief to those businesses whose email marketing efforts had been wrongly intercepted and blacklisted.
It is worth noting that nowadays most large ISPs do not rely solely on blacklists and whitelists when it comes to filtering email and determining what is spam. In order to analyze the content more effectively and not trash a real message, sophisticated spam filters use artificial intelligence (AI) techniques, such as Bayesian filtering, that look at the words in a message and weigh how strongly each one is associated with spam, thus providing a more holistic approach to the war on spam.
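To give a feel for how a Bayesian filter weighs words, here is a deliberately small naive Bayes sketch. It is an illustration only, not how any particular ISP’s filter is built: the class name, the whitespace tokenizer and the add-one smoothing are assumptions chosen for brevity, and the filter must be trained on at least one spam and one legitimate ("ham") message before it is used.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Toy word-based naive Bayes classifier with labels 'spam' and 'ham'."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Record the words of one message known to be spam or ham."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def _log_score(self, words, label: str) -> float:
        counts = self.word_counts[label]
        total_words = sum(counts.values())
        vocabulary = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        # Prior: the share of training messages that carried this label.
        score = math.log(self.message_counts[label] / sum(self.message_counts.values()))
        for word in words:
            # Add-one smoothing so an unseen word does not zero out the product.
            score += math.log((counts[word] + 1) / (total_words + vocabulary))
        return score

    def is_spam(self, text: str) -> bool:
        """True if the message is more probable under the spam model."""
        words = text.lower().split()
        return self._log_score(words, "spam") > self._log_score(words, "ham")
```

Because the scores are learned from examples rather than from a fixed keyword list, retraining on newly reported spam lets such a filter adapt when spammers change their wording.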
However, certain smaller ISPs and spam filtering programs do rely on these lists, so the lists cannot be ignored. As a sort of middle ground, a greylist is sometimes created which contains entries that are temporarily blocked or temporarily allowed. Greylist items may be reviewed or further tested for inclusion in a blacklist or whitelist.
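A greylist in the sense described above (temporary entries awaiting review) might be modelled as follows. The ListManager class, its method names and the one-hour default lifetime are all hypothetical, chosen only to illustrate the idea. Timestamps are passed in explicitly, in seconds, so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class ListManager:
    """Hypothetical store for a whitelist, a blacklist and a greylist."""
    whitelist: set = field(default_factory=set)
    blacklist: set = field(default_factory=set)
    # Maps sender -> (temporary action, expiry time in seconds).
    greylist: dict = field(default_factory=dict)

    def greylist_sender(self, sender: str, action: str, now: float, ttl: float = 3600.0) -> None:
        """Temporarily 'allow' or 'block' a sender until now + ttl."""
        if action not in ("allow", "block"):
            raise ValueError("action must be 'allow' or 'block'")
        self.greylist[sender] = (action, now + ttl)

    def decide(self, sender: str, now: float) -> str:
        """Return 'allow', 'block', or 'review' for unknown or expired senders."""
        if sender in self.whitelist:
            return "allow"
        if sender in self.blacklist:
            return "block"
        entry = self.greylist.get(sender)
        if entry is not None and now < entry[1]:
            return entry[0]
        return "review"

    def resolve(self, sender: str, is_spammer: bool) -> None:
        """After review, move a sender from the greylist to a permanent list."""
        self.greylist.pop(sender, None)
        (self.blacklist if is_spammer else self.whitelist).add(sender)
```

Permanent lists take priority over the greylist here, so a reviewed sender is never affected by a stale temporary entry.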