Revision 2 as of 2007-11-01 20:52:14

MemberManual / Email / SpamAssassin

This page describes how to use SpamAssassin to keep junk email under control.

Introduction

You will probably want to set up SpamAssassin to detect junk e-mail for you. [http://spamassassin.org/ SpamAssassin] is a program for categorizing e-mail as spam based on a wide range of criteria. It indicates its decisions by adding special headers to messages.

Please note that we never reject spam before it reaches your filtering rules. It is up to you to decide how to classify the email that arrives in your inbox.

Enabling spam detection

We use a custom tool called setsa to determine whether your email should be run through SpamAssassin. To enable SpamAssassin for mail to your UNIX account, run

setsa on

To later disable it, run

setsa off

To check whether you've enabled it or not, run

setsa

You can similarly enable or disable SpamAssassin for a virtual mailbox address by adding it as the first argument to setsa; for example, setsa user@domain.com on enables SpamAssassin for user@domain.com if you have DomainTool permissions for domain.com.

Moving spam email to a different folder

The above procedure only asks SpamAssassin to examine your mail and add extra headers indicating its verdict, spam or legit. To use these headers to move junk mail to a folder called Spam in your IMAP mailbox, copy the template /etc/.forward to ~/.public/.forward. This is an Exim filter that looks for SpamAssassin headers that indicate spamhood. The filter does not create the Spam folder for you, so create it yourself before relying on it. If you don't use IMAP or prefer another scheme, you can modify the template to save spam somewhere else.

(If you already have a ~/.public/.forward file because you forward all of your mail to another account elsewhere, then you can ignore this section. You should use that e-mail provider's spam filtering services.)

By default, SpamAssassin flags a message as spam when its score is 5.0 or higher. If that threshold doesn't suit you, you can write your own filter based on the X-Spam-Level: header, which holds one asterisk for each full point of the score. As an example, you can see NathanKennedy's .forward file at the end of this page.
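To make the header concrete, here is a small Python sketch (it is not part of our mail setup) that reads X-Spam-Level: from a raw message and counts its asterisks. It assumes the header holds one asterisk per full point of score, which is what the example filter on this page relies on. The function name spam_level and the sample message are made up for illustration.

```python
from email import message_from_string


def spam_level(raw_message: str) -> int:
    """Return the number of leading asterisks in X-Spam-Level (0 if absent)."""
    msg = message_from_string(raw_message)
    value = (msg.get("X-Spam-Level") or "").strip()
    stars = 0
    for ch in value:
        if ch != "*":
            break
        stars += 1
    return stars


# Made-up sample message, for illustration only.
SAMPLE = (
    "From: someone@example.com\n"
    "Subject: cheap watches\n"
    "X-Spam-Level: ****\n"
    "\n"
    "body text\n"
)

if __name__ == "__main__":
    level = spam_level(SAMPLE)
    print(level, "stars; at or above a threshold of 3:", level >= 3)
```

Running something like this over a folder of known spam and known ham is one way to pick a threshold before you edit your filter.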

Training

One way that SpamAssassin spots spam is by using statistical (Bayesian) analysis. This requires lots of training data to work properly.

Sometimes this analysis will make mistakes, and you'll want to perform the electronic equivalent of slapping it with a newspaper. The way to do that is to deposit misclassified mail in special system-wide IMAP folders, one called SiteSpam for spam that SpamAssassin missed and one called SiteHam for good messages that were erroneously marked as spam.

If you ever run into this situation, here's how you can feed our system-wide trainer:

  1. First, this is only going to work if you are using IMAP. If you're not, or if you have other sources of spam or ham that you'd like handled specially, place a support request on [https://members2.hcoop.net/portal/ the portal].

  2. Use your IMAP client's "subscribe" feature to subscribe to SiteSpam and/or SiteHam, which should appear in the SpamAssassin mailbox inside the shared tree.

  3. When you want a message to be used as an example of spam or ham, place a copy of it in the appropriate folder.

  4. Every five minutes, our faithful spamhound will sniff these folders, update its data, and clear their contents.

If you would like to automate this process somewhat, check out FeedingSpamAssassin. For the curious and the sysadmins out there, SpamAssassinAdmin gives more details on how we set this up.
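If you prefer scripting to dragging messages around in a mail client, step 3 above can be sketched with Python's standard imaplib. This is only an illustration, not the method described on FeedingSpamAssassin. The mailbox names below are hypothetical, because the real path depends on how the shared tree shows up for your account, so substitute the names you actually subscribed to. The host name in the usage comment is likewise a placeholder.

```python
import imaplib

# Hypothetical mailbox names: check how SiteSpam and SiteHam appear in the
# shared tree for your account and adjust these strings to match.
TRAINING_FOLDERS = {
    "spam": "shared.SpamAssassin.SiteSpam",
    "ham": "shared.SpamAssassin.SiteHam",
}


def copy_for_training(conn: imaplib.IMAP4, uid: str, kind: str) -> bool:
    """Copy one message (by UID) from the selected mailbox to a training folder.

    `conn` must be a logged-in connection with a mailbox already selected.
    Returns True if the server answered OK.
    """
    if kind not in TRAINING_FOLDERS:
        raise ValueError("kind must be 'spam' or 'ham'")
    status, _data = conn.uid("COPY", uid, TRAINING_FOLDERS[kind])
    return status == "OK"


# Usage sketch (placeholder host, credentials, and UID):
#   conn = imaplib.IMAP4_SSL("mail.example.org")
#   conn.login("username", "password")
#   conn.select("INBOX")
#   copy_for_training(conn, "1234", "spam")
#   conn.logout()
```

The function copies rather than moves, which matches the instruction to place a copy in the training folder. The original stays where it was.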

Example .forward file

It is possible to set up custom filters to do fancy things based on the X-Spam-Level: header. Here is NathanKennedy's ~/.public/.forward file. He finds that the default threshold of 5.0 is too wimpy and lets too much spam into his inbox. Virtually none of the ham he gets scores more than 3.0, whereas a lot of spam scores less than 5.0, so he'd rather have anything scoring 3.0 or higher go to his Junk folder. At the same time, he doesn't want to waste time, cycles, disk space or bandwidth on spam scoring 9.0 or higher. Most of his spam does score that high, and this filter sends it straight to /dev/null (immediately disposed of).

Finally, he has all HCoop list email go into a special HCoop folder.

Without further ado:

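# Exim filter
# (Exim only treats this file as a filter if it starts with the line above.)
# Nathan's exim filter

# Log what the filter does to this file.
logfile $home/spamlog

# HCoop list mail goes to its own folder, whatever its spam score.
if $header_subject contains "[HCoop"
then
    save $home/Maildir/.HCoop/
    finish
endif

# If SpamAssassin added a level header, route on the number of stars.
if
    "${if def:h_X-Spam-Level {def}{undef}}" is "def"
then
    # Nine or more stars: throw the message away.
    if $h_X-Spam-Level: begins "\*\*\*\*\*\*\*\*\*"
    then save "/dev/null" 660
    else
      # Three to eight stars: file it under Junk.
      if $h_X-Spam-Level: begins "\*\*\*"
      then save $home/Maildir/.Junk/
      endif
    endif
    # Fewer than three stars: nothing was saved, so normal delivery happens.
    finish
endif

# Fallback: no level header, but SpamAssassin set its spam flag.
if
    "${if def:h_X-Spam-Flag {def}{undef}}" is "def"
then
    save $home/Maildir/.Junk/
    finish
endif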
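
If you adapt this filter, it can help to check your reading of it against a few sample messages before trusting it with real mail. The Python sketch below mirrors the decision order of the filter above as I read it. It is not something Exim runs, and the function name route and its return labels are made up for illustration. It assumes that lowercase Exim filter conditions such as contains and begins ignore case, and that a message for which the filter saves nothing is delivered normally.

```python
from email import message_from_string


def route(raw_message: str) -> str:
    """Return 'HCoop', 'discard', 'Junk', or 'inbox' for a raw message."""
    msg = message_from_string(raw_message)

    # List mail is filed first, regardless of spam score.
    subject = msg.get("Subject") or ""
    if "[hcoop" in subject.lower():
        return "HCoop"

    # With a level header present, only the star count matters.
    level = msg.get("X-Spam-Level")
    if level is not None:
        level = level.strip()
        if level.startswith("*" * 9):
            return "discard"
        if level.startswith("*" * 3):
            return "Junk"
        return "inbox"  # under three stars: normal delivery

    # No level header: fall back to the spam flag.
    if msg.get("X-Spam-Flag") is not None:
        return "Junk"
    return "inbox"


if __name__ == "__main__":
    # Made-up sample message, for illustration only.
    print(route("Subject: hello\nX-Spam-Level: ****\n\nbody\n"))
```

Note that the X-Spam-Flag check is only reached when X-Spam-Level is missing, because the level branch always ends with finish.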