Diff for "SpamAssassinAdmin"

Differences between revisions 6 and 14 (spanning 8 versions)
Revision 6 as of 2007-09-16 15:40:14
Size: 4644
Editor: MichaelOlson
Comment: Fix mistaken assumptions my earlier attempt
Revision 14 as of 2010-11-29 17:33:53
Size: 4745
Editor: ClintonEbadi
Comment:
Line 1: Line 1:
{{{#!wiki warning
'''This does not adequately describe our current setup'''
}}}
Line 5: Line 9:
 1. Perform the following as `spamd`:
   1. `cd ~spamd`
   1. `maildirmake -S Maildir`, to create the shared Spam``Assassin mailbox.
   1. `maildirmake -f SiteSpam -s write Maildir`, to create a writable folder for misclassified spam (or if extracting from a tarball, make sure it has the sticky bit set by doing {{{chmod +s Maildir/.SiteSpam/*}}}).
   1. `maildirmake -f SiteHam -s write Maildir`, to create a writable folder for misclassified ham (or if extracting from a tarball, make sure it has the sticky bit set by doing {{{chmod +s Maildir/.SiteHam/*}}}).
 1. Add the following to `~spamd/.crontab` to learn from and delete messages in those shared folders every five minutes (changing MACHINENAME to be the name of the local machine):
 1. Add the following to `~spamd/.crontab` to learn from and delete messages in those shared folders every five minutes (changing MACHINENAME to be the name of the local machine): {{{#!wiki note
The crontab and other files should be stored in a git repo rather than copied verbatim from this page
}}}
Line 12: Line 13:
MAILTO=logs@MACHINENAME.hcoop.net
PATH=/afs/hcoop.net/common/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
MAILTO=logs@hopper.hcoop.net
Line 14: Line 16:
# NOTE: Once you are certain that sa-learn is working, add "> /dev/null" after it, but before "; find"
Line 16: Line 17:
0,5,10,15,20,25,30,35,40,45,50,55 * * * * sa-learn --spam --dir /var/local/lib/spamd/Maildir/.SiteSpam/cur ; find /var/local/lib/spamd/Maildir/.SiteSpam/cur -type f -delete
0,10,20,30,40,50 * * * * /var/local/lib/spamd/scripts/learn-spam-wrapper --spam
Line 18: Line 19:
0,5,10,15,20,25,30,35,40,45,50,55 * * * * sa-learn --ham --dir /var/local/lib/spamd/Maildir/.SiteHam/cur ; find /var/local/lib/spamd/Maildir/.SiteHam/cur -type f -delete
# Remove any artifacts that were submitted while running sa-learn
3 3 * * * find /var/local/lib/spamd/Maildir/.SiteHam/tmp -type f -delete ; find /var/local/lib/spamd/Maildir/.SiteSpam/tmp -type f -delete
5,15,25,35,45,55 * * * * /var/local/lib/spamd/scripts/learn-spam-wrapper --ham
# Remove any tmp cruft
3 3 * * * run-in-pagsh --fg clean find /var/local/lib/spamd/Maildir/.SiteHam/tmp -noleaf -type f -delete ; run-in-pagsh --fg clean find /var/local/lib/spamd/Maildir/.SiteSpam/tmp -noleaf -type f -delete
# Remove stale lock file
3 4 * * * find /var/local/lib/spamd/ -mindepth 1 -maxdepth 1 -type f -name '.lock' -ctime +2 -delete
Line 23: Line 27:
 1. Modify `/etc/spamassassin/local.cf` with the directive:
    {{{
# Location of bayes data
bayes_path /var/local/lib/spamd/bayes/.spamassassin/bayes

# Fix bayes permissions
bayes_file_mode 0770

# Directives from old setup
# [any custom stuff from the old /etc/spamassassin/local.cf that you want to keep]
 1. Copy the `learn-spam` script from the `spam` directory of the hcoop "misc" repository into the directory `~spamd/scripts`.
 1. Check out a copy of the hcoop spamassassin configuration into `/etc/spamassassin` from git. If the installed version of spamassassin is much newer than the version the configuration was created against, diff the two and merge any relevant changes into the git repository.
   {{{
cd /etc && git clone /afs/hcoop.net/user/h/hc/hcoop/.hcoop-git/config/spamassassin.git/
Line 34: Line 32:
 1. Modify `/etc/default/spamassassin` by setting `OPTIONS` and `ENABLED` as follows. The `-x` prevents `spamd` from trying to look for per-user configuration, which would be silly because it always runs as the same user here. Without this flag, the cron job triggered every 5 minutes would log an error message, which would lead to an e-mail being sent to the `spamd` user.
 1. Modify `/etc/default/spamassassin` by setting `OPTIONS` and `ENABLED` as follows. The `-x` prevents `spamd` from trying to look for per-user configuration, which would be silly because it always runs as the same user here. Without this flag, the cron job triggered every 5 minutes would log an error message, which would lead to an e-mail being sent to the `spamd` user. {{{#!wiki note
This also should be stored in a git repository, but is not. Check the current spamd server for the current values if needed.
}}}
Line 39: Line 39:
OPTIONS="--create-prefs --max-children 5 --helper-home-dir=/var/local/lib/spamd -u spamd -x -s /var/log/spamd.log"
OPTIONS="--create-prefs --max-children 18 --helper-home-dir=/var/local/lib/spamd -u spamd -x -s /var/log/spamd.log -A 69.90.123.67,127.0.0.1 -i 0.0.0.0"
Line 42: Line 42:

CRON=1
Line 60: Line 62:

{{{#!wiki caution
The following tasks would be done on whichever machine runs IMAP, and are mostly wrong.
}}}

Line 71: Line 79:
CategoryObsolete

This does not adequately describe our current setup

Here's how we set up our site-wide SpamAssassin bayes database, including the ability for users to train it.

  1. Create a new user spamd with home /var/local/lib/spamd.

  2. Add "spamd" to /etc/cron.allow.

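  3. Add the following to ~spamd/.crontab to learn from and delete messages in the shared SiteSpam and SiteHam folders. The spam and ham jobs alternate at five-minute intervals, so each folder is processed every ten minutes. Change the host name in MAILTO to the name of the local machine: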

    The crontab and other files should be stored in a git repo rather than copied verbatim from this page

    • PATH=/afs/hcoop.net/common/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
      MAILTO=logs@hopper.hcoop.net
      
      # Learn from submitted spam
      0,10,20,30,40,50 * * * * /var/local/lib/spamd/scripts/learn-spam-wrapper --spam
      # Learn from submitted ham
      5,15,25,35,45,55 * * * * /var/local/lib/spamd/scripts/learn-spam-wrapper --ham
      # Remove any tmp cruft
      3 3 * * * run-in-pagsh --fg clean find /var/local/lib/spamd/Maildir/.SiteHam/tmp -noleaf -type f -delete ; run-in-pagsh --fg clean find /var/local/lib/spamd/Maildir/.SiteSpam/tmp -noleaf -type f -delete
      # Remove stale lock file
      3 4 * * * find /var/local/lib/spamd/ -mindepth 1 -maxdepth 1 -type f -name '.lock' -ctime +2 -delete

      Be sure there's a newline after the last line, or it won't be processed.

  4. Copy the learn-spam script from the spam directory of the hcoop "misc" repository into the directory ~spamd/scripts.

  5. Check out a copy of the hcoop spamassassin configuration into /etc/spamassassin from git. If the installed version of spamassassin is much newer than the version the configuration was created against, diff the two and merge any relevant changes into the git repository.

    • cd /etc && git clone /afs/hcoop.net/user/h/hc/hcoop/.hcoop-git/config/spamassassin.git/
  6. Modify /etc/default/spamassassin by setting OPTIONS and ENABLED as follows. The -x prevents spamd from trying to look for per-user configuration, which would be silly because it always runs as the same user here. Without this flag, the cron job triggered every 5 minutes would log an error message, which would lead to an e-mail being sent to the spamd user.

    This also should be stored in a git repository, but is not. Check the current spamd server for the current values if needed.

    • # Change to one to enable spamd
      ENABLED=1
      
      OPTIONS="--create-prefs --max-children 18 --helper-home-dir=/var/local/lib/spamd -u spamd -x -s /var/log/spamd.log -A 69.90.123.67,127.0.0.1 -i 0.0.0.0"
      
      PIDFILE="/var/local/lib/spamd/pid"
      
      CRON=1
  7. Make a file called /etc/logrotate.d/spamd with the following contents.

    • /var/log/spamd.log {
              weekly
              missingok
              create 0640 root adm
              rotate 4
              compress
              delaycompress
              sharedscripts
              postrotate
                      [ -f '/var/local/lib/spamd/pid' ] && (kill -HUP `cat /var/local/lib/spamd/pid`) || exit 0
              endscript
      }
  8. Start the daemon by doing /etc/init.d/spamassassin start. Check /var/log/spamd.log to be sure that it started OK.

  9. Install the .crontab entries that you wrote earlier by doing crontab -u spamd ~spamd/.crontab as root. Do this every time that you make changes to ~spamd/.crontab.
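
The crontab step above warns that a file without a trailing newline will not have its last line processed. As a minimal sketch, assuming a POSIX shell, that check could be automated before installing the file. The function names `ends_with_newline` and `install_spamd_crontab` are hypothetical and are not part of the hcoop scripts; only `crontab -u spamd FILE` comes from the step above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the hcoop repositories).

# Succeeds only if the file is non-empty and its final byte is a newline.
# Command substitution strips a trailing newline, so the captured text is
# empty exactly when the last byte is a newline.
ends_with_newline() {
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Installs the given file as the spamd user's crontab (must run as root),
# but refuses if the trailing newline is missing.
install_spamd_crontab() {
    crontab_file=$1
    if ends_with_newline "$crontab_file"; then
        crontab -u spamd "$crontab_file"
    else
        echo "refusing to install $crontab_file: no trailing newline" >&2
        return 1
    fi
}
```

Run as root, this would be called as `install_spamd_crontab ~spamd/.crontab` after each change to the file.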

The following tasks would be done on whichever machine runs IMAP, and are mostly wrong.

  1. Edit /etc/courier/shared/index as follows, being sure to separate each column with a single TAB character. The second column is UID, and third column is GID -- consult /etc/passwd and /etc/group to make these match the spamd user and group.

    • spamd   116     119     /var/local/lib/spamd
  2. Restart courier's IMAP process: runsv restart courier-imap

  3. Test by checking to see if you can access shared.SpamAssassin.SiteHam and shared.SpamAssassin.SiteSpam from IMAP. If not, do maildirmake --add SpamAssassin=~spamd/Maildir ~/Maildir as your normal user from the machine that does courier (and presumably spamassassin as well). You might need to replace ~ with ~USERNAME if you are using sudo to do this, where USERNAME is your normal username.

  4. Now copy some spammy mail into the SiteSpam directory, wait for the next cron run (up to ten minutes with the schedule above), and check to see if the mail got learned and deleted.

  5. If so, edit ~spamd/.crontab to redirect the output of sa-learn to /dev/null, and run crontab as specified earlier to propagate this change.
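
The "learned and deleted" check in the test step can be sketched in shell, assuming the learning job removes each message from the folder's cur directory once it has been processed, as the crontab comments suggest. The helper names `pending_messages` and `all_learned` are hypothetical, chosen here for illustration only.

```shell
#!/bin/sh
# Hypothetical helpers for checking a shared training folder.

# Prints the number of regular files still waiting in FOLDER/cur.
pending_messages() {
    find "$1/cur" -type f | wc -l | tr -d ' '
}

# Succeeds when FOLDER/cur holds no messages, i.e. everything submitted
# has been picked up and deleted by the learning job.
all_learned() {
    [ "$(pending_messages "$1")" -eq 0 ]
}
```

For example, `all_learned /var/local/lib/spamd/Maildir/.SiteSpam && echo learned` would print `learned` once the folder has been emptied. An empty folder only shows that the messages were deleted; running `sa-learn --dump magic` as the spamd user and comparing the spam and ham counts before and after is one way to confirm they were actually learned.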


CategorySystemAdministration CategoryObsolete

SpamAssassinAdmin (last edited 2021-11-06 18:42:33 by ClintonEbadi)