This does not adequately describe our current setup
Here's how we set up our site-wide SpamAssassin bayes database, including the ability for users to train it.
Create a new user spamd with home /var/local/lib/spamd.
Add "spamd" to /etc/cron.allow.
Add the following to ~spamd/.crontab to learn from and delete messages in the shared SiteSpam and SiteHam folders; a learning job runs every five minutes, alternating between spam and ham (changing MACHINENAME to be the name of the local machine):
The crontab and other files should be stored in a git repo rather than copied verbatim from this page
PATH=/afs/hcoop.net/common/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
MAILTO=logs@hopper.hcoop.net

# Learn from submitted spam
0,10,20,30,40,50 * * * * /var/local/lib/spamd/scripts/learn-spam-wrapper --spam

# Learn from submitted ham
5,15,25,35,45,55 * * * * /var/local/lib/spamd/scripts/learn-spam-wrapper --ham

# Remove any tmp cruft
3 3 * * * run-in-pagsh --fg clean find /var/local/lib/spamd/Maildir/.SiteHam/tmp -noleaf -type f -delete ; run-in-pagsh --fg clean find /var/local/lib/spamd/Maildir/.SiteSpam/tmp -noleaf -type f -delete

# Remove stale lock file
3 4 * * * find /var/local/lib/spamd/ -mindepth 1 -maxdepth 1 -type f -name '.lock' -ctime +2 -delete
Be sure there's a newline after the last line, or it won't be processed.
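The newline requirement is easy to check mechanically before installing the file. The following is only a sketch: the function name `ends_with_newline` is made up for this page and is not part of any hcoop script.

```shell
#!/bin/sh
# Return 0 if the file is non-empty and its last byte is a newline,
# non-zero otherwise. (Hypothetical helper, not an existing hcoop tool.)
ends_with_newline() {
    file=$1
    # An empty or missing file cannot end with a newline.
    [ -s "$file" ] || return 1
    # Command substitution strips a trailing newline, so the captured
    # text is empty exactly when the final byte is a newline.
    [ -z "$(tail -c 1 "$file")" ]
}

# Example use before installing the crontab:
#   ends_with_newline ~spamd/.crontab || echo "add a trailing newline first"
```

Running the check before each `crontab -u spamd` install would catch the problem early.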
Copy the learn-spam script from the spam directory of the hcoop "misc" repository into the directory ~spamd/scripts.
Check out a copy of the hcoop spamassassin configuration into /etc/spamassassin from git. If the installed version of spamassassin is much newer than the version the configuration was created against, diff the two and merge any relevant changes into the git repository.
cd /etc && git clone /afs/hcoop.net/user/h/hc/hcoop/.hcoop-git/config/spamassassin.git/
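One way to do the comparison mentioned in this step is a recursive diff between a pristine copy of the packaged configuration and the git checkout. This is a rough sketch under assumptions: the function name `config_drift` and the idea of keeping a pristine copy somewhere are illustrative only, and `-x` is a GNU diff option.

```shell
#!/bin/sh
# Print a unified diff between two configuration trees, ignoring .git.
# Exit status follows diff: 0 if the trees match, 1 if they differ.
# (Hypothetical helper; where the pristine copy lives is an assumption.)
config_drift() {
    pristine_dir=$1   # untouched copy of the packaged configuration
    checkout_dir=$2   # the git checkout, e.g. /etc/spamassassin
    diff -ru -x .git "$pristine_dir" "$checkout_dir"
}

# Example use, with a made-up location for the pristine copy:
#   config_drift /tmp/spamassassin-pristine /etc/spamassassin | less
```

Any hunks worth keeping would then be merged by hand and committed to the git repository.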
Modify /etc/default/spamassassin by setting OPTIONS and ENABLED as follows. The -x prevents spamd from looking for per-user configuration, which would be silly because it always runs as the same user here. Without this flag, the cron job triggered every 5 minutes would log an error message, which would lead to an e-mail being sent to the spamd user.
This also should be stored in a git repository, but is not. Check the current spamd server for the current values if needed.
# Change to one to enable spamd
ENABLED=1
OPTIONS="--create-prefs --max-children 18 --helper-home-dir=/var/local/lib/spamd -u spamd -x -s /var/log/spamd.log -A 69.90.123.67,127.0.0.1 -i 0.0.0.0"
PIDFILE="/var/local/lib/spamd/pid"
CRON=1
Make a file called /etc/logrotate.d/spamd with the following contents.
/var/log/spamd.log {
    weekly
    missingok
    create 0640 root adm
    rotate 4
    compress
    delaycompress
    sharedscripts
    postrotate
        [ -f '/var/local/lib/spamd/pid' ] && (kill -HUP `cat /var/local/lib/spamd/pid`) || exit 0
    endscript
}
Start the daemon by doing /etc/init.d/spamassassin start. Check /var/log/spamd.log to be sure that it started OK.
Install the .crontab entries that you wrote earlier by doing crontab -u spamd ~spamd/.crontab as root. Do this every time that you make changes to ~spamd/.crontab.
The following tasks would be done on whichever machine runs IMAP, and are mostly wrong.
Edit /etc/courier/shared/index as follows, being sure to separate each column with a single TAB character. The second column is UID, and third column is GID -- consult /etc/passwd and /etc/group to make these match the spamd user and group.
spamd 116 119 /var/local/lib/spamd
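Because the columns must be separated by single TAB characters, it can be safer to generate the line than to type it. A minimal sketch, assuming a helper named `shared_index_line` (made up for this page) and that `id` reports the right values for the spamd account:

```shell
#!/bin/sh
# Emit one shared-index entry with exactly one TAB between columns:
# name, UID, GID, home directory. (Hypothetical helper.)
shared_index_line() {
    name=$1
    uid=$2
    gid=$3
    home=$4
    printf '%s\t%s\t%s\t%s\n' "$name" "$uid" "$gid" "$home"
}

# Example use on a machine where the spamd account exists:
#   shared_index_line spamd "$(id -u spamd)" "$(id -g spamd)" /var/local/lib/spamd
```

Taking the UID and GID from `id` avoids copying the numbers by hand from /etc/passwd and /etc/group.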
Restart courier's IMAP process: runsv restart courier-imap
Test by checking to see if you can access shared.SpamAssassin.SiteHam and shared.SpamAssassin.SiteSpam from IMAP. If not, do maildirmake --add SpamAssassin=~spamd/Maildir ~/Maildir as your normal user from the machine that does courier (and presumably spamassassin as well). You might need to replace ~ with ~USERNAME if you are using sudo to do this, where USERNAME is your normal username.
Now copy some spammy mail into the SiteSpam directory, wait 5 minutes, and check to see if the mail got learned and deleted.
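To see whether the mail was picked up, count the files left in the folder before and after the wait. This is a sketch only: `pending_messages` is a made-up name, and the path in the example is an assumption based on the standard Maildir layout and the Maildir paths in the crontab (messages may sit in `new` rather than `cur`, depending on how they were delivered).

```shell
#!/bin/sh
# Print the number of regular files still waiting in a Maildir
# subdirectory. (Hypothetical helper, not an existing hcoop tool.)
pending_messages() {
    dir=$1
    # wc may pad its output with spaces on some systems; strip them.
    find "$dir" -type f | wc -l | tr -d ' '
}

# Example use (assumed path):
#   pending_messages /var/local/lib/spamd/Maildir/.SiteSpam/cur
```

If the count does not drop after the next scheduled run, the mail sent to the MAILTO address from the crontab is the first place to look for errors.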
If so, edit ~spamd/.crontab to redirect the output of sa-learn to /dev/null, and run crontab as specified earlier to propagate this change.