Administration of SpamAssassin
Work in progress. See SpamAssassinAdmin for (outdated, but semi-accurate) information on our site-wide spam filtering.
1. Quirks
spamassassin stores its txrep and bayes databases in /var/spool/exim4/.spamassassin now that data is stored per-user and exim is calling spamc. This works equivalently to having the global database in /var/lib/spamassassin.
2. TODO
There is much overdue work to bring our setup into the modern era. The items below are listed roughly in order of difficulty.
2.1. Opt-Out Spam Filtering
We originally made spam filtering opt-in because, when we added it, spam was a relatively minor problem. Folks were more concerned about legitimate mail being flagged, and about members who did not train bayes ruining the bayes database. Nowadays the email experience is absolutely atrocious without spam filtering, and I think the concern about bayes training is overblown (and can be solved entirely with per-user bayes).
We should invert the functioning of setsa in DomTool to only create /etc/spamassassin/addrs/ files when a member wishes to opt out of spam filtering, and update the exim filter to match the new behavior.
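The inverted check could look roughly like the sketch below. It assumes one marker file per address, named after the lowercased address, inside /etc/spamassassin/addrs/; the actual naming used by setsa is not stated here, and should_filter is a hypothetical helper, not part of DomTool or exim.

```python
from pathlib import Path

# Directory named in the text. The per-address file naming is an assumption.
ADDRS_DIR = Path("/etc/spamassassin/addrs")


def should_filter(address: str, addrs_dir: Path = ADDRS_DIR) -> bool:
    """Return True unless an opt-out marker file exists for this address."""
    # Never let an odd address escape the marker directory; filter it instead.
    if "/" in address or address in ("", ".", ".."):
        return True
    # Opt-out model: the presence of a file now means "do NOT filter".
    return not (addrs_dir / address.lower()).exists()
```

Under the current opt-in behavior the same file means the opposite, so existing files would need to be removed or migrated when the switch is made.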
2.2. Automatically Move Mail to Junk Folder
This one is actually MUCH simpler than I had thought: we just need an exim router that runs before userforward files are processed and moves mail flagged as spam to INBOX.Junk. We'll need to make sure members are OK with this.
We will also want to automatically expire mail from there; 30 days seems reasonable. It looks like we might be able to use courier's IMAP_EMPTYTRASH, although it has the limitation that it only clears mail when the member logs into IMAP, which may result in directories filling up if a member does not use their mail often. A cron job should be straightforward to write (it will need to query all vmail accounts for a member and clear Junk with the correct tokens).
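A minimal sketch of the per-Maildir part of such a cron job follows. It assumes INBOX.Junk lives on disk as the ".Junk" subdirectory of the Maildir (the usual Maildir++ layout) and uses file modification time as the message age. expire_junk is a hypothetical name, and enumerating vmail accounts and obtaining tokens are left out entirely.

```python
import time
from pathlib import Path
from typing import List, Optional

MAX_AGE_DAYS = 30  # retention period suggested in the text


def expire_junk(maildir: Path, max_age_days: int = MAX_AGE_DAYS,
                now: Optional[float] = None) -> List[Path]:
    """Delete messages older than max_age_days from one Maildir's Junk folder."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Assumption: INBOX.Junk is stored as <maildir>/.Junk (Maildir++ layout).
    for sub in ("cur", "new"):
        folder = maildir / ".Junk" / sub
        if not folder.is_dir():
            continue
        for msg in sorted(folder.iterdir()):
            # Age is judged by mtime, which is only an approximation of delivery time.
            if msg.is_file() and msg.stat().st_mtime < cutoff:
                msg.unlink()
                removed.append(msg)
    return removed
```

The caller would loop over every local and vmail Maildir and run this once a day; the returned list is only there to make logging easy.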
2.3. Per User Bayes
We need to pass ${local_part}@${domain} (for vmail) or ${local_part}@localhost (for local users) to spamc when filtering.
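For illustration, the username rule above can be written as a small helper. This is only a sketch of the naming scheme, not the exim configuration itself; spamc_user and spamc_argv are hypothetical names, and the only thing relied on from spamc is its -u option for overriding the username.

```python
from typing import List, Optional


def spamc_user(local_part: str, domain: Optional[str] = None) -> str:
    """Build the per-user name handed to spamc."""
    # vmail accounts keep their real domain; local users get "localhost".
    return f"{local_part}@{domain}" if domain else f"{local_part}@localhost"


def spamc_argv(local_part: str, domain: Optional[str] = None) -> List[str]:
    """Return the spamc command line for one recipient."""
    # -u tells spamd which user's data to load for this message.
    return ["spamc", "-u", spamc_user(local_part, domain)]
```

For example, spamc_argv("alice", "example.org") gives a vmail-style name, while spamc_argv("bob") falls back to bob@localhost.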
This will also need a replacement for SiteSpam: probably something like INBOX.Spam and INBOX.Ham, with the learn spam scripts updated to scan all local and vmail Maildirs.
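The scanning step might be sketched as follows. It assumes the training folders appear on disk as ".Spam" and ".Ham" under each Maildir, and it only builds sa-learn command lines (using the --spam and --ham options) rather than running them. learn_commands is a hypothetical helper, and pointing sa-learn at the right per-user database is deliberately not handled here.

```python
from pathlib import Path
from typing import Iterable, List

# Folder names proposed in the text, in assumed Maildir++ on-disk form.
FOLDERS = {".Spam": "--spam", ".Ham": "--ham"}


def learn_commands(maildirs: Iterable[Path]) -> List[List[str]]:
    """Build one sa-learn command per training folder that exists."""
    commands = []
    for maildir in maildirs:
        for folder, flag in FOLDERS.items():
            cur = maildir / folder / "cur"
            # Members who never created the folder are simply skipped.
            if cur.is_dir():
                commands.append(["sa-learn", flag, str(cur)])
    return commands
```

Each returned list could be handed to subprocess.run by the learn script, once it is running as (or on behalf of) the owning user.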
2.4. User Preferences
If we want to support user prefs so that users can customize their spam settings (using roundcube, for example), we'd have to switch to SQL. Low priority.