Installing MailScanner on Debian-testing, with exim 4 and ClamAV

Spam. Email viruses. Do I need to write a paragraph about why we
loathe them and want to be rid of them? Thought not.



So, skipping "Why", on to "How"? Fortunately (and
unsurprisingly) there's a nice open-source solution in the form of
MailScanner (http://www.mailscanner.info), a
mail-processing tool which harnesses the power of SpamAssassin (http://spamassassin.apache.org), plus
any of a range of anti-virus software and open-source SMTP servers,
throws in (quite a bit of) intelligence of its own, and generally
provides customised industrial-strength mail protection for free. As in,
well, something.



Even more fortunately for Debian aficionados such as myself, all you
need is neatly packaged up and available in the "testing" distribution
(aka "Sarge").



Now for the downside. You'll need to do a fair bit of tinkering once
you've installed the required packages, to get the system running
properly. Which would, of course, be a lot easier with a clear set of
instructions tailored to your setup.



Well, this article might help, if you're using the packages described
above. I've installed two mailscanner systems with this list in the past
month, and I'll try and cover most of the issues I found.



First of all, before we even look at apt-get'ing the
packages, the first gotcha. If you've previously installed Perl or
SpamAssassin from source on your nice clean Debian server, you'll
probably want to get rid of them. Otherwise, the different module and
configuration file versions utterly confuse SpamAssassin (which will be
installed as v3.xx at the time of writing). They won't simply be
replaced when you upgrade - Debian puts them in different places, and
SpamAssassin then tries to parse both sets when it runs.
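
A quick way to spot leftovers is to look for more than one copy of the same command. The following is only a sketch: the function name list_copies is made up for this article, and the directories in the example are assumptions about where a source install might have landed on your system.

```shell
# List every executable copy of a command found in the given directories.
# More than one line of output suggests more than one install is present.
list_copies() {
    cmd=$1
    shift
    for dir in "$@"; do
        if [ -x "$dir/$cmd" ]; then
            echo "$dir/$cmd"
        fi
    done
}

# Example (the directory list is an assumption; adjust to your system):
# list_copies spamassassin /usr/bin /usr/local/bin /opt/bin
# list_copies perl /usr/bin /usr/local/bin /opt/bin
```

If a command shows up anywhere other than /usr/bin, track down how it got there before installing the Debian packages.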

Equally, if you've got a different SMTP engine in place (or exim3
instead of 4, as I did), make sure all your mail config and spool files are kept
safe, and consider the risks and issues inherent in changing SMTP
servers. I'm sure I don't need to list those here.



Right, enough warnings. If you've got a clean (or cleaned)
Debian-testing install on your server, take a deep breath and a root
prompt, and type:

apt-get install mailscanner spamassassin exim4 clamav perl razor \
    libnet-dns-perl libnet-ldap-perl libmail-spf-query-perl



That's quite a few packages, and to tell the truth Debian's package
system would probably cope with a shorter list and bring in most of the rest
as dependencies. But I'll include them all for the sake of completeness
here, and also bring in a few optional packages (razor and the lib...
perl libraries) that make the system run a little better. Additionally,
you'll get all the latest versions this way, which always makes getting
help easier - talking of which, see the end of this article for mailing
lists.



Now, with any luck, you'll either have everything in place on your
system, or dpkg (the Debian package manager) will start asking you
pointed questions about your mail setup for exim4's sake.



Configure Exim



Note: the configuration of exim4 on Debian-testing is somewhat
involved. By default, you don't just get an /etc/exim4/exim4.conf file;
instead you get a template config file and dpkg helps you fill it in.

You should have a general idea of your requirements - which domains you
will be accepting and relaying mail for, and which hosts you'll accept
outgoing mail for. In general, you'll set these up exactly as you would
for a non-scanning mailserver. However, if you're using Mailscanner in a
scan-and-forward capacity (eg, scanning mail then passing it on to a
Microsoft Exchange server), you'll do things a little differently:



  • Set the MX records to point to the Mailscanner box, but then:

  • Set the domains you're accepting mail for as relay
    domains, not local domains. This stops the server from trying to do
    validation on users it knows nothing about.

  • Use a hubbed_hosts router in exim4 to ignore the MX record (you
    don't want to send scanned mail straight back to yourself!) and instead
    forward all mail to the selected server. Sorry, you'll have to look up
    the details on this one yourself!
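
As a starting point for that last item: the hubbed_hosts router in Debian's exim4 template looks domains up in /etc/exim4/hubbed_hosts, with one "domain: host" pair per line. The sketch below assumes that file location and format (check the router in your own exim4.conf.template first). The function name add_hubbed_host and the example hostnames are placeholders, not anything shipped with the packages.

```shell
# Append a "domain: target" route to a hubbed_hosts-style file,
# unless a route for that domain is already present.
add_hubbed_host() {
    file=$1 domain=$2 target=$3
    touch "$file"
    # Exact match on the key before the first colon.
    if ! awk -F: -v d="$domain" '$1 == d { found = 1 } END { exit !found }' "$file"; then
        printf '%s: %s\n' "$domain" "$target" >> "$file"
    fi
}

# Example, as root (both names are placeholders):
# add_hubbed_host /etc/exim4/hubbed_hosts example.com exchange.internal.example.com
```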


Once you've passed this data into dpkg's interface (which you can
re-run with dpkg-reconfigure exim4-config - don't select the
option to split config files, by the way), you'll also need to edit the
template file that dpkg uses for this - it's
/etc/exim4/exim4.conf.template. This is because you'll
essentially be chopping exim4 in half. One half will receive incoming
mail and spool it for MailScanner to feed on, and the other half will
be called by MailScanner to actually deliver the scanned mail. These
parts will be referred to as the 'default' part and the 'outgoing' part
respectively.


To create this behaviour, you'll need to add the following in the top
part of exim4.conf.template:


.ifdef OUTGOING
SPOOLDIR=/var/spool/exim4
.else
SPOOLDIR=/var/spool/exim4_incoming
queue_only = true
queue_only_override = false
.endif

Once you've edited this file, restart exim with
invoke-rc.d exim4 restart or /etc/init.d/exim4 restart -
both will cause the 'real' config file to be
regenerated from this template.


How this works: MailScanner very helpfully runs the outgoing exim
process with the command-line parameter -DOUTGOING, which
defines the macro that the above code fragment uses. Most of that
block's code is fairly self-explanatory, but you'll notice that there
are now two spool directories for exim. The default process - the
incoming mailserver that's listening on port 25 - simply drops all its
incoming messages into /var/spool/exim4_incoming. Mailscanner browses
that directory (we'll tell it to do this shortly) for these messages,
picks them up, scans them, and drops them in the outgoing spool, where
the other half of the exim system will pick them up and deliver them to
their final destination.
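
To confirm that the two halves really do see different spools, you can ask exim to print the setting with and without the macro. This is a sketch rather than part of the packaged setup: show_spool_dirs is a name invented here, and it assumes you run it as root, since exim normally only honours -D for privileged users.

```shell
# Print the spool directory exim4 reports for each half of the setup.
show_spool_dirs() {
    # Default (incoming) half: no macro defined.
    echo "incoming: $(exim4 -bP spool_directory)"
    # Outgoing half: same binary, with the OUTGOING macro defined.
    echo "outgoing: $(exim4 -DOUTGOING -bP spool_directory)"
}

# Example, as root:
# show_spool_dirs
```

If both lines report the same directory, the .ifdef block probably hasn't made it into the generated config.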



Configure MailScanner



By now you should have an input and an output set up, so now we need
to put something in the middle! MailScanner needs to know a number of
things, all of which are configured in
/etc/MailScanner/MailScanner.conf

Before I tell you what you need to change in there, go read it! It's
just shy of 2000 lines long, but that's because the internal
documentation is very thorough - and if you're going to be running
MailScanner, you should be familiar with this file!



Ok, now you've looked at that (well OK, you probably haven't, I'd
probably have skipped that instruction too), here's what you'll need to
set in there. Make sure you read the comments in the config file as you
set them:



  • %org-name% = YourCompany (No spaces!)

  • %org-long-name% = Your Company's Full Name

  • %web-site% = www.mycompany.com/mailscanner/info/page.html

  • Run As User = Debian-exim (The same user that the mailserver runs as)

  • Run As Group = Debian-exim

  • Incoming Queue Dir = /var/spool/exim4_incoming/input

  • Outgoing Queue Dir = /var/spool/exim4/input


There's another nasty gotcha in those two lines! Compare them with
the lines we set for exim4 above:


.ifdef OUTGOING
SPOOLDIR=/var/spool/exim4
.else
SPOOLDIR=/var/spool/exim4_incoming


They don't match, and they shouldn't. Exim's SPOOLDIR is the top of
the spool; MailScanner needs the input subdirectory beneath it.
Make sure yours differ in the right way!
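
If you'd rather not trust your eyes on this, a small check can compare the two. This is a minimal sketch: ms_setting and queue_matches_spool are helper names made up for this article, and the parsing assumes plain "Key = value" lines as found in MailScanner.conf.

```shell
# Print the value of a "Key = value" setting from a MailScanner.conf-style file.
# If the key appears more than once, the last occurrence wins.
ms_setting() {
    file=$1 key=$2
    sed -n "s/^$key[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Succeed only if the queue directory is the "input" subdirectory of the spool.
queue_matches_spool() {
    spooldir=$1 queuedir=$2
    [ "${queuedir%/}" = "${spooldir%/}/input" ]
}

# Example:
# queue_matches_spool /var/spool/exim4_incoming \
#     "$(ms_setting /etc/MailScanner/MailScanner.conf 'Incoming Queue Dir')" \
#     || echo "Incoming Queue Dir does not match exim's incoming spool"
```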


Back to the config options: let it know we're using exim:



  • MTA = exim

  • Sendmail = /usr/sbin/sendmail (Like most SMTP servers,
    exim4 pretends to be sendmail. Don't tamper with this setting.)

  • Sendmail2 = /usr/sbin/sendmail -DOUTGOING (This is
    where Mailscanner configures the required OUTGOING macro for our exim
    tweaks above).

  • Virus Scanners = clamav


Anything else is pretty much up to you, although for safety's sake I
set:


  • Archive Mail = /var/spool/MailScanner/archive




Configure SpamAssassin


Finally, you'll need one tweak to SpamAssassin. It needs to know the
organisation headers you're setting in MailScanner.conf, so it can
ignore them for the purpose of spam analysis.



In /etc/MailScanner/spam.assassin.prefs.conf, set the
lines under:

# Change according to %org-name% in # /etc/MailScanner/MailScanner.conf

as required, eg for an %org-name% of MyCompany:


bayes_ignore_header X-MyCompany-MailScanner
bayes_ignore_header X-MyCompany-MailScanner-SpamCheck
bayes_ignore_header X-MyCompany-MailScanner-SpamScore
bayes_ignore_header X-MyCompany-MailScanner-Information
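
Since the header names are all derived from %org-name%, you can generate the four lines rather than typing them. A small sketch follows; bayes_ignore_lines is a made-up helper name, and the four suffixes are simply the ones shown in the example above.

```shell
# Print the bayes_ignore_header lines for a given %org-name% value.
bayes_ignore_lines() {
    org=$1
    for suffix in "" -SpamCheck -SpamScore -Information; do
        echo "bayes_ignore_header X-${org}-MailScanner${suffix}"
    done
}

# Example: print the lines for review before pasting them into
# /etc/MailScanner/spam.assassin.prefs.conf
# bayes_ignore_lines MyCompany
```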



Fix the privileges


Logically enough, Mailscanner's files are all installed under the
'mail' user. However for this setup they (and our mail spools) all need
to be owned by 'Debian-exim'. So, do a
'chown -R Debian-exim:Debian-exim'
on:



  • /var/lib/MailScanner

  • /var/run/MailScanner

  • /var/spool/MailScanner

  • /var/spool/exim4_incoming/

  • /var/spool/exim4/

  • /var/lock/subsys/MailScanner


(I think that's all of them. Mailscanner or Exim will remind you when
you manually restart them if not!)
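
If you'd rather do the whole list in one go, a loop saves some typing. This is just a sketch: fix_owner is a name invented here, and it skips (with a warning) any path that doesn't exist on your system rather than failing.

```shell
# Recursively change ownership of each path that exists; warn about the rest.
fix_owner() {
    owner=$1
    shift
    for path in "$@"; do
        if [ -e "$path" ]; then
            chown -R "$owner" "$path"
        else
            echo "missing: $path" >&2
        fi
    done
}

# Example, as root (extend with the other directories listed above):
# fix_owner Debian-exim:Debian-exim /var/spool/MailScanner \
#     /var/spool/exim4_incoming /var/spool/exim4
```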




Enable it all!


In '/etc/default/mailscanner', set:

run_mailscanner=1


In '/etc/default/spamassassin', set:

ENABLED=1
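
If you're scripting the install, these two edits can be made non-interactively. The sketch below assumes GNU sed (as shipped with Debian) and simple KEY=value lines; set_default is a helper name made up for this article. It rewrites an existing line for the key, commented out or not, and appends one if none is found.

```shell
# Set KEY=value in a /etc/default-style file.
set_default() {
    file=$1 key=$2 value=$3
    if grep -q "^#\{0,1\}[[:space:]]*$key=" "$file"; then
        # Replace the existing line, dropping any leading comment marker.
        sed -i "s|^#\{0,1\}[[:space:]]*$key=.*|$key=$value|" "$file"
    else
        echo "$key=$value" >> "$file"
    fi
}

# Example, as root:
# set_default /etc/default/mailscanner run_mailscanner 1
# set_default /etc/default/spamassassin ENABLED 1
```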



And go!


With /etc/init.d/… or invoke-rc.d,
restart exim4, spamassassin and mailscanner.
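
Using invoke-rc.d, the three restarts can be wrapped up as below. This is only a sketch: restart_mail_stack is a name invented here, and it stops at the first service that fails to restart so that you notice the problem.

```shell
# Restart the three services in the order given above, stopping on failure.
restart_mail_stack() {
    for service in exim4 spamassassin mailscanner; do
        invoke-rc.d "$service" restart || return 1
    done
}

# Example, as root:
# restart_mail_stack
```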



Training Bayes


To make the Bayesian filters more accurate, you need to train them
with "your" verified spam and non-spam ("ham")
messages. To do this, you need to get manually-classified files back
onto the Mailscanner server in an unchanged and complete format.


If you're delivering messages to an IMAP server on the same machine
as the MailScanner server, this is easy. Each user can set up 'verified
ham' and 'verified spam' folders into which they copy messages of that
type, which can then be periodically learned from by crontab lines like
the following (each entry must stay on a single line, as cron does not
support line continuation):


15 * * * * /usr/bin/sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --mbox --spam /home/someuser/mail/verified_spam
45 * * * * /usr/bin/sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --mbox --ham /home/someuser/mail/verified_ham

See the sa-learn documentation (man sa-learn is a good
start) for more info, but note that we're calling in Mailscanner's
SpamAssassin config file to make sure we learn to the right location!


Personally I only put messages into the above folders when
MailScanner/SpamAssassin get the analysis wrong, to avoid swamping the
bayes engine with low-value data.
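
To see how much the Bayes database has learned so far, sa-learn can dump its counters. A sketch, assuming the same prefs file as the crontab lines above; bayes_counts is a made-up helper name. By default SpamAssassin doesn't use Bayes scoring until it has learned at least 200 spam and 200 ham messages, so these counts tell you whether training is having any effect yet.

```shell
# Show the number of spam (nspam) and ham (nham) messages learned so far.
bayes_counts() {
    prefs=${1:-/etc/MailScanner/spam.assassin.prefs.conf}
    sa-learn -p "$prefs" --dump magic | grep -E 'nspam|nham'
}

# Example, run as the user that owns the Bayes database:
# bayes_counts
```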


If your IMAP server is on a remote box, you might find offlineIMAP (http://gopher.quux.org:70/devel/offlineimap/SVN/manual.html)
useful. Details on how to use this, or how to get your mail back via
other methods, must be left, for the moment, as an exercise for the
reader!



For further help


Join the mailscanner mailing lists, located at:
http://www.sng.ecs.soton.ac.uk/mailscanner/support.html#mailing.



Credits and Copying


This article created by Richard George (wechsler@phase.org), 2005-03-03.


Many thanks to the developers of Mailscanner (particularly Julian Field) for providing the product in the first place, and to the members of the Mailscanner mailing list for their help in correcting various points in this article.


This article may be reproduced freely in any form under the conditions that:



  • Nothing is removed from, or altered in, the 'Credits and Copying' section.

  • Any changed versions are clearly marked as such.

  • A link to the original version at http://www.phase.org/journal/byjid/8550 is preserved.

  • Particular care should be taken to preserve this section if the article is placed in a wiki or other widely-modifiable environment.

Posted by parsingphase, 2005-03-03 16:43
