This is a supplement to the article "An introduction to the Spambayes project", published in the March 2003 edition of the Linux Journal (but it's still useful even if you haven't read the article).
As promised, several things have changed since the article went to press:
The Linux Journal article points to our SourceForge project page at http://sf.net/projects/spambayes - that's the place to go for downloads, to report bugs or request features, and so on. We also now have a homepage at http://spambayes.sf.net that gives some background and documentation for the project, and has pointers to our mailing list and to related resources on the web.
There's now an installer for the software. It's still a source-code
distribution, so you still need to have Python 2.2.2 or later
installed. You can download the installer from the SourceForge project
page at http://sf.net/projects/spambayes. To install it, unpack
the archive into a temporary directory,
cd into spambayes-1.0a2, then run
python setup.py install. On Windows, this will install the
various applications that make up Spambayes into the
of your Python installation. On Unix, they will go into the default
location for Python scripts, usually
Don't forget that on Windows, you'll also need the
There's a binary installer available from http://pybsddb.sf.net
For people who want to use Spambayes via the POP3 proxy, there's
no longer any need to create and edit a
bayescustomize.ini - you
can configure everything through the web:
- Create a directory for Spambayes to store its data.
cd to that directory and run
python pop3proxy.py -b. Your web browser should appear, showing the Spambayes application home page. If your browser doesn't appear, you need to run it yourself and point it at the URL printed by
- Click the
Configuration page link, and enter the name of your POP3 server and the port number for the proxy to listen on. On Windows it's most convenient to use port 110, since that's the default for POP3. On Unix (or Unix-derived systems like MacOS X) you should use a high port like 1110. Click
Save to save your configuration.
- Reconfigure your email client to talk to the proxy. If your
email client currently talks to
pop3.example.com on port 110, and you've configured the proxy to listen on port 1110, you should reconfigure your email client to talk to
localhost (or the name of the machine on which you're running the proxy) on port 1110.
You should now be able to collect your mail through the proxy, and see
X-Spambayes-Classification headers added to the messages. You
can now set up filters in your email client to deal with suspected spam
however you choose. All your mail will be classified as
you train the software, which you also do through the web using the
Review messages page. If you don't want to wait for messages to
arrive for training, you can use the "upload a message or mbox file"
form to train via the web interface, either on individual messages or
Unix mbox files.
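As a rough illustration of what a mail filter keys on, the following Python sketch reads the X-Spambayes-Classification header from a raw message. The helper name classification_of, the sample message, and the header values (spam, ham, unsure) are assumptions made for this example; they are not taken from the Spambayes source.

```python
import email

HEADER = "X-Spambayes-Classification"


def classification_of(raw_message, default="unclassified"):
    """Return the classification header value in lower case.

    Falls back to `default` when the message carries no such header,
    for example because it was not collected through the proxy.
    """
    msg = email.message_from_string(raw_message)
    value = msg.get(HEADER)
    if value is None:
        return default
    # Keep only the first token, in case extra detail follows the label.
    return value.split(";")[0].strip().lower()


# Hypothetical message, as it might look after passing through the proxy.
SAMPLE = (
    "From: someone@example.com\n"
    "Subject: hello\n"
    "X-Spambayes-Classification: spam\n"
    "\n"
    "Body text\n"
)

if __name__ == "__main__":
    print(classification_of(SAMPLE))
```

An email client's filter rule does the same thing declaratively: match on the header name and value, then move or flag the message.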
Privacy for the web interface
If you're worried about other people accessing your Spambayes web interface, you can configure it to only accept connections from the machine it's running on. You do this by adding this:
[html_ui]
html_ui_allow_remote_connections: False
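To check that a settings fragment like the one above parses the way you expect, here is a minimal sketch using the configparser module from the modern Python standard library. It is not Spambayes' own option handling, and the function name and the fallback value are choices made for this example only, not a statement about the Spambayes default.

```python
import configparser

# The fragment from the text, laid out as an INI section.
CONFIG_TEXT = """\
[html_ui]
html_ui_allow_remote_connections: False
"""


def remote_connections_allowed(config_text, default=True):
    """Read the html_ui_allow_remote_connections flag from INI text.

    `default` is returned when the section or the option is missing;
    it is this example's assumption, not a documented default.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.getboolean(
        "html_ui", "html_ui_allow_remote_connections", fallback=default
    )


if __name__ == "__main__":
    print(remote_connections_allowed(CONFIG_TEXT))
```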
Integration with Mutt and Gnus
In the contrib directory of the source distribution are
spambayes.el, which let you train Spambayes from within Mutt
and Gnus - see those files for details.
Running multiple proxies on the same port
Some email clients (notably Eudora) don't let you set different ports for different POP3 servers. This is a problem for Spambayes, because the POP3 proxy can only talk to one server per port. The workaround is to assign multiple addresses to your machine, and run one proxy per address. Here's an example (for MacOS X, but it should work similarly on any Unix-based platform). It runs two POP3 proxies, both on port 110 but on different local addresses:
#!/bin/sh
sudo ifconfig lo0 inet 127.0.0.2 add
sudo python pop3proxy.py
pop3proxy_servers = pop3.example1.com, pop3.example2.com
pop3proxy_ports = 127.0.0.1:110, 127.0.0.2:110
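The two option values line up position by position: the first server is reached through the first local address and port, the second through the second. The sketch below spells out that pairing in Python. The function name is hypothetical, and treating a bare port number as "no specific local address" is an assumption of this example; how Spambayes itself parses these options is not shown here.

```python
def pair_servers_with_listeners(servers, ports):
    """Pair each remote POP3 server with its local listening address.

    Both arguments are comma-separated strings, as in the options above.
    Returns a list of (server, local_host_or_None, local_port) tuples.
    """
    server_list = [s.strip() for s in servers.split(",") if s.strip()]
    port_list = [p.strip() for p in ports.split(",") if p.strip()]
    if len(server_list) != len(port_list):
        raise ValueError("need exactly one listener per server")

    pairs = []
    for server, listener in zip(server_list, port_list):
        # "127.0.0.2:110" -> ("127.0.0.2", ":", "110"); "1110" -> ("", "", "1110")
        host, _, port = listener.rpartition(":")
        pairs.append((server, host or None, int(port)))
    return pairs


if __name__ == "__main__":
    for entry in pair_servers_with_listeners(
        "pop3.example1.com, pop3.example2.com",
        "127.0.0.1:110, 127.0.0.2:110",
    ):
        print(entry)
```

With this layout, the email client would be pointed at 127.0.0.1 for the first account and 127.0.0.2 for the second, both on port 110.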
Using the web training interface with procmail
If you're using
hammiefilter to classify mail via procmail, you
can still use the web training interface to train Spambayes. Where
you have a procmail rule something like this:
:0fw
| hammiefilter.py
you can add another rule like this:
:0fw
| proxytee.py
That will upload each received message to the web interface for later
training. You need to be running
pop3proxy.py for this, but you
don't need to have any POP3 servers configured.
Note that currently
different defaults for the database location - this is something we'll
address in a future release, but for now you need to work around this
pop3proxy.py to point to the same database used by
hammiefilter.py. By default this is
~/.hammiedb - go to the web
configuration page, enter
~/.hammiedb as your database filename, and