1. Gentoo Installation and Setup
This is how to set up SpamBayes on a Gentoo Linux machine with client access to an IMAP account.
-
Create four folders on your IMAP account:
- spam dump: for SpamBayes to dump spam into
- spam for training: for spam that you will use to train SpamBayes to recognize spam
- possible spam: for messages that might be spam but that SpamBayes is unsure about
- ham for training: for a selection of messages that you have received which are not spam
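If you would rather script the folder creation than click through a mail client, the following is a minimal sketch using Python's standard imaplib module. The host name and credentials in the comments are placeholders, and your server may expect a namespace prefix in front of each name (the sample configuration later on this page uses paths such as mail/ham), so treat the folder names as an assumption to adjust.

```python
import imaplib

# The four folder names used on this page.
FOLDERS = ("spam dump", "spam for training", "possible spam", "ham for training")


def quote_mailbox(name):
    """Wrap a mailbox name in double quotes so that spaces survive.

    imaplib sends arguments as given, so names containing spaces
    must be quoted by the caller.
    """
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def create_folders(conn, names=FOLDERS):
    """Ask the server to create each folder; return {name: status}."""
    results = {}
    for name in names:
        status, _data = conn.create(quote_mailbox(name))
        results[name] = status
    return results


# Example use (hypothetical host and credentials):
# conn = imaplib.IMAP4_SSL("imap.example.org")
# conn.login("user", "password")
# print(create_folders(conn))
# conn.logout()
```

A status of "NO" for a folder usually means that it already exists or that the server wants a different name or prefix.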
-
Collect spam! For a day or so, copy spam messages from your inbox into the spam for training folder.
-
Place a copy of some good messages into the ham for training folder. [Note: it is considered good practice to have roughly equal amounts of spam and ham for SpamBayes to train on.]
-
Python should already be installed because Portage relies on Python.
-
Download the ebuild for SpamBayes 1.0.1. (I have had problems with later builds of SpamBayes, but later versions may work for you.)
-
Create a local portage repository /usr/local/portage/mail-filter/spambayes
-
Make sure that your make.conf points to /usr/local/portage (typically through the PORTDIR_OVERLAY variable)
-
Add the SpamBayes ebuild to your local Portage spambayes folder
-
Also create a files directory and download the following files
-
bayescustomize.ini (copy bayescustomize_imap.ini to bayescustomize.ini)
-
bayescustomize_imap.ini
-
spambayesimap.rc
-
spambayespop3proxy.rc (not needed for IMAP, but it is referenced in the ebuild, and emerge will fail if the file is missing)
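Because a missing file makes emerge fail, it can help to check the files directory before emerging. The following is a small sketch of such a check. The function name is hypothetical, and the list of expected files is simply the four named above.

```python
from pathlib import Path

# Files that this page says the ebuild expects in the files directory.
EXPECTED_FILES = (
    "bayescustomize.ini",
    "bayescustomize_imap.ini",
    "spambayesimap.rc",
    "spambayespop3proxy.rc",
)


def missing_overlay_files(package_dir):
    """Return the expected files that are absent from <package_dir>/files."""
    files_dir = Path(package_dir) / "files"
    return [name for name in EXPECTED_FILES if not (files_dir / name).is_file()]


# Example use with the overlay path from this page:
# print(missing_overlay_files("/usr/local/portage/mail-filter/spambayes"))
```

An empty list means that all four files are in place.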
-
Run emerge spambayes
-
Run ebuild spambayes-1.0.1.ebuild digest to create the digest that emerge needs
-
Run emerge spambayes
-
Set up rules for SpamBayes
-
Edit /etc/bayescustomize.ini:
[Headers]
notate_to:spam
[imap]
username:
password:
server:
ham_train_folders:mail/ham
spam_folder:mail/Junk
spam_train_folders:mail/Junk
unsure_folder:mail/unsure
[Tokenizer]
mine_received_headers:True
summarize_email_prefixes:True
summarize_email_suffixes:True
x-pick_apart_urls:True
[html_ui]
http_authentication:Basic
http_password:
launch_browser: False
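The username, password and server values are left blank in the listing and have to be filled in for your own account. As a quick sanity check, here is a sketch that reports which of those three are still empty. It assumes the file is plain INI that Python's configparser can read. SpamBayes uses its own option handling, so this only checks the file's shape and does not show how SpamBayes itself parses it. The function name is hypothetical.

```python
import configparser

# Settings in the [imap] section that are blank in the sample configuration.
REQUIRED_IMAP_KEYS = ("username", "password", "server")


def missing_imap_settings(ini_text):
    """Return the required [imap] keys that are absent or empty."""
    # interpolation=None keeps a '%' in a password from being misread.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("imap"):
        return list(REQUIRED_IMAP_KEYS)
    return [
        key
        for key in REQUIRED_IMAP_KEYS
        if not parser.get("imap", key, fallback="").strip()
    ]


# Example use:
# with open("/etc/bayescustomize.ini") as handle:
#     print(missing_imap_settings(handle.read()))
```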
-
Add SpamBayes to your default init scripts
rc-update add spambayesimap default
-
Run the web interface
sb_imapfilter.py -b or, if it is started from the init script, browse to http://localhost:8880
2. Running SpamBayes
Do not do this until you have collected spam and ham for training. Otherwise, you will be disappointed by the results of SpamBayes.
-
Start the spam filter
/etc/init.d/spambayesimap start
-
Train SpamBayes to recognize the differences between your good mail and your spam.
-
Classify the messages in your INBOX, move spam to the spam dump folder, and move messages that it cannot easily classify as good mail or spam into the possible spam folder.
-
This will happen every 30 minutes.
-
Use your e-mail client as normal. It seems best to turn off checking for e-mail on a regular basis. Just use the Check button every now and then.
-
Every now and then, sort the possible spam folder by moving messages that you can see are spam into the spam for training folder and the good ones into ham for training. You may also want to move the good ones back into your INBOX so that you can deal with them as ordinary messages.
-
You can look at the spam dump folder to see if any good messages have been wrongly classified, but after a couple of days you will trust SpamBayes not to do this. Delete and expunge messages in this folder.
So that is all there is to it.
My only complaint is that I can't seem to find the syntax for the bayescustomize.ini file anywhere. The web interface is nice for getting started and for checking the status of the filtered mail, but I would like to do things like stop the web browser from launching automatically. launch_browser: False doesn't seem to work very well. I used it because I had found it in the pop3proxy ini file that was available with Gentoo.