TrainOnErrors


1. Train On Errors

Training only on errors (mistakes) gives the slowest growth of the token database, and many users operate the system this way because it is so easy. Unsures are not errors, as Skip pointed out above. With this tactic, we train only on outright errors: spam in the Inbox and ham in the spam folder. Depending on the ham/spam thresholds, this type of training adds only the messages that are most aberrant relative to the existing training set. They certainly need to be added, but over time the database will not necessarily be an accurate representation of the current mail stream. I would guess that this tactic results in having to start over more often than the other methods described below, though I have no proof. In addition, some of the developers have commented on the mailing list that mistake-based training tends to be "brittle", i.e. not robust.

SethGoodman
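
As a rough illustration of the rule described above, here is a minimal Python sketch of the train-on-errors decision. It is not the project's actual code: the cutoff values, the function names `classify` and `should_train`, and the sample scores are all assumptions made up for this example. It only shows that a message is selected for training when the classifier was confidently wrong, and that unsures are left alone.

```python
# Hypothetical cutoffs, chosen for illustration only.
HAM_CUTOFF = 0.20   # scores below this are treated as ham
SPAM_CUTOFF = 0.90  # scores at or above this are treated as spam


def classify(score: float) -> str:
    """Map a spam probability in [0, 1] to 'ham', 'spam', or 'unsure'."""
    if score < HAM_CUTOFF:
        return "ham"
    if score >= SPAM_CUTOFF:
        return "spam"
    return "unsure"


def should_train(score: float, true_label: str) -> bool:
    """Train-on-errors rule: train only on outright misclassifications.

    Unsures are not errors, so they never trigger training here.
    """
    predicted = classify(score)
    return predicted != "unsure" and predicted != true_label


# (score, label the user says is correct); all values are made up.
messages = [
    (0.05, "ham"),   # correctly scored ham: not trained
    (0.97, "spam"),  # correctly scored spam: not trained
    (0.10, "spam"),  # spam that landed in the Inbox: error, trained
    (0.95, "ham"),   # ham that landed in the spam folder: error, trained
    (0.50, "spam"),  # unsure: not an error, not trained
]

trained = [(score, label) for score, label in messages if should_train(score, label)]
print(trained)
```

Only the two confidently misclassified messages are selected, which is why the database grows slowly under this tactic and why what it learns is skewed toward unusual mail. Moving the hypothetical cutoffs closer together would turn more unsures into outright errors and so change which messages get trained.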