SpamCopAndAssassin

Code to add to tokenize_headers() in tokenizer.py to generate tokens from SpamAssassin and SpamCop headers. There has been debate about whether this is wise, or whether the classifiers should stay independent, but if you want it, here it is.

Written and posted to spambayes@python.org by Mathew Hendry - thanks!

# Module level in tokenizer.py (requires the re module to be imported).
spamassassin_re = re.compile(r'tests=([A-Z0-9,_]+)')
...
        # X-Spam-Status:
        # Added by SpamAssassin (http://www.spamassassin.org)
        line = msg.get('x-spam-status')
        if line is not None:
            line = ''.join(line.split())
            for rules in spamassassin_re.findall(line):
                for rule in rules.split(','):
                    yield 'spamassassin:' + rule

        # X-SpamCop-Disposition:
        # Added by SpamCop Mail service (http://www.spamcop.net)
        line = msg.get('x-spamcop-disposition')
        if line is not None:
            for token in line.lower().split():
                yield 'spamcop:' + token
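
To see which tokens the code above would produce, here is a minimal standalone sketch. It wraps the same logic in a hypothetical generator, tokenize_spam_filter_headers() (not part of SpamBayes), and runs it on a made-up message. The header values are illustrative assumptions, not output captured from a real SpamAssassin or SpamCop installation.

```python
import re
from email import message_from_string

spamassassin_re = re.compile(r'tests=([A-Z0-9,_]+)')

def tokenize_spam_filter_headers(msg):
    """Hypothetical standalone wrapper around the snippet above."""
    # X-Spam-Status: stripping all whitespace rejoins a rule list that
    # was folded across several header lines.
    line = msg.get('x-spam-status')
    if line is not None:
        line = ''.join(line.split())
        for rules in spamassassin_re.findall(line):
            for rule in rules.split(','):
                yield 'spamassassin:' + rule

    # X-SpamCop-Disposition: one token per whitespace-separated word.
    line = msg.get('x-spamcop-disposition')
    if line is not None:
        for token in line.lower().split():
            yield 'spamcop:' + token

# Made-up sample message; the header values are assumptions for illustration.
raw = (
    "From: someone@example.com\n"
    "X-Spam-Status: Yes, hits=7.2 required=5.0\n"
    "\ttests=BAYES_99,HTML_MESSAGE,\n"
    "\tMIME_HTML_ONLY version=2.55\n"
    "X-SpamCop-Disposition: Blocked bl.spamcop.net\n"
    "\n"
    "body\n"
)

sample_tokens = list(tokenize_spam_filter_headers(message_from_string(raw)))
for tok in sample_tokens:
    print(tok)
```

For this sample the generator yields spamassassin:BAYES_99, spamassassin:HTML_MESSAGE, spamassassin:MIME_HTML_ONLY, spamcop:blocked and spamcop:bl.spamcop.net. The rule list is cut off at "version" because the regular expression only accepts upper-case letters, digits, commas and underscores.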
