spamassassin-users mailing list archives

From John Hardin <jhar...@impsec.org>
Subject Re: BAYES_00
Date Sat, 06 Oct 2012 19:36:49 GMT
On Sat, 6 Oct 2012, Arthur Dent wrote:

> Following a hard drive crash I am rebuilding my small home server on a
> Fedora17 platform.
>
> One of the casualties of the HD crash was my spam corpus. I had a (very
> old) backup which happened to include a previous spam corpus so I used
> that to sa-learn.
>
> All my messages hit BAYES_00.

Well, you're probably going to have to re-train from scratch.

Review every message in your training corpora to ensure they are properly 
classified.

Add a bunch of new ham and, if you have any, new spam.

Very old spam (say, more than 5 years old) may not be too useful, and
probably should be omitted, unless you have a very small spam corpus.
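
If you want to weed out the old spam mechanically rather than by eye, 
something along these lines might do as a starting point. This is only a 
rough sketch: the function name is_too_old and the max_age_years 
parameter are made up for illustration, it trusts the Date header (which 
spam frequently forges), and it leaves anything with a missing or 
unparseable date for manual review.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def is_too_old(raw_message, now, max_age_years=5):
    """Return True if the Date header is more than max_age_years before now.

    `now` must be a timezone-aware datetime. Hypothetical helper, not
    part of SpamAssassin.
    """
    msg = message_from_string(raw_message)
    date_header = msg.get("Date")
    if date_header is None:
        return False  # no Date header: keep the message for manual review
    try:
        sent = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        return False  # unparseable date: keep the message for manual review
    if sent.tzinfo is None:
        # A "-0000" zone parses to a naive datetime; treat it as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    # Approximate a year as 365 days; close enough for a rough cutoff.
    return now - sent > timedelta(days=365 * max_age_years)
```

You would call it once per message file with 
now=datetime.now(timezone.utc) and move the matches out of the spam 
corpus before training.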

Turn off autolearn. I'm in a similar situation and hand-training on the 
rare misses works great for me.

Also, given your low volume, I would recommend quarantining all spam 
rather than setting a discard threshold score above which spam is thrown 
out unseen. Any spam that does get delivered can be reviewed and added to 
your spam training corpus.

Zap your Bayes database, re-train and see how it goes.
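
For what it's worth, the zap-and-retrain sequence could be scripted 
roughly like this. Treat it as a sketch under assumptions: the function 
names and the ham_dir/spam_dir arguments are hypothetical, the corpora 
are assumed to be directories with one message per file (an mbox file 
would need sa-learn's --mbox option), and it should run as the user that 
owns the Bayes database. It defaults to a dry run that only prints the 
commands.

```python
import subprocess


def retrain_commands(ham_dir, spam_dir):
    """Build the sa-learn invocations for a from-scratch retrain."""
    return [
        ["sa-learn", "--clear"],           # wipe the existing Bayes database
        ["sa-learn", "--ham", ham_dir],    # learn the reviewed ham corpus
        ["sa-learn", "--spam", spam_dir],  # learn the reviewed spam corpus
        ["sa-learn", "--dump", "magic"],   # print database summary counters
    ]


def retrain(ham_dir, spam_dir, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    for cmd in retrain_commands(ham_dir, spam_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence if any step fails
            subprocess.run(cmd, check=True)
```

The final --dump magic step is just a sanity check that the ham and spam 
message counts look plausible after training. Autolearn is a separate 
matter: it is controlled by the bayes_auto_learn setting in the 
SpamAssassin configuration (0 turns it off).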

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   ...wind turbines are not meant to actually be an efficient way to
   supply the power grid, rather they're prayer wheels for New Age
   iBuddhists, their whirring blades drawing white guilt from the
   atmosphere and pumping it safely underground.                -- Tam
-----------------------------------------------------------------------
  Tomorrow: the first private ISS resupply mission (SpaceX/Dragon)
