spamassassin-users mailing list archives

From Reindl Harald <h.rei...@thelounge.net>
Subject Re: why does SA without autolearn need bayes read-write?
Date Wed, 28 Jan 2015 16:02:43 GMT

On 28.01.2015 at 16:52, Axb wrote:
> On 01/28/2015 04:38 PM, Reindl Harald wrote:
>>
>> is AFAIK relevant in the context of sa-learn, to not re-train the same
>> messages again and again - and it has its own bugs, because for a few
>> messages it contains random parts of the message itself; firing sa-learn
>> on the whole corpus would add these messages to "bayes_toks" each time
>>
>> see two example snippets below
>> hence it is that large here
>>
>> -rw------- 1 sa-milt sa-milt 5,4K 2015-01-28 16:34 bayes_journal
>> -rw------- 1 sa-milt sa-milt 1,3M 2015-01-28 16:12 bayes_seen
>> -rw------- 1 sa-milt sa-milt  40M 2015-01-28 16:33 bayes_toks
>> -rw------- 1 sa-milt sa-milt   98 2014-08-21 17:47 user_prefs
>> _________________________________________________
>
> something here does NOT make sense
>
> 1.3 MB of seen against 40MB tokens.
>
> someone please correct me if I'm wrong:
>
> afaik, this probably means you've deleted bayes_seen, so bayes has lost
> its record of what it has processed and will relearn stuff you
> already fed it.

no, i explained what happens in the part you stripped from the quote:
for a few messages it contains random, complete parts of the message itself,
regardless of how often i delete *any file* in the user home and rebuild from scratch

if i delete "bayes_seen", then it happens as part of a complete reset with
sa-learn.sh, which uses sa-learn to *rebuild from scratch* based on the
raw mails stored permanently in the folders "ham" and "spam"
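
For readers unfamiliar with such a reset, here is a minimal sketch of what a rebuild-from-scratch wrapper around sa-learn might look like. The actual sa-learn.sh is not shown in this thread, so the CORPUS path, the DRY_RUN switch and the function names are assumptions made for illustration; only the sa-learn options themselves (--clear, --ham, --spam, --sync, --dump magic) are standard.

```shell
#!/bin/sh
# Hypothetical reset sketch; the real sa-learn.sh is not shown in this thread.
# CORPUS is an assumed location for the stored raw mails ("ham" and "spam").
CORPUS="${CORPUS:-$HOME/corpus}"
# Assumed safety switch: with DRY_RUN=1 (the default here) commands are only
# printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rebuild_bayes() {
    # Wipe the existing Bayes database (tokens, seen list, journal).
    run sa-learn --clear
    # Re-train from the stored raw mails.
    run sa-learn --ham "$CORPUS/ham"
    run sa-learn --spam "$CORPUS/spam"
    # Merge pending journal entries, then print the database counters.
    run sa-learn --sync
    run sa-learn --dump magic
}

rebuild_bayes
```

Run with DRY_RUN=0 as the user that owns the Bayes files to actually execute the commands; comparing the counters from --dump magic after several rebuilds is one way to check whether the database keeps growing.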

