spamassassin-users mailing list archives

From: jdow <j...@earthlink.net>
Subject: Re: SOUGHT 2.0
Date: Fri, 05 Dec 2014 20:36:28 GMT

On 2014-12-05 08:28, Axb wrote:
> On 12/05/2014 05:20 PM, Kris Deugau wrote:
>> Axb wrote:
>>> On 12/05/2014 01:15 AM, Ian Zimmerman wrote:
>>>> On Thu, 04 Dec 2014 22:41:13 +0100,
>>>> Axb <axb.lists@gmail.com> wrote:
>>>>
>>>> Axb> To be able to create usable rules, several times/day I need feeds
>>>> Axb> to spit *at least* +150k/day. As I don't have the data....
>>>>
>>>> 150k of what?  Bytes?  Emails?  Tokens?
>>>
>>> Sorry, thought this was obvious...
>>>
>>> SOUGHT type rule generation extracts txt strings from spams so it means
>>> +150k spams/day
>>
>> It seems to work reasonably well for me with ~2-3K each ham and spam,
>> and even provides a handful of subrules even with ~225 spam subtype
>> messages.  (I generate a number of sets of rules with different subtypes
>> of spam.)
>>
>> It's probably not nearly as *effective* as it could be with larger
>> working sets.
>
> Agreed.
>
> ... I use about 5-15k from the last 8 hrs (amount varies dramatically) per rule
> gen run *for local* use, but that's hardly representative for global coverage.

Add LKML to your large batch of training email and I bet you get "interesting" 
results, at best.

And one must always remember that one person's spam is another person's ham.

{o.o}
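
For readers unfamiliar with the idea quoted above (extracting text strings from spam), here is a rough sketch of the general approach. It is not the actual SOUGHT implementation: the function name `candidate_phrases`, the word n-gram approach, and the `n` and `min_hits` parameters are all assumptions made for illustration. It keeps phrases that recur across spam messages and never appear in ham.

```python
from collections import Counter


def candidate_phrases(spam, ham, n=3, min_hits=2):
    """Return word n-grams seen in at least min_hits spam messages and in no ham.

    Hypothetical helper for illustration only; the real rule generator
    works differently and on far larger corpora.
    """
    def ngrams(text):
        words = text.lower().split()
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    # Every phrase that occurs anywhere in ham is disqualified.
    ham_seen = set()
    for msg in ham:
        ham_seen |= ngrams(msg)

    # Count each surviving phrase at most once per spam message.
    hits = Counter()
    for msg in spam:
        hits.update(ngrams(msg) - ham_seen)

    return sorted(p for p, c in hits.items() if c >= min_hits)


spam = [
    "Buy cheap pills now online",
    "buy cheap pills today",
    "meeting notes attached here",
]
ham = ["meeting notes attached here for review"]
print(candidate_phrases(spam, ham))
```

In a sketch like this, a small spam set leaves few phrases above the `min_hits` threshold, and a ham set that is not representative of everyone's mail (one person's spam being another's ham) fails to veto phrases that other users would consider legitimate.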
