Message-ID: <485F588B.4080106@coders.co.uk>
Date: Mon, 23 Jun 2008 09:02:19 +0100
From: Matt Hampton
To: Justin Mason
CC: dev@spamassassin.apache.org
Subject: Re: Creating auto-generated rule sets.....
References: <20080620090902.37ADD30C0F3@jmason.org>
In-Reply-To: <20080620090902.37ADD30C0F3@jmason.org>

Hi

Latest version is now available

>> http://www.coders.co.uk/80_sane.cf
>>
>>

I have expanded the support for clamav rule types, which has increased the base ruleset generated by about 40%.

> It might be worthwhile discarding rules that are less than a certain
> length, in characters.  That's another thing the "sought" ruleset does.
> it, again, reduces FPs nicely.
>

Which reduces it by 15%

> Also is there any way to get it to produce _more_ rules?  that 80_sane.cf
> seems pretty short, compared to the 60k-rules input ;)  Sounds like
> 0.5% spam hits is too high a threshold, I think.
>
>

Have tweaked this to 0.1%, and this is the basis for the ruleset available above.

>> It isn't automatically updating at the moment and all of the scores are
>> set to 0.01
>>
>
>

Still set to 0.01 - until I can work out a better scoring mechanism

> btw you can also safely drop the "require_version" line, that only makes
> sense as part of the SpamAssassin source tree.
>
>

Haven't done that yet!

matt
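
For anyone following along, the two filters discussed above (a minimum pattern length and a minimum spam hit rate) can be sketched roughly as below. This is only an illustration under stated assumptions, not the generator behind 80_sane.cf: the CandidateRule type, the function names, the 8-character minimum and the use of "body" rules are all hypothetical. Only the 0.1% threshold and the 0.01 score come from this thread.

```python
import re
from dataclasses import dataclass


@dataclass
class CandidateRule:
    name: str       # hypothetical rule name, e.g. "SANE_B"
    pattern: str    # literal text extracted from a signature
    spam_hits: int  # spam-corpus messages the pattern matched


def select_rules(candidates, corpus_size, min_length=8, min_hit_rate=0.001):
    """Keep candidates that are long enough and hit enough spam.

    min_length=8 is an assumed value (the thread names no number);
    min_hit_rate=0.001 is the 0.1% threshold mentioned above.
    """
    kept = []
    for rule in candidates:
        if len(rule.pattern) < min_length:
            continue  # short patterns are the ones most likely to FP
        if rule.spam_hits / corpus_size < min_hit_rate:
            continue  # too rare in the spam corpus to be worth a rule
        kept.append(rule)
    return kept


def to_cf(rule, score=0.01):
    """Render one rule as SpamAssassin config lines (assumes a body rule)."""
    # Escape regex metacharacters, plus the "/" used as the delimiter.
    escaped = re.escape(rule.pattern).replace("/", r"\/")
    return f"body {rule.name} /{escaped}/\nscore {rule.name} {score}\n"


if __name__ == "__main__":
    candidates = [
        CandidateRule("SANE_A", "short", 50),
        CandidateRule("SANE_B", "a much longer pattern", 3),
        CandidateRule("SANE_C", "another long pattern", 0),
    ]
    for rule in select_rules(candidates, corpus_size=1000):
        print(to_cf(rule), end="")
```

Lowering min_hit_rate lets more candidates through (more rules), while raising min_length removes the shortest ones, so the two knobs pull the size of the generated file in opposite directions.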