Date: Fri, 1 Sep 2006 18:10:33 -0400
From: Theo Van Dinter
To: Spamassassin Users List
Subject: Re: breaking out: thinking abt the 'sa-update *VS* rdj' thread .. .
Message-ID: <20060901221033.GE18729@kluge.net>

Wow... This mail has been sitting in my draft folder for a while, so I
figured I ought to get it out.

On Wed, Aug 16, 2006 at 12:24:04PM -0400, Chris Santerre wrote:
> I got nothing but love for you, so here goes ;) ...... :)
>
> > Chris! I'm surprised to hear you spreading this misinformation.
> > I don't really see how the project's rule development is a
> > clusterfsck.
> > People commit rules for testing, they get tested, if they're good
> > they're put in an update. What's the problem?
>
> 1) Manpower. You just don't have enough people devoted to rules. Not your
> fault. And solving this, would not help. Because of #2...
>
> 2) Open community. By nature the SA project has to be open. That means
> public corpus, public discussion lists, and public test results. SARE would
> not be as good if we had spammers watching our every move. MAJOR things we
> do MUST remain private. Our good results, the rules, are made public. And we
> offer them to anyone.

Well, I don't think that's really true at all. A lot of things are public,
some things aren't. For instance, we *don't* have a public corpus. Each
person's corpus is private, and they just send in the mass-check results,
which are public, but there's not a lot of information one can get out of
that IMO.

Test rules are public, which may or may not be problematic -- but since the
goal is to have the rule made public in the end anyway, I'm not sure there's
too much of an issue here. Generally speaking, if test rules are good they
should be published pretty quickly, so new rules still have an impact, even
if spammers actively pay attention to development and adjust their mails
accordingly. Based on current results, that doesn't seem to happen a lot.

(Currently, people tend to come up with test rules based on their own
private tests on their own corpus -- when something looks good, it gets
committed for wider testing, so rule development is still semi-private since
the method for deciding what rules to write is personal.)

> But since SARE's inception, you can't honestly tell me that SA has kept up
> with SARE's output. Be it quantity or quality.

I actually couldn't tell you: I specifically ignore SARE's non-donated
rules, and I have no insight into the development process used.

> But for what end? SARE gives you our best rules to be added. So what would
> we gain by becoming part of SA. Seems we would lose more having to be more
> open about what we do.

I have thoughts about this at the end, but as far as I can see: the main
project gets stronger, meaning the community is better served, and there's
really no downside. So why not?

> open corpus vs closed. Live feed testing vs overnight GA runs. No public
> eyes in our discussion lists. Incredibly easy rule testing tools vs GA runs.
> People in different parts of the industry more inclined to help and provide
> info simply because of anonymity. Cross project benefits, again due to
> anonymity.

live vs overnight mass-check runs (the GA was the tool used to generate
scores in the 2.x days, replaced by the perceptron -- which we don't run
nightly, or weekly, etc., but that's another discussion) is really just a
matter of putting in some effort to be able to do it.
We chose nightly and weekly because it seemed to be quick enough to test new
rules and be able to get them out, and slow enough that it doesn't
necessarily scare people away from volunteering.

public discussion lists -- not all of our lists are public, and the others
are generally invite-only, though we don't generally have a lot of those,
and most conversation happens in personal mails anyway.

"incredibly easy rule testing tools vs GA runs" -- I don't know what you
guys have (is there something easier/less involved than running the rules
over messages and looking at the results?), but if it's better than what's
in the project currently, why not contribute it?

"people in different ... anonymity" -- sure, though that's possible either
way. I really don't see the issue here.

> The question might be, what exactly does the SA project want of SARE? All we
> have to offer is rules, and we already give those up freely.

In short, I'd like to see our two groups merge. There are several issues
here:

1) Having multiple organizations providing rules is confusing/annoying to
users, as has been discussed previously on this list.

2) Duplicated effort. Why have multiple people working on multiple rules
that do the same thing? That's inefficient in various ways.

3) The SA project can't take the rules from SARE's site, they have to be
contributed. That doesn't actually happen very often. Most (all?) of the
SARE people who currently have commit access to the SA project haven't made
commits in a long time, if ever.

4) The SA project, as previously discussed, no longer has the manpower to
deal with both the engine and the rules with the detail and attention that
they deserve. This is bad.

5) Last, and perhaps most importantly, the SA project is the foundation of
the community around it, unsurprisingly. If the project doesn't work well,
be it engine or rules, people will give up and go elsewhere. Then both
groups, and all of our combined effort, go to waste.
That's very very bad.

I think we'd all be better served by having a single project w/ lots of
active development than by two semi-related projects which end up
duplicating effort and competing for the same scarce set of resources.

-- 
Randomly Generated Tagline:
"I am returning this otherwise good typing paper to you because someone
 has printed gibberish all over it and put your name at the top."
        - English Prof. at Ohio University