Subject: Re: I'm doing it wrong.
From: Karsten Bräckelmann
To: users@spamassassin.apache.org
Date: Fri, 23 May 2014 05:33:31 +0200
Message-Id: <1400816011.4835.140.camel@monkey>

On Thu, 2014-05-22 at 20:14 -0600, Kai Meyer wrote:
> I have a CentOS 6 postfix + dovecot + mysql (for vmail) + spamassassin
> (user prefs via mysql) server that I've been running for a few years

The configuration you pasted below does not show any user_* options.
Unless there are more cf files you omitted, you do not use user_prefs
via SQL.

> now. It's just a few of my private domains, not a lot of traffic. In the
> last 6 months, the amount of spam getting through has gone from one or
> two a week to 30 a day. I had sa-learn setup on imap folders called SPAM
> and HAM running as root, so I just started tossing emails in there. It

Training as root rather than the system user receiving the mail (and
calling SA) is only possible with site-wide Bayes setup. The pasted
configuration doesn't show that, either, so you would need to train as
the mail receiving / scanning user.

> seemed like I had groups of emails around 2, 0, -1, and -2 (my threshold
> to dump to my JUNK folder is 3, and I have spamchk sideline things above
> 7). I still get legitimate email in the 2-3 range, but I haven't had
> legitimate email above 3 in a long time. After a bit, the 2s became 3s
> and the 0s became 1s, but the -1 and -2 spam emails stayed put. I did
> this habitually for more than a month, and the progress seemed to stop.
> I googled around a bit and realized that I didn't do a very good job
> setting up rules, so I added pyzor and razor2, and they seem functional.
> Spam got better, and it's down to maybe 10 a day, but they still range
> all the way up to 5.

Mixing in Razor or Pyzor sure can help.
But that "setting up rules" you describe as your job is a bit odd. Local
rules can of course help too, but they are (a) an advanced topic, and
(b) not something a regular SA installation requires. Your configuration
doesn't show any of that either, so it's unclear what you are referring
to here.

> What really gets me is that if I take an email that scores -2, strip
> the X-Spam* headers, and run it through spamc by hand (even as the spamd
> user) just like the spamchk script does, it scores around a 4. I have

It is not necessary to strip X-Spam headers. SA ignores these, if
present.

You just mixed in a third user, spamd -- in addition to root and the
real mail receiving user. Without site-wide Bayes you are comparing
apples to oranges, and now peaches. All yummy, though not the same.

What is that "spamchk script" you just mentioned, and how does it fit
into your setup? You should review your entire mail-processing chain.
Describing it in detail might help here, too.

> one here that scores a 4.1 if it comes through the mail, and a 6.6 if I
> run it manually. What can I do to reconcile these scores? I would like
> the scores I'm getting from the commandline over the ones I'm getting
> through postfix, but I don't know the system well enough to know what is
> causing the difference.

Highlighting the differences, removing common rule hits:

> ================== Via postfix
>  0.0 HTML_IMAGE_RATIO_08    BODY: HTML has a low ratio of text to image
>                             area

> ================ Via commandline (cat test.mail | sudo -u spamd
> /usr/bin/spamc -u > postsa.mail)
>  2.5 URIBL_DBL_SPAM         Contains an URL listed in the DBL blocklist

The Bayesian probability is nearly identical, differing by merely a
thousandth.

Hitting URIBL_DBL_SPAM in the later manual check, but not at receiving
time, may be due to timing: the URI may only have been listed after the
message was received.

What's odd is that the subsequent manual check is *missing* the HTML
image ratio rule hit. Something altered the message.
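
To see at a glance which rules hit on only one side, the two results can be compared mechanically. The following is a minimal sketch, not part of SA: it assumes both copies of the message carry an X-Spam-Status header with a tests= list, and the helper names (rule_hits, compare) as well as the two sample messages are made up for illustration. Only the two differing rule names are taken from the report above.

```python
import re
from email import message_from_string


def rule_hits(raw_message):
    """Return the set of rule names from the tests= part of X-Spam-Status."""
    msg = message_from_string(raw_message)
    status = str(msg.get("X-Spam-Status", ""))
    # The header is usually folded after commas; join the list back together.
    status = re.sub(r",\s+", ",", status)
    match = re.search(r"tests=([A-Z0-9_,]+)", status)
    if not match:
        return set()
    return {name for name in match.group(1).split(",") if name}


def compare(first, second):
    """Return (rules only in first, rules only in second), each sorted."""
    a, b = rule_hits(first), rule_hits(second)
    return sorted(a - b), sorted(b - a)


# Illustrative samples only; the common rule HTML_MESSAGE is an assumption.
VIA_POSTFIX = (
    "X-Spam-Status: Yes, score=4.1 required=3.0 tests=HTML_IMAGE_RATIO_08,\n"
    "\tHTML_MESSAGE autolearn=no\n"
    "Subject: sample\n\nbody\n"
)
VIA_COMMANDLINE = (
    "X-Spam-Status: Yes, score=6.6 required=3.0 tests=HTML_MESSAGE,\n"
    "\tURIBL_DBL_SPAM autolearn=no\n"
    "Subject: sample\n\nbody\n"
)

only_postfix, only_commandline = compare(VIA_POSTFIX, VIA_COMMANDLINE)
print("only via postfix:    ", only_postfix)
print("only via commandline:", only_commandline)
```

This only tells you which rules differ, not why. A body rule that hits on one copy and not the other still points at the message itself having been changed somewhere in the chain.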
> ================ /etc/mail/spamassassin.cf (I added the last 4 lines in
> a desperate attempt to see something change, but to no effect)
> /etc/mail/spamassassin/local.cf

Which one? The latter, spamassassin/local.cf, is the default (though
packager dependent); the first one (a typo?) is custom, if it exists at
all.

Snip, skipping to the last four lines:

> auto_learn 0
> use_razor2
> use_dcc
> use_pyzor

auto_learn is not a valid option. That would be bayes_auto_learn. The
other use_* options require arguments (0 or 1). The lines as pasted do
not enable them, and instead produce lint warnings. See

  spamassassin --lint

That lint check is a good starting point anyway...

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}