spamassassin-users mailing list archives

From Ned Slider
Subject Re: Lowering spam threshold
Date Wed, 06 Jul 2011 18:46:08 GMT
On 06/07/11 09:17, Lars Jørgensen wrote:
>> I think many people run with tag at 5.0 and discard at 10.0
> I should have mentioned that we are running amavisd-new. I thought that was the de facto
> way of integrating spamassassin into a mail gateway, but reading this list reveals that most
> people probably don't do that. Makes me wonder if I am doing the wrong thing?
> Amavisd-new has further settings as to thresholds, and these are the ones I put in as
> of today (after reading other people's tips here, thank you everybody):
> $sa_tag_level_deflt  = -10;  # add spam info headers if at, or above that level
> $sa_tag2_level_deflt = 5.2;  # add 'spam detected' headers at that level
> $sa_kill_level_deflt = 6.2;  # triggers spam evasive actions (e.g. blocks mail)
> $sa_dsn_cutoff_level = 7.4;  # spam level beyond which a DSN is not sent
> Do the above scores make sense?

Yes, that makes perfect sense to other amavisd-new users. I currently tag at 
5.0 (the default SA required score) and quarantine at 6.0. I also set the DSN 
cut-off level to the same value as quarantine, as I don't want to send DSNs.
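
For reference, the thresholds just described would look roughly like this in amavisd.conf. This is only a sketch: the variable names are the ones quoted earlier in this thread, the values are the ones stated above, and everything else (the tag level, where quarantined mail goes, the final spam destiny) is left out because it isn't given here.

```perl
# amavisd.conf fragment (sketch, not a complete configuration)
$sa_tag2_level_deflt = 5.0;  # add 'spam detected' headers at the SA default threshold
$sa_kill_level_deflt = 6.0;  # quarantine from this score upwards
$sa_dsn_cutoff_level = 6.0;  # same value as the kill level, with the aim of not sending DSNs
```

Whether a DSN is sent at all also depends on other amavisd-new settings, so treat the cut-off line as one part of that picture rather than the whole of it.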

If you are finding that spam is getting through untagged with the default SA 
score of 5.0 then I would look to write some additional rules to target 
the spam that is getting through rather than lowering the score below 
the SA default of 5.0. This list can help you with that if you provide 

Additionally, I have very carefully hand-trained Bayes with only 
confirmed spam/ham and tweaked the scores to be more representative of 
the faith I have in my Bayes data. I find many cases where Bayes alone 
will identify spam, and I have scored BAYES_99 accordingly.

The main "problem" I see with SA is that I reject all the easy spam 
(>90%) at the SMTP level, so SA only really gets to see the more 
difficult and less obvious stuff. If SA saw all spam then the detection 
rates out of the box would be extremely high, but with only the more 
difficult samples to chew on, its measured detection rates inevitably look 
lower than they really are. As a result it can appear that a lot of spam is 
getting through when in reality the overall percentage is still really 
small. That last 1% is just hard to catch without increasing the risk of 
false positives.
