spamassassin-users mailing list archives

From "richard@buzzhost.co.uk" <rich...@buzzhost.co.uk>
Subject Re: DNSBL Comparison 20091114
Date Mon, 16 Nov 2009 06:00:07 GMT
On Sun, 2009-11-15 at 20:34 +0000, Justin Mason wrote:
> On Sun, Nov 15, 2009 at 08:53, richard@buzzhost.co.uk
> <richard@buzzhost.co.uk> wrote:
> > On Sun, 2009-11-15 at 03:14 -0500, Warren Togami wrote:
> >> http://mail-archives.apache.org/mod_mbox/spamassassin-users/200910.mbox/%3C4AD11C44.9030201@redhat.com%3E
> >> Compare this report to a similar report last month.
> >>
> >> http://wiki.apache.org/spamassassin/NightlyMassCheck
> >> The results below are only as good as the data submitted by nightly
> >> masscheck volunteers.  Please join us in nightly masschecks to increase
> >>   the sample size of the corpora so we can have greater confidence in
> >> the nightly statistics.
> >>
> >> http://ruleqa.spamassassin.org/20091114-r836144-n
> >> Spam 131399 messages from 18 users
> >> Ham  189948 messages from 18 users
> >>
> >> ============================
> >> DNSBL lastexternal by Safety
> >> ============================
> >> SPAM%    HAM%    RANK RULE
> >> 12.8342% 0.0021% 0.94 RCVD_IN_PSBL *
> >> 12.3053% 0.0026% 0.94 RCVD_IN_XBL
> >> 31.2499% 0.0827% 0.87 RCVD_IN_ANBREP_BL *2
> >> 80.2578% 0.1485% 0.86 RCVD_IN_PBL
> >> 27.1836% 0.1985% 0.79 RCVD_IN_SORBS_DUL
> >> 19.8213% 0.1785% 0.79 RCVD_IN_SEMBLACK *
> >> 90.9360% 0.3854% 0.77 RCVD_IN_BRBL_LASTEXT
> >> 13.0564% 0.4838% 0.67 RCVD_IN_HOSTKARMA_BL *
> >>
> >> Commentary:
> >> * PSBL and XBL lead in apparent safety.
> >> * ANBREP was added after the October report and has made a surprisingly
> >> strong showing in this past month.  ANBREP is currently unavailable to
> >> the general public.  The list owner is thinking about going public with
> >> the list, which I would encourage because they are clearly doing
> >> something right.  It seems he would need a global network of automated
> >> mirrors to be able to scale.  He would also need listing/delisting
> >> policy clearly stated on a web page somewhere.
> >> * SEMBLACK consistently has been performing adequately in safety while
> >> catching a respectable amount of spam.  I personally use this
> >> non-default blacklist.
> >> * It is clear that the two main blacklists are Spamhaus and BRBL.  The
> >> Zen combination of Spamhaus zones is extremely effective and generally
> >> safe.  BRBL has a high hit rate as well, with a moderate safety rating.
> >> * HOSTKARMA_BL ranks dead last in safety for the past several weeks in a
> >> row, while not being more effective against spam than PSBL, XBL or SEMBLACK.
> >>
> >> ===============================
> >> HOSTKARMA_BL much better as URIBL
> >> ===============================
> >> SPAM%    HAM%    RANK RULE
> >> 68.3651% 0.2806% 0.79 URIBL_HOSTKARMA_BL *
> >>
> >> Commentary:
> >> While HOSTKARMA_BL is pretty unsafe as a plain DNSBL, it is surprisingly
> >> effective as a URIBL.  This is curious as it seems it was not designed
> >> to be used as a URIBL.  In any case as long as our masschecks show good
> >> statistics like this, I will personally use this on my own spamassassin
> >> server.
> >>
> >> =========================
> >> SPAMCOP Dangerous?
> >> =========================
> >> SPAM%    HAM%    RANK RULE
> >> 17.4225% 2.6076% 0.56 RCVD_IN_BL_SPAMCOP_NET *
> >>
> >> Commentary:
> >> Is Spamcop seriously this bad?  It has consistently shown a high false
> >> positive rate in these past weeks.  Was it safer than this in the past
> >> to warrant the current high score in spamassassin-3.2.5?
> >>
> >> Warren Togami
> >> wtogami@redhat.com
> >
> > Is it not a bit flawed to do the metrics on volunteer submissions, given
> > that Spamhaus is said to have a small army of them? It means the data
> > cannot be relied upon as any kind of sensible comparison.
> 
> please explain.  How would you suggest measuring false positives?
> 
Do you think that volunteer submissions are an accurate way to do them,
or do you think that is open to abuse?

For example, say I am Steve Linford with a small army of volunteers. I
see a few false positives come in from Spamhaus, and a few from SORBS.
What is my inclination when I submit the data?

It takes only a small amount of research and a trawl through the NANAE
archives to get a handle on the problem, and the general abuse and
nefarious goings on with DNSBL volunteers. It is fair to say that there
is not much love lost.

I'm not pretending I have the answers, so it's probably better to take
these lists with a large bucket of salt and find how any given DNSBL
list works for a given organisation.
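
As a rough illustration of checking a list against your own mail, here
is a minimal sketch. It assumes you already have a hand-sorted corpus
and, for each message, a flag saying whether the DNSBL listed the
sending IP. The names hit_rates and approx_hits are made up for this
example and are not part of SpamAssassin or the masscheck tools. It
yields SPAM% and HAM% in the style of the report, but not RANK, since
the report does not say how that is derived.

```python
from typing import Iterable, Tuple


def hit_rates(samples: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (SPAM%, HAM%) for one DNSBL.

    Each sample is (is_spam, listed): is_spam is your own hand
    classification, listed is whether the DNSBL hit the message.
    """
    spam_total = spam_hits = ham_total = ham_hits = 0
    for is_spam, listed in samples:
        if is_spam:
            spam_total += 1
            spam_hits += int(listed)
        else:
            ham_total += 1
            ham_hits += int(listed)

    def pct(hits: int, total: int) -> float:
        # Avoid dividing by zero on an empty corpus.
        return 100.0 * hits / total if total else 0.0

    return pct(spam_hits, spam_total), pct(ham_hits, ham_total)


def approx_hits(percent: float, total: int) -> int:
    """Convert a reported percentage back into an approximate message count."""
    return round(percent / 100.0 * total)


if __name__ == "__main__":
    # Hypothetical six-message corpus, purely for illustration.
    corpus = [(True, True), (True, False),
              (False, False), (False, True), (False, False), (False, False)]
    print(hit_rates(corpus))
    # Ham count and HAM% taken from the quoted report (RCVD_IN_PSBL).
    print(approx_hits(0.0021, 189948))
```

Converting the quoted HAM% figures back into message counts this way
shows how few ham hits sit behind the safest-looking entries, which is
one more reason to measure on your own traffic.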
 
In a world where presidents and world leaders in America, Zimbabwe and
Afghanistan get 'elected' on tainted data, some random RBL 'comparison'
list is trivial by comparison. It must, however, be duly remembered
that there are many competing 'sides' in the world of DNSBLs, each
looking to discredit the others.

Perhaps Jim, as you posed the question, you have some strong feelings
on the matter that you would like to share?

