Justin,
> Mark Martinec writes:
> > As a curiosity (but off topic), harvesting results from p0f
> > (passive operating system fingerprinting), here are two more:
> > http://www.ijs.si/software/amavisd/fig1.gif
> > Spam score vs. IP distance in hops (our server is
> > in European academic network Geant)
> > And perhaps most interesting of all (but again OT):
> > http://www.ijs.si/software/amavisd/fig2.gif
> > Spam score distribution as a percentage of all mail,
> > separated by the operating system of the sending mail client.
> That's excellent data! Mind if I forward that around to another
> list or two?
I don't mind.
> The "hops" measurement is particularly interesting. Have you got that
> implemented as a working rule, in the field? is it expensive?
Yes, it is implemented in the field: it comes with the latest
amavisd-new-2.4.0. It inserts one header field with the collected
information into the mail header, making it available to SA to score as it
wishes (custom rules, bayes). It could probably just as well be implemented
as an SA plugin (making use of the supplied lightweight p0f-analyzer.pl
interface to p0f), but it was easier for me to do it in amavisd-new, where
the remote SMTP client's IP address is accessible directly, with no need to
parse the header and understand the topology.
It is reasonably inexpensive: the cost of running the p0f utility is
comparable to running tcpdump, and it takes about one hour of CPU time per
month on our medium-busy mailer. The rest is negligible: no additional
latency and no additional network traffic.
The most interesting part in my view is not the IP distance, but the
type of OS, illustrated by the following table (derived from the same
data as fig2):
  p0f OS guess     ham   :  spam
  --------------------------------
  Windows-XP       0.7 % : 99.3 %
  Windows-2000     5.8 % : 94.2 %
  UNKNOWN         16.5 % : 83.5 %
  Linux           58.8 % : 41.2 %
  Unix            80.3 % : 19.7 %
  (Unix+Linux     66.5 % : 33.5 %)
Only 0.7 % of all mail coming from Windows-XP hosts is ham!
This is ideal information for contributing two or three score points.
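To put the table in perspective, the percentages can be turned into
spam-to-ham odds. The following is only a small Python sketch of that
arithmetic; it uses nothing but the figures from the table, and the names
RATIOS and spam_odds are made up for illustration:

```python
# Ham/spam percentages copied from the table above:
# p0f OS guess -> (ham %, spam %)
RATIOS = {
    "Windows-XP": (0.7, 99.3),
    "Windows-2000": (5.8, 94.2),
    "UNKNOWN": (16.5, 83.5),
    "Linux": (58.8, 41.2),
    "Unix": (80.3, 19.7),
}


def spam_odds(ham_pct, spam_pct):
    """Return how many spam messages arrive per one ham message."""
    return spam_pct / ham_pct


# Print the odds for every OS guess in the table.
for os_guess, (ham, spam) in RATIOS.items():
    print(f"{os_guess:13s} {spam_odds(ham, spam):8.2f} spam per ham")
```

Odds well above 1 argue for a positive score, odds below 1 for a negative
one; how many points to assign remains a local judgment call.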
Traffic from a site's own PC clients must not be seen by p0f, otherwise one
would be penalizing the site's own users. This can be achieved either by
separating the MSA from the MTA, or by using a list of internal IP networks
for exclusion.
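The exclusion by internal networks amounts to a simple membership test. A
minimal Python sketch of the idea follows; it is not how amavisd-new does
it internally, and the network list and the function names is_internal and
should_fingerprint are hypothetical:

```python
import ipaddress

# Hypothetical list of a site's internal networks (example ranges only).
INTERNAL_NETS = [
    ipaddress.ip_network(net)
    for net in ("127.0.0.0/8", "10.0.0.0/8", "192.0.2.0/24")
]


def is_internal(client_ip):
    """True if the SMTP client's address lies in one of the internal nets."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in INTERNAL_NETS)


def should_fingerprint(client_ip):
    """Only external clients should contribute OS-based score points."""
    return not is_internal(client_ip)
```

With this list, should_fingerprint("192.0.2.66") is false, while a client
at 198.51.100.7 would be fingerprinted.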
A quick summary from amavisd-new-2.4.0 release notes:
- experimental support for passive operating system fingerprinting with
the use of an externally running utility p0f, supplying the collected
information as a header field to SpamAssassin, making it possible to add
rules to score SMTP client hosts based on an educated guess about their
operating system type and IP distance; see below for details;
Here are the installation details:
- passive operating-system fingerprinting (p0f) support lets SA gain
information about the SMTP client's operating system and estimated IP
distance, and can reduce the number of bounces:
* find and install the p0f utility: http://lcamtuf.coredump.cx/p0f.shtml
or in FreeBSD ports collection as 'net-mgmt/p0f';
* start a p0f process on the same host where MTA (MX) is running, making
it listen only to incoming TCP sessions (to reduce its workload) to the
IP address and TCP port (25) where MTA is accepting incoming mail from
outside (it doesn't hurt to let it see other traffic too, it just isn't
needed); after testing p0f alone and seeing that it works, you may start
it up, feeding its output to program p0f-analyzer.pl that comes with
amavisd-new package, e.g.:
    p0f -l 'tcp dst port 25' 2>&1 | p0f-analyzer.pl 2345 &
on multi-homed boxes one may need to specify the interface and the IP
address where the MTA is listening; the filter syntax is the same as in
tcpdump, e.g.:
    p0f -l -i bge0 'dst host 192.0.2.66 and tcp dst port 25' 2>&1 \
      | p0f-analyzer.pl 2345 &
* the program p0f-analyzer.pl reads p0f reports on stdin, keeps a cache
for a limited time (10 minutes, configurable) of data about incoming TCP
sessions organized by remote IP address, and listens on UDP port 2345
(specified as its command line argument) for queries; only queries from
allowed IP addresses are accepted and responded to, other queries are
silently ignored - configure @inet_acl accordingly, defaults to 127.0.0.1;
* adding the following line to amavisd.conf, matching the chosen port
number to the one specified on the command line to the p0f-analyzer.pl:
    $os_fingerprint_method = 'p0f:127.0.0.1:2345';
makes amavisd send queries to p0f-analyzer.pl (on the supplied IP address
and UDP port number) to collect information about the remote SMTP client's
OS; the collected response is then supplied as a header field when
SpamAssassin is invoked; the query/response is very quick and imposes no
burden on the amavisd process, nor does it extend its processing time.
The $os_fingerprint_method setting is also a member of policy banks, which
makes it easy to disable fingerprinting for mail from the site's own SMTP
clients, e.g.:
    $policy_bank{'MYNETS'}{os_fingerprint_method} = undef;
* one may now add scoring rules to SA local.cf file, e.g.:
    header L_P0F_WXP   X-Amavis-OS-Fingerprint =~ /^Windows XP/
    score  L_P0F_WXP   3.5
    header L_P0F_W     X-Amavis-OS-Fingerprint =~ /^Windows(?! XP)/
    score  L_P0F_W     1.7
    header L_P0F_UNKN  X-Amavis-OS-Fingerprint =~ /^UNKNOWN/
    score  L_P0F_UNKN  0.8
    header L_P0F_Unix  X-Amavis-OS-Fingerprint =~ /^((Free|Open|Net)BSD|Solaris|HP-UX|Tru64)/
    score  L_P0F_Unix  -1.0
It is also possible to add a score based on the estimated IP distance, for
example to slightly favor nearer hosts (this is probably good for Europe
or academic/university networks, and possibly less useful elsewhere):
    header L_P0F_D1234 X-Amavis-OS-Fingerprint =~ /\bdistance [1-4](?![0-9])/
    header L_P0F_D5    X-Amavis-OS-Fingerprint =~ /\bdistance 5(?![0-9])/
    header L_P0F_D6    X-Amavis-OS-Fingerprint =~ /\bdistance 6(?![0-9])/
    header L_P0F_D7    X-Amavis-OS-Fingerprint =~ /\bdistance 7(?![0-9])/
    header L_P0F_D8    X-Amavis-OS-Fingerprint =~ /\bdistance 8(?![0-9])/
    header L_P0F_D9    X-Amavis-OS-Fingerprint =~ /\bdistance 9(?![0-9])/
    header L_P0F_D10   X-Amavis-OS-Fingerprint =~ /\bdistance 10(?![0-9])/
    header L_P0F_D11   X-Amavis-OS-Fingerprint =~ /\bdistance 11(?![0-9])/
    score  L_P0F_D1234 -0.5
    score  L_P0F_D5    -0.5
    score  L_P0F_D6    -0.5
    score  L_P0F_D7    -0.5
    score  L_P0F_D8    -0.5
    score  L_P0F_D9    -0.4
    score  L_P0F_D10   -0.3
    score  L_P0F_D11   -0.3
* make sure that @mynetworks is configured correctly, otherwise you will be
inappropriately penalizing mail from internal hosts running Windows!
Other ways to turn off fingerprinting for our own SMTP client hosts
are to put $os_fingerprint_method in policy banks, and/or to specify
a more selective packet filter on the p0f command line;
* based on statistics, less than 0.7 % of mail coming from external
Windows XP-based hosts is ham, yet 20 % of all spam is coming from
external Windows XP hosts; amavisd-new suppresses bounces to external
Windows XP hosts, reducing bounce pollution. The amavisd-agent utility
now provides some additional statistics based on p0f information.
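The caching behaviour described for p0f-analyzer.pl above (entries keyed by
remote IP address, a limited lifetime, and answers only to allowed
queriers) can be illustrated with a small Python sketch. This is not the
real p0f-analyzer.pl code or its wire protocol; the class FingerprintCache
and its methods are hypothetical, and only the 10-minute lifetime and the
127.0.0.1 default are taken from the notes:

```python
import time


class FingerprintCache:
    """Hypothetical stand-in for the cache kept by p0f-analyzer.pl."""

    def __init__(self, ttl_seconds=600, allowed_clients=("127.0.0.1",)):
        self.ttl = ttl_seconds               # 10 minutes, as in the notes
        self.allowed = set(allowed_clients)  # plays the role of @inet_acl
        self.entries = {}                    # remote IP -> (time, report)

    def record(self, remote_ip, report, now=None):
        """Remember the latest p0f report seen for a remote IP address."""
        now = time.time() if now is None else now
        self.entries[remote_ip] = (now, report)

    def expire(self, now=None):
        """Drop entries that are older than the configured lifetime."""
        now = time.time() if now is None else now
        cutoff = now - self.ttl
        self.entries = {
            ip: entry for ip, entry in self.entries.items()
            if entry[0] >= cutoff
        }

    def query(self, asker_ip, remote_ip, now=None):
        """Answer a query, or return None if the asker is not allowed."""
        if asker_ip not in self.allowed:
            return None                      # silently ignored
        self.expire(now)
        entry = self.entries.get(remote_ip)
        return entry[1] if entry else None
```

Passing the time explicitly (the now argument) is just a convenience that
makes the expiry easy to try out without waiting ten minutes.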
Some statistics collected from our logs in February 2006:
  p0f OS guess     ham   :  spam
  --------------------------------
  Windows-XP       0.7 % : 99.3 %
  Windows-2000     5.8 % : 94.2 %
  UNKNOWN         16.5 % : 83.5 %
  Linux           58.8 % : 41.2 %
  Unix            80.3 % : 19.7 %
  (Unix+Linux     66.5 % : 33.5 %)
(ham: mail with score below 3, spam: score above 6)
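To see how the example local.cf rules would combine for a given header
value, they can be mimicked outside SpamAssassin. The Python sketch below
is only an approximation for experimenting; the rule names and scores are
the ones from the example rules, while the sample header values and the
function fingerprint_score are made up (the real header text may look
different):

```python
import re

# OS rules from the local.cf example: (rule name, pattern, score).
# The Unix pattern is grouped here so that ^ applies to every alternative.
OS_RULES = [
    ("L_P0F_WXP", re.compile(r"^Windows XP"), 3.5),
    ("L_P0F_W", re.compile(r"^Windows(?! XP)"), 1.7),
    ("L_P0F_UNKN", re.compile(r"^UNKNOWN"), 0.8),
    ("L_P0F_Unix",
     re.compile(r"^((Free|Open|Net)BSD|Solaris|HP-UX|Tru64)"), -1.0),
]

# Distance scores from the example: 1..8 hops -0.5, 9 -0.4, 10 and 11 -0.3.
DISTANCE_SCORES = {d: -0.5 for d in range(1, 9)}
DISTANCE_SCORES.update({9: -0.4, 10: -0.3, 11: -0.3})
DISTANCE_RE = re.compile(r"\bdistance (\d+)")


def fingerprint_score(header_value):
    """Return the list of (rule name, score) pairs that would fire."""
    hits = [(name, score) for name, pattern, score in OS_RULES
            if pattern.search(header_value)]
    match = DISTANCE_RE.search(header_value)
    if match:
        distance = int(match.group(1))
        if distance in DISTANCE_SCORES:
            name = "L_P0F_D1234" if distance <= 4 else f"L_P0F_D{distance}"
            hits.append((name, DISTANCE_SCORES[distance]))
    return hits


# Made-up sample values, only to exercise the rules.
for sample in ("Windows XP, (distance 14)", "FreeBSD 6.x, (distance 3)"):
    hits = fingerprint_score(sample)
    print(sample, "->", hits, "total", sum(score for _, score in hits))
```

A Windows XP client 14 hops away would pick up only the OS rule, while a
nearby FreeBSD host would collect both the Unix and the distance bonus.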
Mark