spamassassin-sysadmins mailing list archives

From Fossies Administrator <Jens.Schleuse...@fossies.org>
Subject Some interesting (?) observations on a mirror server (sa-update.fossies.org)
Date Thu, 20 Sep 2018 19:50:57 GMT
Hi,

Some weeks ago I happened to look at the web server access log of the
SpamAssassin rules update mirror sa-update.fossies.org and was surprised
to find that at noon (midday) the log file was already much larger than
the roughly half of a complete daily log that I had expected.

Out of curiosity I plotted the number of GET requests for update files
(tarballs) per hour and saw an interesting pattern with a large peak
between 6 and 7 a.m. (GMT+2). OK, the main reason is probably the
publication time (mostly between 5 and 6 a.m. GMT+2) plus a delay until
the users' sa-update scripts run. But the structure of the curves, with
some curious (?) minima, is a little bit surprising to me, although it is
constant and reproducible.
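
For illustration, a minimal Python sketch of such an hourly count could
look like the following. It is not the script I used. It assumes the
access log is in the common "combined" format and that update tarballs can
be recognized by a ".tar.gz" suffix; the function names are made up for
this example.

```python
import re
from collections import Counter

# Assumption: "combined"-style log lines such as
#   1.2.3.4 - - [20/Sep/2018:06:15:01 +0200] "GET /x.tar.gz HTTP/1.1" 200 ...
# Adjust the pattern if the server logs in a different format.
LOG_RE = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):(?P<hour>\d{2}):\d{2}:\d{2} [^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def tarball_gets_per_hour(lines, suffix=".tar.gz"):
    """Count GET requests for paths ending in `suffix`, keyed by hour (0-23)."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m and m["method"] == "GET" and m["path"].endswith(suffix):
            counts[int(m["hour"])] += 1
    return counts


def text_plot(counts, width=50):
    """Render a crude horizontal bar chart with one row per hour of the day."""
    peak = max(counts.values(), default=0)
    rows = []
    for hour in range(24):
        n = counts.get(hour, 0)
        bar = "#" * (n * width // peak if peak else 0)
        rows.append(f"{hour:02d} {n:6d} {bar}")
    return "\n".join(rows)
```

Feeding the lines of one daily log file to tarball_gets_per_hour() and
printing text_plot() of the result gives a text plot with one bar per
hour, in the server's local time zone as written in the log.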

A simple example text plot for a single day is attached (more accurate 
plots are available under the URL given below).

More interesting and "irritating", however, was the fact that during the
main update time I often found many entries (at least 100-1000) with the
HTTP status 404 ("Not Found"). That motivated me to write a primitive
script to analyze the cause by monitoring the update status and the update
times of the newly published rules update files.

First I checked the local web log files, assuming that a 404 response to a
request for an update file means that an external client already knew
about a new file that the local mirror sa-update.fossies.org did not yet
have available, i.e. had not yet fetched (via rsync).
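
As a rough sketch of this kind of check (again not my actual script), the
following Python function collects, per requested file, the number of 404
responses, the time of the first 404 and the time of the first successful
(200) response. The difference between the two times is an estimate of how
long the mirror lagged behind. It assumes "combined"-style log lines in
chronological order and English month abbreviations in the timestamps; the
function name is hypothetical.

```python
import re
from datetime import datetime

# Assumption: the timestamp is in square brackets and the request line and
# status follow it, as in the common "combined" access log format.
LINE_RE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "GET (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def missing_windows(lines):
    """Map each requested path to its 404 count and first 404 / first 200 times."""
    stats = {}
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        # %b matches English month names under the default C locale.
        when = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        entry = stats.setdefault(
            m["path"], {"n404": 0, "first_404": None, "first_200": None}
        )
        if m["status"] == "404":
            entry["n404"] += 1
            if entry["first_404"] is None:
                entry["first_404"] = when
        elif m["status"] == "200" and entry["first_200"] is None:
            entry["first_200"] = when
    return stats
```

A file with many 404 entries followed by a first 200 a few minutes later
fits the assumption above: clients asked for it before the mirror had it.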

Additionally I queried the local DNS server (of the server provider) and
the DNS servers I found to be responsible for the domain spamassassin.org

  ns2.pccc.com.
  ns2.ena.com.
  c.auth-ns.sonic.net.
  b.auth-ns.sonic.net.
  a.auth-ns.sonic.net.

via the command

  dig @<server> 3.3.3.updates.spamassassin.org txt +short
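
To run that command against all of the servers in one go, a small Python
wrapper along the following lines would do. It is only a sketch: the
function names are invented for this example, the server list is the one
given above, and the parser assumes that "dig +short" prints the TXT value
as a single string in double quotes.

```python
import subprocess

# The nameservers listed above for spamassassin.org.
NAMESERVERS = [
    "ns2.pccc.com.",
    "ns2.ena.com.",
    "c.auth-ns.sonic.net.",
    "b.auth-ns.sonic.net.",
    "a.auth-ns.sonic.net.",
]
QUERY_NAME = "3.3.3.updates.spamassassin.org"


def dig_command(server, name=QUERY_NAME):
    """Build the dig call shown above for one nameserver."""
    return ["dig", f"@{server}", name, "txt", "+short"]


def parse_txt(output):
    """Return the first double-quoted TXT value in dig's output, or None."""
    for line in output.splitlines():
        line = line.strip()
        if len(line) >= 2 and line.startswith('"') and line.endswith('"'):
            return line[1:-1]
    return None


def query_all(servers=NAMESERVERS, run=subprocess.run):
    """Ask every server for the TXT record; map server name -> value or None."""
    versions = {}
    for server in servers:
        result = run(dig_command(server), capture_output=True, text=True, timeout=10)
        versions[server] = parse_txt(result.stdout)
    return versions
```

Calling query_all() at regular intervals and logging the result with a
timestamp shows when each server starts to announce a new version.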

You can find the plots and an extract of the script output under

  https://fossies.org/~schleusener/sa-update.mirror_analysis/
   User: sa
   PW: update

The main reason for the 404 errors seems to be that the mirroring script 
is started as a cron job on sa-update.fossies.org only every 10 minutes.

It would probably be better to query the nameservers responsible for the
domain directly (the local nameserver, honoring the TTL, answers with a
freshness delay of up to one hour) and to start an rsync job only if the
response shows that a new file is available.
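
The decision logic could be as small as the following Python sketch. It is
an idea, not something that runs on the mirror today. It assumes the
version string has already been obtained from one of the responsible
nameservers (for example with the dig command above), that the last synced
version is kept in a small dict, and that the caller supplies the rsync
command line; all names here are hypothetical.

```python
import subprocess


def sync_if_new(current_version, state, sync_cmd, run=subprocess.run):
    """Run `sync_cmd` only if `current_version` differs from the last synced one.

    `state` is a dict that stores the last successfully synced version under
    the key "version". Returns True if a sync was started, False otherwise.
    """
    if not current_version or current_version == state.get("version"):
        return False  # DNS lookup failed, or nothing new has been announced
    result = run(sync_cmd)
    if result.returncode == 0:
        # Remember the version only after a successful transfer, so that a
        # failed rsync is retried on the next call.
        state["version"] = current_version
    return True
```

Because the DNS query is cheap, such a check could run far more often than
every 10 minutes without starting an rsync job each time.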

If all mirror servers used update intervals of at most 10 minutes, another
idea might be to set/change the DNS TXT entry only 10 minutes after the
release (availability) of a new update file.

Additionally I found that the synchronization of the above DNS servers 
seems to be delayed by some minutes. The "best" DNS server seems to be 
"ns2.ena.com", since it is always the first one to provide the new version.

Maybe this behaviour is a little bit related to the current thread with 
the subject "repeated sa-update problems" on the users list.

Regards

Jens

-- 
FOSSIES - The Fresh Open Source Software archive
mainly for Internet, Engineering and Science
https://fossies.org/