Mailing-List: contact sysadmins-help@spamassassin.apache.org; run by ezmlm
Reply-To: sysadmins@spamassassin.apache.org
Date: Thu, 20 Sep 2018 21:50:57 +0200 (CEST)
From: Fossies Administrator
To: sysadmins@spamassassin.apache.org
Subject: Some interesting (?) observations on a mirror server (sa-update.fossies.org)
User-Agent: Alpine 2.20 (LSU 67 2015-01-07)
MIME-Version: 1.0
Content-Type: multipart/mixed; BOUNDARY="-913647343-523047512-1537473057=:14588"

---913647343-523047512-1537473057=:14588
Content-Type: text/plain; format=flowed; charset=US-ASCII

Hi,

some weeks ago I happened to look at the web server access log of the SpamAssassin rules update mirror sa-update.fossies.org and was surprised to find that at noon (midday) the log file was already much larger than the roughly expected half of a complete daily log.
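The noon observation above can be checked by bucketing the requests by hour. The following is only a minimal sketch, not the script actually used on the mirror: it assumes the access log is in Apache "combined" format, and the names `hourly_counts` and `share_before_noon` are hypothetical. It also tallies 404 responses per hour, which the message goes on to discuss.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format, e.g.
#   192.0.2.1 - - [13/Aug/2018:06:15:02 +0200] "GET /file HTTP/1.1" 200 1234
# Captures the hour, the request method and the HTTP status code.
LINE_RE = re.compile(
    r'\[\d{2}/\w{3}/\d{4}:(\d{2}):\d{2}:\d{2} [^\]]+\] "(\S+) [^"]*" (\d{3})'
)


def hourly_counts(lines):
    """Count GET requests per hour, and separately those answered with 404."""
    total = Counter()
    not_found = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None or match.group(2) != "GET":
            continue  # skip unparsable lines and non-GET requests
        hour = int(match.group(1))
        total[hour] += 1
        if match.group(3) == "404":
            not_found[hour] += 1
    return total, not_found


def share_before_noon(total):
    """Fraction of all counted requests that arrived before 12:00."""
    count = sum(total.values())
    if count == 0:
        return 0.0
    return sum(c for hour, c in total.items() if hour < 12) / count
```

Feeding the function an open log file (`hourly_counts(open(path))`, with `path` being whatever the local log location is) gives the per-hour totals; a value of `share_before_noon` well above 0.5 corresponds to the "more than half by noon" effect described above.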

Out of curiosity I plotted the number of GET requests for update files (tarballs) per hour and saw an interesting pattern with a large peak between 6 and 7 a.m. (GMT+2). The main reason is probably the publication time (mostly between 5 and 6 a.m. GMT+2), followed by a delay until the users' sa-update scripts run. But the shape of the curves, with some curious (?) minima, is a bit surprising to me, although it is constant and reproducible. A simple example text plot for a single day is attached (more accurate plots are available under the URL given below).

More interesting and "irritating" was the fact that during the main update time I often found (at least 100-1000) entries with the HTTP status 404 ("Not Found"). That motivated me to write a primitive script to analyze the reason by monitoring the update status and the update times of newly published rules update files.

First I checked the local web log files, assuming that a 404 response to a request for an update file means that an external client already knew about a new file that the local mirror sa-update.fossies.org had not yet fetched (via rsync). Additionally I checked the local DNS server (of the server provider) and the DNS servers I found responsible for the domain spamassassin.org

   ns2.pccc.com.
   ns2.ena.com.
   c.auth-ns.sonic.net.
   b.auth-ns.sonic.net.
   a.auth-ns.sonic.net.

via the command

   dig @ 3.3.3.updates.spamassassin.org txt +short

The plots and an extract of the script output can be found under

   https://fossies.org/~schleusener/sa-update.mirror_analysis/
   User: sa
   PW: update

The main reason for the 404 errors seems to be that the mirroring script on sa-update.fossies.org is started as a cron job only every 10 minutes. It would probably be better to query the original (authoritative) name servers, since the local name server, because of the TTL, answers with a freshness delay of up to one hour, and to start an rsync job only if the response shows that a new file is available.
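
The "ask the original name servers first, then sync" idea could be sketched roughly as follows. This is an illustration under stated assumptions, not the mirror's actual script: it assumes the TXT answer is a plain integer update version, the function names (`dig_txt`, `parse_txt`, `newest_published`, `should_sync`) are hypothetical, and the rsync call itself is deliberately left out because its source is not given here.

```python
import subprocess


def parse_txt(output):
    """Extract an integer version from `dig ... txt +short` output.

    Assumption: the answer is a quoted decimal number; returns None otherwise.
    """
    for line in output.splitlines():
        value = line.strip().strip('"')
        if value.isdigit():
            return int(value)
    return None


def dig_txt(server, name):
    """Run `dig @server name txt +short` and return the parsed version or None."""
    result = subprocess.run(
        ["dig", f"@{server}", name, "txt", "+short"],
        capture_output=True, text=True, timeout=10, check=False,
    )
    return parse_txt(result.stdout)


def newest_published(answers):
    """Highest version reported by any name server (servers may lag each other)."""
    versions = [v for v in answers.values() if v is not None]
    return max(versions) if versions else None


def should_sync(local_version, answers):
    """True if some name server announces a version newer than the local one."""
    newest = newest_published(answers)
    if newest is None:
        return False  # no usable answer, so nothing is known to be new
    return local_version is None or newest > local_version
```

A cron job could build `answers` as `{s: dig_txt(s, "3.3.3.updates.spamassassin.org") for s in servers}` with the name servers listed above and start the rsync job only when `should_sync` returns true. Taking the maximum over all servers is a design choice for the case where the servers do not all publish a new version at the same moment.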

If all mirror servers synchronized at least every 10 minutes, another idea would be to set or change the DNS TXT entry only 10 minutes after the release (availability) of a new update file.

Additionally I found that the synchronization of the above DNS servers seems to be delayed by some minutes. The "best" DNS server seems to be "ns2.ena.com", since it is always the first one to provide the new versions.

Maybe this behaviour is somewhat related to the current thread with the subject "repeated sa-update problems" on the users list.

Regards

Jens

-- 
FOSSIES - The Fresh Open Source Software archive
mainly for Internet, Engineering and Science
https://fossies.org/

---913647343-523047512-1537473057=:14588
Content-Type: text/plain; name=sa-update.mirror_requests.180813.txt
Content-Transfer-Encoding: BASE64
Content-Description:
Content-Disposition: attachment; filename=sa-update.mirror_requests.180813.txt

ICAgICAgICBOdW1iZXIgb2YgaG91cmx5IEdFVCByZXF1ZXN0cyBmb3IgU3Bh
bUFzc2Fzc2luDQogICAgICAgIHVwZGF0ZSBmaWxlcyBvbiBtaXJyb3IgInNh
LXVwZGF0ZS5mb3NzaWVzLm9yZyINCg0KICA0MDA4ICstLS0tLS0tLS0rLS0t
LS0tLS0tKy0tLS0tLS0tLSstLS0tLS0tLS0rLS0tLS0tLSsNCiAgICAgICB8
ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICB8DQogICAgICAgfCAgICAgICAgICAgICsgICAgICAgICAgICByZXF1ZXN0
cy9ob3VyICstLS0tLSsgfA0KICAgICAgIHwgICAgICAgICAgICB8ICAgICAg
ICAgICAgICAgICAgICAgICAgICAgICAgICAgIHwNCiAgMzAwOCArICAgICAg
ICAgICB8IHwgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICArDQog
ICAgICAgfCAgICAgICAgICAgfCAgfCAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgfA0KICAgICAgIHwgICAgICAgICAgfCAgICsgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgICAgIHwNCiAgICAgICB8ICAgICAgIC0rLSsg
ICAgfCAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICB8DQogIDIwMDgg
KyAgICAgICsgICAgICAgIHwgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgKw0KICAgICAgIHwgICAgIC8gICAgICAgICAgfCAgICAgICAgICAgICAg
ICAgICAgICAgICAgICAgIHwNCiAgICAgICB8IC0rLSsgICAgICAgICAgICst
ICAgICAgICAgICAgICAgICAgICAgICAgICAgICB8DQogICAgICAgfCsgICAg
ICAgICAgICAgICAgICstKy0rLSAgICAgICAgICAgICAgICAgICAgICAgfA0K
ICAxMDA4ICsgICAgICAgICAgICAgICAgICAgICAgICArLSAgICAgICAgICAg
ICAgIC0rLSstICsNCiAgICAgICB8ICAgICAgICAgICAgICAgICAgICAgICAg
ICArLSAgICAgICAgICAgLSsgICAgICt8DQogICAgICAgfCAgICAgICAgICAg
ICAgICAgICAgICAgICAgICArLSAtKy0rLSstKyAgICAgICAgfA0KICAgICAg
IHwgICAgICAgICAgICAgICAgICAgICAgICAgICAgICArICAgICAgICAgICAg
ICAgIHwNCiAgICAgMCArLS0tLS0tLS0tKy0tLS0tLS0tLSstLS0tLS0tLS0r
LS0tLS0tLS0tKy0tLS0tLS0rDQogICAgICAgMCAgICAgICAgIDUgICAgICAg
ICAxMCAgICAgICAgMTUgICAgICAgIDIwICAgICAgMjQNCiAgICAgICAgICAg
ICAgICAgICAgICAgICAgIGhvdXIgKEdNVCsyKQ0KDQo=

---913647343-523047512-1537473057=:14588--