spamassassin-sysadmins mailing list archives

From Dave Jones <da...@apache.org>
Subject Re: sa-update ruleset updates enabled again
Date Tue, 21 Nov 2017 13:51:20 GMT
On 11/21/2017 05:36 AM, Kevin A. McGrail wrote:
> On 11/21/2017 12:33 AM, Matthias Leisi wrote:
>> In addition to server-side blocking, would it make sense for 
>> sa-update to rate-limit itself?
>>
>> — Matthias
> Yes, good idea.  Added to my notes.
>

Based on my logs, it appears that the 3.4.x sa-update clients check the DNS 
TXT record, compare it to their last update, and don't pull anything if it 
is unchanged.  From the user agents in the web logs, it looks like only the 
very old 3.3.x sa-update clients are misbehaving, so they need mod_evasive.  
Then there are curl clients pounding away, accounting for the vast majority 
(> 80%) of hits, which also need mod_evasive/.htaccess.
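
A minimal sketch of the compare-then-skip behaviour described above. It is not sa-update's actual code: the function name `needs_update` is hypothetical, and it assumes the TXT record carries a single integer version, possibly wrapped in quotes.

```python
from typing import Optional


def needs_update(txt_record: str, last_installed: Optional[int]) -> bool:
    """Return True when the version published in DNS differs from the local one."""
    # Assumption: the TXT payload is a bare integer, possibly quoted.
    published = int(txt_record.strip().strip('"'))
    # Fetch only if nothing is installed yet or the published version changed.
    return last_installed is None or published != last_installed
```

A client that runs this check first makes one cheap DNS query per run and touches the mirrors only when the ruleset has actually changed.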

Did some versions of sa-update ever use curl to do the fetching or are 
these home-grown scripts trying to pull down for their own private mirror?

>>
>> Sent from my iPhone
>>
>>> On 21.11.2017 at 03:53, Kevin A. McGrail 
>>> <kevin.mcgrail@mcgrail.com> wrote:
>>>
>>>> On 11/20/2017 7:17 PM, Dave Jones wrote:
>>>> Could we use something like mod_evasive to limit any IP connecting 
>>>> more than 3 times (one batch of ruleset files) an hour? SA 
>>>> instances behind NAT'd IPs could cause a legitimate reason for more 
>>>> than 2x hits per day.
>>> I'd like to keep it simpler for now.  The abuse hasn't been too bad.
>>>
>>> I've put them on notice on the users@ list and I'm going to look at 
>>> adding more information such as a unique id to sa-update's call for 
>>> wget/curl so we can identify NAT'ing.
>>>
>>>> There may be some abusers in the future that we would want to 
>>>> permanently block with a centralized .htaccess file that gets 
>>>> distributed with the normal rsync pulls by each mirror.
>>> Agreed.  Let's keep an eye on things.
>>>
>>> So, from the last 3.8 million GETs, the top 14 IPs:
>>>
>>> (grep GET sa-update.pccc.com-access_log | awk -F" " '{ print $1 }' | 
>>> sort | uniq -c | sort -n -r | head -n 14)
>>>
>>>   964649 52.169.9.191 (Machine we already had taken care of)
>>>    71273 176.61.138.136
>>>    40397 41.76.211.56
>>>    22535 108.163.197.66
>>>    21100 108.61.28.10
>>>    21037 79.137.36.178
>>>    20270 149.56.17.151
>>>    19826 91.204.24.253
>>>    18141 178.32.88.139
>>>    18003 207.210.201.60
>>>    14037 158.69.200.153
>>>    12539 78.229.96.116
>>>    12525 37.221.192.173
>>>    11568 45.77.52.43
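
For anyone who would rather run the same tally from a script, the grep/awk/sort/uniq pipeline quoted above can be approximated as follows. This is only a sketch: the function name `top_clients` is hypothetical, and it assumes the client IP is the first whitespace-separated field of each log line.

```python
from collections import Counter


def top_clients(log_lines, n=14):
    """Count GET requests per client IP, most frequent first."""
    counts = Counter()
    for line in log_lines:
        if "GET" not in line:        # same filter as `grep GET`
            continue
        fields = line.split()        # whitespace split, like awk's default
        if fields:
            counts[fields[0]] += 1   # first field, like awk's $1
    return counts.most_common(n)
```

Passing an open file handle for the access log as `log_lines` should give the same ranking as the shell pipeline, returned as `(ip, count)` pairs.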

Looks like there is some overlap in the top IPs.  For the most part people 
are being good with sa-update.  It seems to be a small percentage that we 
need to address with mod_evasive and/or .htaccess.  My 2 mirrors for 
sa-update.ena.com served about 63 GB yesterday.  The hourly download volume 
seems to match the rule promotions at 2:30 AM UTC and 9 AM UTC.  The 
majority of the daily total, 56 GB of the 63 GB, came between 2:30 AM UTC 
and about 6 AM UTC.  This is the middle of the night for US Central TZ, 
which works out pretty well for us.
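
The limit quoted in this thread (no more than 3 connections per IP per hour) could be sketched as a sliding-window counter. This is not how mod_evasive is implemented or configured. The class name `HourlyLimiter` and its parameters are hypothetical and only illustrate the policy.

```python
from collections import defaultdict, deque


class HourlyLimiter:
    """Allow at most `max_hits` requests per IP within `window_seconds`."""

    def __init__(self, max_hits=3, window_seconds=3600):
        self.max_hits = max_hits
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, now):
        recent = self.hits[ip]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_hits:
            return False                # over the limit: reject this request
        recent.append(now)
        return True
```

Because the counter is keyed on the source IP alone, several SA instances behind one NAT'd address share a single budget, which is the false-positive risk raised in the thread.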

>>>>>> Here are the top 10 IPs that seem to be running sa-update or a 
>>>>>> curl script most frequently:
>>>>>>
>>>>>> 41.76.211.56 (sa-update/svn917659/3.3.2 every 5 minutes)
>>>>>> 108.61.28.10 (sa-update/svn917659/3.3.2 every 15 minutes)
>>>>>> 202.191.60.145 (curl/7.19.7 every minute rotating mirrors)
>>>>>> 202.191.60.146 (curl/7.19.7 every minute rotating mirrors)
>>>>>> 108.163.197.66 (sa-update/svn917659/3.3.2 every 5 minutes)
>>>>>> 208.74.121.106 (NAT'd IP? curl/7.29.0 & curl/7.19.7)
>>>>>> 91.204.24.253 (NAT'd IP? various user agents)
>>>>>> 207.210.201.60
>>>>>> 78.110.96.3
>>>>>> 190.0.150.3
>>>>>>
>>>>>> -- 
>
>

