spamassassin-sysadmins mailing list archives

From: "Kevin A. McGrail" <kevin.mcgr...@mcgrail.com>
Subject: Re: sa-vm1.apache.org is unresponsive
Date: Tue, 09 Jan 2018 13:13:20 GMT
Offlist reply by accident.  Repeating...

On 1/8/2018 9:52 AM, Dave Jones wrote:
> On 01/08/2018 08:06 AM, Kevin A. McGrail wrote:
>> On 1/8/2018 8:01 AM, Dave Jones wrote:
>>> I know the nightly rules promotion script hits the ruleqa site to 
>>> make sure there have been 3 days of successful masschecks so if we 
>>> did add a captcha, that would have to be excluded.
>>>
>>> I have the web logs from the ruleqa site going to their own files
>>> now, with awstats set up to give a quick overview of what is going
>>> on.  I was planning on letting this ride for a bit to see what kind
>>> of activity the ruleqa site normally gets. I don't think it gets hit
>>> much now that the bots are taken care of.
>>>
>>> Also, I installed NRPE on the box and am monitoring it more closely 
>>> via Icinga.  I will have graphs of memory usage and get alerts when 
>>> the memory is being exhausted again. Hopefully if this happens 
>>> again, I can get into the box before OOM killer starts whacking 
>>> processes. 
>> Thanks.  Out of interest, is the lack of swap what is completely
>> killing the box?  I'm used to DDoSes, but I don't understand why
>> this one is taking down the whole box.
>
> The lack of swap is not the direct problem, but it certainly is making
> it hard to troubleshoot the actual problem.  I really think it's odd
> for infra not to set up swap space.  I know they said it was bad for
> their SAN, and I suppose it would be if it were on SSDs and VMs
> were constantly swapping.
>
> We normally monitor our VMs so that we get alerts when memory is 
> getting near exhaustion and swap is being used so it doesn't impact 
> our SAN.  At least this gives us some time to get into the box, see 
> what is going on, and restart processes before the whole system is 
> unresponsive.
>
> We build our VMs with swap space and it runs on our Compellent
> SAN with SSDs without a problem.  Not sure why infra doesn't.  You
> just need to make sure to mount everything with the "discard" option
> to play nice with most SANs' virtual block allocation and freeing.
>
> My Icinga memory graphs are showing the used RAM hovering so far 
> around 6 GB.  It dropped to around 3.0 GB last night when the hourly 
> ruleqa updates weren't happening in cron and the masscheck was 
> running.  We seem to be wasting a lot of RAM in that VM right now 
> after getting Apache HTTPD under control by blocking the bad bots.  We 
> will have more informative graphs after more time has passed. 
Yeah, I was confused about the swap space as well, but they give us more
RAM easily.

Let me know in a week or so and we can ask them to lower the RAM if you
want.

Regards,
KAM
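
Editorial note for readers of the archive: the kind of memory alerting Dave describes above (get warned before RAM is exhausted on a box without swap) can be sketched as a small Nagios/Icinga-style check. This is only an illustration, not the check actually running on sa-vm1. The function names and the 80%/90% thresholds are assumptions, and it assumes a Linux kernel new enough to report MemAvailable in /proc/meminfo.

```python
"""Hypothetical sketch of a Nagios/Icinga-style memory check (not the real sa-vm1 check)."""

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# Assumed thresholds, in percent of RAM in use; tune for the box in question.
WARN_PCT = 80.0
CRIT_PCT = 90.0


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value (kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_percent(fields):
    """Percentage of RAM in use, treating MemAvailable as reclaimable."""
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def classify(pct, warn=WARN_PCT, crit=CRIT_PCT):
    """Map a usage percentage to a Nagios status code."""
    if pct >= crit:
        return CRITICAL
    if pct >= warn:
        return WARNING
    return OK


def main(path="/proc/meminfo"):
    """Print a one-line status with perfdata and return the exit code."""
    with open(path) as fh:
        fields = parse_meminfo(fh.read())
    pct = used_percent(fields)
    status = classify(pct)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    print(f"MEMORY {label} - {pct:.1f}% used | used_pct={pct:.1f}%;{WARN_PCT:.0f};{CRIT_PCT:.0f}")
    return status
```

To use it as a plugin, one would call `sys.exit(main())` at the bottom of the script and point an NRPE command definition at it. With no swap as a buffer, the warning threshold is the only early signal, so it is worth setting it conservatively.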
