httpd-users mailing list archives

From Matthew Tice <mjt...@gmail.com>
Subject Re: [users@httpd] Re: High load using memcache and 9G tmpfs
Date Mon, 20 Jul 2009 13:41:40 GMT
On Mon, Jul 20, 2009 at 7:00 AM, Nicholas Sherlock <n.sherlock@gmail.com> wrote:

> Matthew Tice wrote:
>
>> Currently we're migrating our static node cluster from 32bit OpenSuse 10.3
>> using the disk_cache_module on a 2G tmpfs to a 64bit CentOS 5.3 using the
>> disk_cache module on a 9G tmpfs.  After pushing these CentOS nodes into
>> production (and consequently adding many more requests) we started seeing a
>> load spike on these systems.  Preliminary tests have shown that using a 2G
>> (maybe 3G - still testing that one) tmpfs on the same CentOS node doesn't
>> have the same high load.  I'm not sure if this is a bug with tmpfs,
>> Apache/disk_cache, CentOS, or what.  Any insight into this strange problem
>> would be appreciated.
>>
>
> I had this problem on my server where the system service "mlocate" was
> scheduled to run every day. It basically scans every file on the system, and
> with the huge numbers of files generated by disk_cache, it took more than a
> day to finish one scan. So the next day, there were two running mlocate
> instances. Then three. Then no legitimate IO requests were being serviced
> and the whole server ground to a halt. The load average skyrocketed because
> of all the waiting processes. "mlocate" didn't show up on 'top' because it
> used almost no CPU time. I diagnosed the problem with 'iotop' - it gives
> per-process IO stats.
>
> This is probably not the same problem you're having, but iotop is still a
> useful tool to identify IO competition when you can't find the culprit based
> on CPU-time.
>
> Cheers,
> Nicholas Sherlock
>
Thanks Nicholas, I'll take a look at that.  I had htcacheclean running
every 5 minutes, which could have caused the bulk of my problems.  I changed
the daemon to run every 30 minutes instead, which seems to have helped a
little.  The machine isn't quite as sluggish, but the load is still hovering
around 2 (5-minute average).

Matt
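
For readers following the iotop suggestion above, here is a rough sketch of the same idea: ranking processes by I/O rather than by CPU time. It is not how iotop itself is implemented. It assumes a Linux kernel that exposes /proc/<pid>/io and /proc/<pid>/comm, and it usually needs root to read other users' processes. The helper names (parse_proc_io, top_io_processes) are made up for illustration. Note that it reports cumulative bytes since each process started, not a live rate, and that I/O served entirely from tmpfs may not be counted in these fields.

```python
import os


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def top_io_processes(proc_root="/proc", limit=10):
    """Return up to `limit` (total_bytes, pid, name) tuples, largest first."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no procfs available on this system
    rows = []
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "io")) as f:
                stats = parse_proc_io(f.read())
            with open(os.path.join(proc_root, entry, "comm")) as f:
                name = f.read().strip()
        except OSError:
            continue  # process exited, or we lack permission to read it
        total = stats.get("read_bytes", 0) + stats.get("write_bytes", 0)
        rows.append((total, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for total, pid, name in top_io_processes():
        print(f"{pid:>7} {name:<20} {total}")
```

Running it twice a few seconds apart and comparing the totals gives a crude rate, which is enough to spot a process like mlocate that is busy on disk while using almost no CPU.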
