From Andrew Purtell <apurt...@apache.org>
Subject Re: [Kosmosfs-users] too many disk IOs / chunkserver suddenly consumes ~100GB of address space
Date Fri, 17 Apr 2009 11:16:53 GMT

Hi Sriram,

> Can you tell me the exact steps to repro the problem:
>  - What version of Hbase?

SVN trunk, 0.20.0-dev

>  - Which version of Heritrix?

Heritrix 2.0, plus the HBase writer which can be found here:

> What is happening is that the KFS chunkserver is sending
> writes down to disk and they aren't coming back "soon
> enough", causing things to backlog; the chunkserver is
> printing out the backlog status message.

I wonder if this might be a secondary effect. Just before
these messages begin streaming into the log, the chunkserver
suddenly balloons its address space from ~200KB to ~100GB.
The two events are strongly correlated and occur in the same
order, repeatably.

Once the backlog messages begin, no further IO completes as
far as I can tell. The count of outstanding IOs 
monotonically increases. Also, the metaserver declares the
chunkserver dead.

I can take steps to help diagnose the problem. Please advise.
Would it help if I reproduced the problem with chunkserver
logging at DEBUG and then posted the compressed logs
somewhere?
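
Alongside the DEBUG logs, timestamped samples of the chunkserver's
address space might help line up the growth against the first
backlog message. Below is a minimal sketch, assuming a Linux host
where /proc/<pid>/status reports a VmSize line; the function names
and the one-second polling interval are illustrative and not part
of KFS.

```python
import time


def parse_vmsize_kb(status_text):
    """Return VmSize in kB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmSize:"):
            # Expected line shape: "VmSize:    204800 kB"
            return int(line.split()[1])
    return None


def watch_vmsize(pid, interval_s=1.0, samples=10):
    """Print a timestamped VmSize sample for `pid` every `interval_s` seconds.

    Hypothetical helper: `pid` is the chunkserver's process id.
    """
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as f:
            size_kb = parse_vmsize_kb(f.read())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} VmSize={size_kb} kB")
        time.sleep(interval_s)
```

Running watch_vmsize with the chunkserver's pid while the crawl is
in progress would give timestamps that can be compared directly
with those in the chunkserver log.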

