hbase-dev mailing list archives

From Sriram Rao <srirams...@gmail.com>
Subject Re: [Kosmosfs-users] too many disk IOs / chunkserver suddenly consumes ~100GB of address space
Date Fri, 17 Apr 2009 17:22:44 GMT
Hi Andy,

Thanks for the info.  From what you are seeing, it looks like
something is causing memory bloat, which in turn triggers too many
disk IOs, and eventually everything grinds to a halt.  Since it is
easy to reproduce, I can give it a try on my machine.  Can you send
me the inputs to Heritrix and any config info for HBase?

It'd also be good if you could get me the chunkserver logs.  In the
Chunkserver.prp file, can you add the following line:

chunkServer.loglevel = DEBUG

and then send me the file or upload it?  The easier option is to file
a bug about this issue on the KFS SourceForge page and then upload
the chunkserver logs there.
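
For reference, with that line added, a Chunkserver.prp would look
roughly like the sketch below.  Only the chunkServer.loglevel line
matters here; the other property names and values are illustrative
placeholders and may be spelled differently in your install, so keep
whatever you already have:

# metaserver this chunkserver registers with (placeholder values)
chunkServer.metaServer.hostname = kfs-meta.example.com
chunkServer.metaServer.port = 30000
# where this chunkserver listens and keeps its chunks/logs (placeholders)
chunkServer.clientPort = 30001
chunkServer.chunkDir = /data/kfs/chunks
chunkServer.logDir = /data/kfs/logs
# the line to add for this investigation
chunkServer.loglevel = DEBUG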

I definitely want to get to the bottom of this, since it is blocking you and Ryan.

Sriram

On Fri, Apr 17, 2009 at 4:16 AM, Andrew Purtell <apurtell@apache.org> wrote:
>
> Hi Sriram,
>
>> Can you tell me the exact steps to repro the problem:
>>  - What version of HBase?
>
> SVN trunk, 0.20.0-dev
>
>>  - Which version of Heritrix?
>
> Heritrix 2.0, plus the HBase writer which can be found here:
> http://code.google.com/p/hbase-writer/
>
>> What is happening is that the KFS chunkserver is sending
>> writes down to disk and they aren't coming back "soon
>> enough", causing things to backlog; the chunkserver is
>> printing out the backlog status message.
>
> I wonder if this might be a secondary effect. Just before
> these messages begin streaming into the log, the chunkserver
> suddenly balloons its address space from ~200KB to ~100GB.
> These two things are strongly correlated and happen in the
> same order in a repeatable manner.
>
> Once the backlog messages begin, no further IO completes as
> far as I can tell. The count of outstanding IOs
> monotonically increases. Also, the metaserver declares the
> chunkserver dead.
>
> I can take steps to help diagnose the problem. Please advise.
> Would it help if I replicated the problem again with
> chunkserver logging at DEBUG and then posted the compressed
> logs somewhere?
>
> [...]
>> On Thu, Apr 16, 2009 at 12:27 AM, Andrew Purtell <apurtell@apache.org> wrote:
>> >
>> > Hi,
>> >
>> > Like Ryan I have been trying to run HBase on top of
>> > KFS. In my case I am running an SVN snapshot from
>> > yesterday. I have a minimal installation of KFS
>> > metaserver, chunkserver, and HBase master and
>> > regionserver all running on one test host with 4GB of
>> > RAM. Of course I do not expect more than minimal
>> > function. To apply some light load, I run the Heritrix
>> > crawler with 5 TOE threads, which write on average
>> > 200 Kbit/sec of data into HBase. HBase flushes this
>> > incoming data in ~64MB increments and also runs
>> > occasional compaction cycles in which the 64MB flush
>> > files are compacted into ~256MB files.
>> >
>> > I find that for no obvious reason the chunkserver will
>> > suddenly grab ~100 GIGAbytes of address space and emit
>> > a steady stream of "(DiskManager.cc:392) Too many disk
>> > IOs (N)" to the log at INFO level, where N is a
>> > steadily increasing number. The host is under moderate
>> > load at the time -- KFS is busy -- but it is not swapping
>> > and, according to atop, has some disk I/O and network
>> > bandwidth to spare.
> [...]
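
In the meantime, here is a rough sketch of a small monitor that could
help line up the address-space balloon with the first "Too many disk
IOs" message in the log.  It is not part of KFS; it just polls VmSize
from /proc on a Linux host for whatever pid you point it at, so the
pid argument, poll interval, and output format are all placeholders
you can change:

#!/usr/bin/env python
# Rough monitoring sketch (not part of KFS): poll the chunkserver's
# VmSize once a second and print a timestamped line whenever it
# changes, so the growth can be lined up against the chunkserver log.
import sys
import time

def vm_size_kb(pid):
    # /proc/<pid>/status reports VmSize in kB on Linux
    with open("/proc/%d/status" % pid) as f:
        for line in f:
            if line.startswith("VmSize:"):
                return int(line.split()[1])
    return 0

def main():
    pid = int(sys.argv[1])  # chunkserver pid, passed on the command line
    last = -1
    while True:
        size = vm_size_kb(pid)
        if size != last:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print("%s VmSize: %d kB" % (stamp, size))
            sys.stdout.flush()
            last = size
        time.sleep(1)

if __name__ == "__main__":
    main()

Running that alongside the DEBUG run and redirecting its output to a
file should tell us whether the balloon really does precede the
backlog messages.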
