Date: Thu, 23 Apr 2009 12:17:00 -0700
From: Joshua Taylor <joshua.taylor@gmail.com>
To: Sriram Rao
Cc: apurtell@apache.org, hbase-dev@hadoop.apache.org, kosmosfs-users@lists.sourceforge.net
Subject: Re: [Kosmosfs-users] too many disk IOs / chunkserver suddenly consumes ~100GB of address space

I have a bit more info about this problem. It just happened to one of my
chunkservers as well. I had replaced all log4cpp calls with fprintf to avoid
random log4cpp crashes (a separate problem), so the chunkserver is already
doing debug logging. Here's the log from the beginning of the bloat/overload:

04-23-2009 05:42:26.625 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 32636928 numBytes: 131072
04-23-2009 05:42:26.630 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 32768000 numBytes: 131072
04-23-2009 05:42:26.633 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 32899072 numBytes: 131072
04-23-2009 05:42:26.636 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 33030144 numBytes: 131072
04-23-2009 05:42:26.638 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 33161216 numBytes: 131072
04-23-2009 05:42:26.641 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 33292288 numBytes: 131072
04-23-2009 05:42:26.648 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 49677 chunkversion = 1 offset: 16276417 numBytes: 9223372036854775807
04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5001) outstanding...overloaded
04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5002) outstanding...overloaded
04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5003) outstanding...overloaded
04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5004) outstanding...overloaded
04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5005) outstanding...overloaded
04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5006) outstanding...overloaded
04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5007) outstanding...overloaded
04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5008) outstanding...overloaded
04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5009) outstanding...overloaded
04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5010) outstanding...overloaded
04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5011) outstanding...overloaded
04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5012) outstanding...overloaded
04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5013) outstanding...overloaded
04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5014) outstanding...overloaded
04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5015) outstanding...overloaded
04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5016) outstanding...overloaded

The line where numBytes == 2^63 - 1 is rather suspicious.

I'm not running Heritrix/HBase/etc. -- this is just a few file copies. I have
8 chunkservers total, and only one is having this problem. The other
chunkservers are still logging that they're making progress (reading and
writing chunks).
Here's a stack trace from the server currently pegged at 100% CPU, 200 GB VmSize:

Thread 5 (Thread 0x40bd9950 (LWP 3838)):
#0  0x00000030df80af19 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1  0x000000000047f2d7 in KFS::MetaThread::sleep ()
#2  0x000000000047f317 in KFS::MetaQueue::dequeue_internal ()
#3  0x000000000047f414 in KFS::MetaQueue::dequeue ()
#4  0x000000000047bdf3 in KFS::Logger::MainLoop ()
#5  0x000000000047bf79 in logger_main ()
#6  0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
#7  0x00000030dece439d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x415da950 (LWP 3839)):
#0  0x000000000046fe1a in std::_List_const_iterator >::operator++ ()
#1  0x000000000046fe4a in std::__distance > > ()
#2  0x000000000046fe93 in std::distance > > ()
#3  0x000000000046fec3 in std::list, std::allocator > >::size ()
#4  0x000000000048db0a in KFS::DiskManager::IOInitiated ()
#5  0x000000000048e0ee in KFS::DiskManager::Read ()
#6  0x000000000048aeca in KFS::DiskConnection::Read ()
#7  0x0000000000452a74 in KFS::ChunkManager::ReadChunk ()
#8  0x0000000000469d04 in KFS::ReadOp::HandleChunkMetaReadDone ()
#9  0x0000000000458378 in KFS::ObjectMethod::execute ()
#10 0x000000000045574e in KFS::KfsCallbackObj::HandleEvent ()
#11 0x0000000000452b9f in KFS::ChunkManager::ReadChunkMetadata ()
#12 0x0000000000467fe1 in KFS::ReadOp::Execute ()
#13 0x0000000000463ad3 in KFS::SubmitOp ()
#14 0x000000000046104e in KFS::ClientSM::HandleClientCmd ()
#15 0x000000000046163c in KFS::ClientSM::HandleRequest ()
#16 0x00000000004628e8 in KFS::ObjectMethod::execute ()
#17 0x000000000045574e in KFS::KfsCallbackObj::HandleEvent ()
#18 0x0000000000493a20 in KFS::NetConnection::HandleReadEvent ()
#19 0x00000000004949ca in KFS::NetManager::MainLoop ()
#20 0x000000000045f2c0 in netWorker ()
#21 0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
#22 0x00000030dece439d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x415e2950 (LWP 9108)):
#0  0x00000030df80b19d in pthread_cond_timedwait@@GLIBC_2.3.2 ()
#1  0x00000030e0003183 in handle_fildes_io () from /lib64/librt.so.1
#2  0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
#3  0x00000030dece439d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x415ea950 (LWP 9508)):
#0  0x00000030df80b19d in pthread_cond_timedwait@@GLIBC_2.3.2 ()
#1  0x00000030e0003183 in handle_fildes_io () from /lib64/librt.so.1
#2  0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
#3  0x00000030dece439d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fd83a463700 (LWP 3837)):
#0  0x00000030df807b75 in pthread_join () from /lib64/libpthread.so.0
#1  0x000000000045f87e in KFS::MetaThread::join ()
#2  0x000000000045f299 in KFS::ChunkServer::MainLoop ()
#3  0x000000000044e0b8 in main ()

Everything looks blocked except Thread 4, so I'm guessing it's the one going
berserk.

On Fri, Apr 17, 2009 at 10:22 AM, Sriram Rao wrote:
> Hi Andy,
>
> Thanks for the info. From what you are seeing, it looks like
> something is causing a memory bloat, causing too many IOs, and
> eventually everything grinds to a halt. Since it is easy to
> reproduce, I can give it a try on my machine. Can you give me the
> inputs to Heritrix and any config info for HBase?
>
> It'd also be good if you could get me the chunkserver logs. In the
> Chunkserver.prp file, can you add the following line:
>
> chunkServer.loglevel = DEBUG
>
> and send me the file/upload it? The easiest thing to do is to file a bug
> about this issue on the KFS SourceForge page and then upload the
> chunkserver logs.
>
> Definitely want to get to the bottom of this, since it is blocking you/Ryan.
>
> Sriram
>
> On Fri, Apr 17, 2009 at 4:16 AM, Andrew Purtell wrote:
> >
> > Hi Sriram,
> >
> >> Can you tell me the exact steps to repro the problem:
> >> - What version of HBase?
> >
> > SVN trunk, 0.20.0-dev
> >
> >> - Which version of Heritrix?
> >
> > Heritrix 2.0, plus the HBase writer, which can be found here:
> > http://code.google.com/p/hbase-writer/
> >
> >> What is happening is that the KFS chunkserver is sending
> >> writes down to disk and they aren't coming back "soon
> >> enough", causing things to backlog; the chunkserver is
> >> printing out the backlog status message.
> >
> > I wonder if this might be a secondary effect. Just before
> > these messages begin streaming into the log, the chunkserver
> > suddenly balloons its address space from ~200KB to ~100GB.
> > These two things are strongly correlated and happen in the
> > same order, repeatably.
> >
> > Once the backlog messages begin, no further IO completes as
> > far as I can tell. The count of outstanding IOs
> > monotonically increases. Also, the metaserver declares the
> > chunkserver dead.
> >
> > I can take steps to help diagnose the problem. Please advise.
> > Would it help if I replicated the problem again with
> > chunkserver logging at DEBUG and then posted the compressed
> > logs somewhere?
> >
> > [...]
> >
> >> On Thu, Apr 16, 2009 at 12:27 AM, Andrew Purtell wrote:
> >> >
> >> > Hi,
> >> >
> >> > Like Ryan, I have been trying to run HBase on top of
> >> > KFS. In my case I am running an SVN snapshot from
> >> > yesterday. I have a minimal installation of KFS
> >> > metaserver, chunkserver, and HBase master and
> >> > regionserver, all running on one test host with 4GB of
> >> > RAM. Of course I do not expect more than minimal
> >> > function. To apply some light load, I run the Heritrix
> >> > crawler with 5 TOE threads, which write on average
> >> > 200 Kbit/sec of data into HBase, which flushes this
> >> > incoming data in ~64MB increments and also runs
> >> > occasional compaction cycles where the 64MB flush
> >> > files will be compacted into ~256MB files.
> >> > I find that for no obvious reason the chunkserver will
> >> > suddenly grab ~100 GIGAbytes of address space and emit
> >> > a steady stream of "(DiskManager.cc:392) Too many disk
> >> > IOs (N)" to the log at INFO level, where N is a
> >> > steadily increasing number. The host is under moderate
> >> > load at the time -- KFS is busy -- but is not in swap
> >> > and, according to atop, has some disk I/O and network
> >> > bandwidth to spare.
> >
> > [...]
>
> _______________________________________________
> Kosmosfs-users mailing list
> Kosmosfs-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/kosmosfs-users