Subject: Re: [Kosmosfs-users] too many disk IOs / chunkserver suddenly consumes ~100GB of address space
From: Sriram Rao
To: Joshua Taylor
Cc: apurtell@apache.org, hbase-dev@hadoop.apache.org, kosmosfs-users@lists.sourceforge.net
Date: Thu, 23 Apr 2009 12:44:56 -0700

I have checked in a fix to the chunkserver to fail the read when it
sees bogus values (revision 316).  That was a bug in the code :-(

Would still be good to know what caused the client to send bogus
values for the size...

Sriram

On Thu, Apr 23, 2009 at 12:32 PM, Sriram Rao wrote:
> Josh,
>
> Thanks.  Josh Adams had also seen the same bloat and from his logs, I
> had figured that the chunkserver was in a loop trying to read data and
> we couldn't quite figure out what was causing it.  This looks like the
> smoking gun :-)
>
> What was the test doing?  I am still curious why the client requested
> a bizarre amount of data.  I'll put in the fixes to fail the read.
>
> Sriram
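The fix described above boils down to rejecting a read whose requested
size can never be valid for a chunk.  A minimal sketch of that kind of
sanity check follows; it is illustrative only, not the actual revision
316 change, and the function name and the 64 MB chunk-size constant are
assumptions rather than code from the KFS ReadOp/ChunkManager path:

    // Sketch only; names and the assumed 64 MB chunk size are not from
    // the KFS sources.
    static const long long kChunkSize = 64LL << 20;  // assumed chunk size

    // Reject offset/length pairs that can never describe a valid chunk
    // read (e.g. numBytes = 9223372036854775807, i.e. 2^63 - 1, as in the
    // log below), so the server fails the request instead of trying to
    // service it.
    static bool IsReadRequestSane(long long offset, long long numBytes)
    {
        if (offset < 0 || offset >= kChunkSize)
            return false;                        // offset outside the chunk
        if (numBytes <= 0 || numBytes > kChunkSize - offset)
            return false;                        // length runs past the chunk end
        return true;                             // plausible request
    }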
> On Thu, Apr 23, 2009 at 12:17 PM, Joshua Taylor wrote:
>> I have a bit more info about this problem.  It just happened to one of my
>> chunkservers as well.  I had replaced all log4cpp calls with fprintf to
>> avoid random log4cpp crashes (separate problem), so the chunkserver is
>> already doing debug logging.  Here's the log from the beginning of the
>> bloat/overload:
>>
>> 04-23-2009 05:42:26.625 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 32636928 numBytes: 131072
>> 04-23-2009 05:42:26.630 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 32768000 numBytes: 131072
>> 04-23-2009 05:42:26.633 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 32899072 numBytes: 131072
>> 04-23-2009 05:42:26.636 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 33030144 numBytes: 131072
>> 04-23-2009 05:42:26.638 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 33161216 numBytes: 131072
>> 04-23-2009 05:42:26.641 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 33292288 numBytes: 131072
>> 04-23-2009 05:42:26.648 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 49677 chunkversion = 1 offset: 16276417 numBytes: 9223372036854775807
>> 04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5001) outstanding...overloaded
>> 04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5002) outstanding...overloaded
>> 04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5003) outstanding...overloaded
>> 04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5004) outstanding...overloaded
>> 04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5005) outstanding...overloaded
>> 04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5006) outstanding...overloaded
>> 04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5007) outstanding...overloaded
>> 04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5008) outstanding...overloaded
>> 04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5009) outstanding...overloaded
>> 04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5010) outstanding...overloaded
>> 04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5011) outstanding...overloaded
>> 04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5012) outstanding...overloaded
>> 04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5013) outstanding...overloaded
>> 04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5014) outstanding...overloaded
>> 04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5015) outstanding...overloaded
>> 04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5016) outstanding...overloaded
>>
>> The line where numBytes == 2^63 - 1 is rather suspicious.  I'm not running
>> Heritrix/Hbase/etc.  This is just a few file copies.  I have 8 chunkservers
>> total.  Only one is having this problem.  The other chunkservers are still
>> logging that they're making progress (reading and writing chunks).
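To put that suspicious value in perspective: if the server tried to
service that request in pieces of the 131072-byte size used by the other
reads in the log (an assumption about the read path, not something the
log states), it would need on the order of 7e13 of them, which fits an
outstanding-IO count that climbs without bound.  A quick check of the
arithmetic:

    #include <cstdio>

    int main()
    {
        // Values taken from the log excerpt above.
        const unsigned long long numBytes  = 9223372036854775807ULL; // 2^63 - 1
        const unsigned long long pieceSize = 131072ULL;              // 128 KB reads
        std::printf("reads needed: %llu\n", numBytes / pieceSize);
        // Prints 70368744177663, roughly 7e13 read requests.
        return 0;
    }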
>>
>> Here's a stack trace from the server currently pegged at 100% CPU, 200 GB
>> VmSize:
>>
>> Thread 5 (Thread 0x40bd9950 (LWP 3838)):
>> #0  0x00000030df80af19 in pthread_cond_wait@@GLIBC_2.3.2 ()
>> #1  0x000000000047f2d7 in KFS::MetaThread::sleep ()
>> #2  0x000000000047f317 in KFS::MetaQueue::dequeue_internal ()
>> #3  0x000000000047f414 in KFS::MetaQueue::dequeue ()
>> #4  0x000000000047bdf3 in KFS::Logger::MainLoop ()
>> #5  0x000000000047bf79 in logger_main ()
>> #6  0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
>> #7  0x00000030dece439d in clone () from /lib64/libc.so.6
>> Thread 4 (Thread 0x415da950 (LWP 3839)):
>> #0  0x000000000046fe1a in std::_List_const_iterator >::operator++ ()
>> #1  0x000000000046fe4a in std::__distance > > ()
>> #2  0x000000000046fe93 in std::distance > > ()
>> #3  0x000000000046fec3 in std::list, std::allocator > >::size ()
>> #4  0x000000000048db0a in KFS::DiskManager::IOInitiated ()
>> #5  0x000000000048e0ee in KFS::DiskManager::Read ()
>> #6  0x000000000048aeca in KFS::DiskConnection::Read ()
>> #7  0x0000000000452a74 in KFS::ChunkManager::ReadChunk ()
>> #8  0x0000000000469d04 in KFS::ReadOp::HandleChunkMetaReadDone ()
>> #9  0x0000000000458378 in KFS::ObjectMethod::execute ()
>> #10 0x000000000045574e in KFS::KfsCallbackObj::HandleEvent ()
>> #11 0x0000000000452b9f in KFS::ChunkManager::ReadChunkMetadata ()
>> #12 0x0000000000467fe1 in KFS::ReadOp::Execute ()
>> #13 0x0000000000463ad3 in KFS::SubmitOp ()
>> #14 0x000000000046104e in KFS::ClientSM::HandleClientCmd ()
>> #15 0x000000000046163c in KFS::ClientSM::HandleRequest ()
>> #16 0x00000000004628e8 in KFS::ObjectMethod::execute ()
>> #17 0x000000000045574e in KFS::KfsCallbackObj::HandleEvent ()
>> #18 0x0000000000493a20 in KFS::NetConnection::HandleReadEvent ()
>> #19 0x00000000004949ca in KFS::NetManager::MainLoop ()
>> #20 0x000000000045f2c0 in netWorker ()
>> #21 0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
>> #22 0x00000030dece439d in clone () from /lib64/libc.so.6
>> Thread 3 (Thread 0x415e2950 (LWP 9108)):
>> #0  0x00000030df80b19d in pthread_cond_timedwait@@GLIBC_2.3.2 ()
>> #1  0x00000030e0003183 in handle_fildes_io () from /lib64/librt.so.1
>> #2  0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
>> #3  0x00000030dece439d in clone () from /lib64/libc.so.6
>> Thread 2 (Thread 0x415ea950 (LWP 9508)):
>> #0  0x00000030df80b19d in pthread_cond_timedwait@@GLIBC_2.3.2 ()
>> #1  0x00000030e0003183 in handle_fildes_io () from /lib64/librt.so.1
>> #2  0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
>> #3  0x00000030dece439d in clone () from /lib64/libc.so.6
>> Thread 1 (Thread 0x7fd83a463700 (LWP 3837)):
>> #0  0x00000030df807b75 in pthread_join () from /lib64/libpthread.so.0
>> #1  0x000000000045f87e in KFS::MetaThread::join ()
>> #2  0x000000000045f299 in KFS::ChunkServer::MainLoop ()
>> #3  0x000000000044e0b8 in main ()
>>
>> Everything looks blocked except Thread 4, so I'm guessing it's the one
>> going berserk.
>>
>>
>> On Fri, Apr 17, 2009 at 10:22 AM, Sriram Rao wrote:
>>>
>>> Hi Andy,
>>>
>>> Thanks for the info.  From what you are seeing, it looks like
>>> something is causing a memory bloat, causing too many IOs and
>>> eventually everything grinds to a halt.  Since it is easy to
>>> reproduce, I can give it a try on my machine.  Can you give me the
>>> inputs to Heritrix and any config info to Hbase?
>>>
>>> It'd also be good if you can get me the chunkserver logs.  In the
>>> Chunkserver.prp file, can you add the following line:
>>> chunkServer.loglevel = DEBUG
>>>
>>> and send me the file/upload it?  The easier thing to do is, file a bug
>>> about this issue on the KFS sourceforge page and then upload the
>>> chunkserver logs.
>>>
>>> Def. want to get to the bottom of this since this is blocking you/Ryan.
>>>
>>> Sriram
>>>
>>> On Fri, Apr 17, 2009 at 4:16 AM, Andrew Purtell wrote:
>>> >
>>> > Hi Sriram,
>>> >
>>> >> Can you tell me the exact steps to repro the problem:
>>> >>  - What version of Hbase?
>>> >
>>> > SVN trunk, 0.20.0-dev
>>> >
>>> >>  - Which version of Heritrix?
>>> >
>>> > Heritrix 2.0, plus the HBase writer which can be found here:
>>> > http://code.google.com/p/hbase-writer/
>>> >
>>> >> What is happening is that the KFS chunkserver is sending
>>> >> writes down to disk and they aren't coming back "soon
>>> >> enough", causing things to backlog; the chunkserver is
>>> >> printing out the backlog status message.
>>> >
>>> > I wonder if this might be a secondary effect. Just before
>>> > these messages begin streaming into the log, the chunkserver
>>> > suddenly balloons its address space from ~200KB to ~100GB.
>>> > These two things are strongly correlated and happen in the
>>> > same order in a repeatable manner.
>>> >
>>> > Once the backlog messages begin, no further IO completes as
>>> > far as I can tell. The count of outstanding IOs
>>> > monotonically increases. Also, the metaserver declares the
>>> > chunkserver dead.
>>> >
>>> > I can take steps to help diagnose the problem. Please advise.
>>> > Would it help if I replicate the problem again with
>>> > chunkserver logging at DEBUG and then post the compressed
>>> > logs somewhere?
>>> >
>>> > [...]
>>> >> On Thu, Apr 16, 2009 at 12:27 AM, Andrew Purtell wrote:
>>> >> >
>>> >> > Hi,
>>> >> >
>>> >> > Like Ryan I have been trying to run HBase on top of
>>> >> > KFS. In my case I am running a SVN snapshot from
>>> >> > yesterday. I have a minimal installation of KFS
>>> >> > metaserver, chunkserver, and HBase master and
>>> >> > regionserver all running on one test host with 4GB of
>>> >> > RAM. Of course I do not expect more than minimal
>>> >> > function. To apply some light load, I run the Heritrix
>>> >> > crawler with 5 TOE threads which write on average
>>> >> > 200 Kbit/sec of data into HBase, which flushes this
>>> >> > incoming data in ~64MB increments and also runs
>>> >> > occasional compaction cycles where the 64MB flush
>>> >> > files will be compacted into ~256MB files.
>>> >> >
>>> >> > I find that for no obvious reason the chunkserver will
>>> >> > suddenly grab ~100 GIGAbytes of address space and emit
>>> >> > a steady stream of "(DiskManager.cc:392) Too many disk
>>> >> > IOs (N)" to the log at INFO level, where N is a
>>> >> > steadily increasing number. The host is under moderate
>>> >> > load at the time -- KFS is busy -- but is not in swap
>>> >> > and according to atop has some disk I/O and network
>>> >> > bandwidth to spare.
>>> > [...]
>>>
>>> _______________________________________________
>>> Kosmosfs-users mailing list
>>> Kosmosfs-users@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/kosmosfs-users