hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: What's the region server doing?
Date Wed, 02 Mar 2011 01:48:17 GMT
next() is the call to get the next row from a scan.

Maybe you aren't looking at the right region server? If you'd like to
speed up this debugging session, feel free to drop by the #hbase
channel on freenode, then we could report the results on the mailing list.


On Tue, Mar 1, 2011 at 5:43 PM, Nanheng Wu <nanhengwu@gmail.com> wrote:
> And what's "next?" .... and what's next?
> On Tue, Mar 1, 2011 at 5:41 PM, Nanheng Wu <nanhengwu@gmail.com> wrote:
>> I just took the stack trace of both master and the meta RS. The
>> master's still waiting for that thread which called "next", but no IPC
>> Server handler on the RS has that call. Is that possible? Or have I
>> just stared at this thing for too long?
>> On Tue, Mar 1, 2011 at 5:32 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>> Yes, and on the other side (which is the region server that hosts
>>> .META.) you should be able to see that call. Well, not that specific
>>> one, but one of them :)
>>> J-D
>>> On Tue, Mar 1, 2011 at 5:30 PM, Nanheng Wu <nanhengwu@gmail.com> wrote:
>>>> You said "next". I don't know if this is related at all, but from the
>>>> master's thread dump, it says the disable is blocked by the thread
>>>> below, and it's calling next:
>>>> Thread 27 (RegionManager.metaScanner):
>>>>  State: WAITING
>>>>  Blocked count: 69503
>>>>  Waited count: 68805
>>>>  Waiting on org.apache.hadoop.hbase.ipc.HBaseClient$Call@42fcac6
>>>>  Stack:
>>>>    java.lang.Object.wait(Native Method)
>>>>    java.lang.Object.wait(Object.java:485)
>>>>    org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:722)
>>>>    org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:333)
>>>>    $Proxy1.next(Unknown Source)
>>>>    org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:179)
>>>>    org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
>>>>    org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
>>>>    org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:153)
>>>>    org.apache.hadoop.hbase.Chore.run(Chore.java:68)
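A dump in this layout can be checked mechanically for threads that are parked inside an RPC. The following is a minimal Python sketch, not anything shipped with HBase: the function name `pending_rpcs` is made up for illustration, and it assumes the "Thread N (name):" / "State:" / "Stack:" layout shown above, with the remote method visible as a `$ProxyN.method(` frame.

```python
import re

# "Thread 27 (RegionManager.metaScanner):" -> thread id and name
THREAD_LINE = re.compile(r'^Thread (\d+) \((.+)\):$')
# "$Proxy1.next(Unknown Source)" -> name of the remote method being invoked
PROXY_FRAME = re.compile(r'^\$Proxy\d+\.(\w+)\(')


def pending_rpcs(dump):
    """Return (thread name, state, RPC method) for threads inside a proxy call.

    Assumption: the dump uses the layout quoted in this thread. Leading
    mail-quote markers ('>') and indentation are stripped before matching.
    """
    found = []
    name, state = None, None
    for raw in dump.splitlines():
        line = raw.lstrip("> \t").rstrip()
        header = THREAD_LINE.match(line)
        if header:
            # A new thread section starts; forget the previous thread's state.
            name, state = header.group(2), None
            continue
        if line.startswith("State:"):
            state = line.split(":", 1)[1].strip()
            continue
        frame = PROXY_FRAME.match(line)
        if frame and name is not None:
            found.append((name, state, frame.group(1)))
    return found
```

Fed the dump above, this reports the metaScanner thread as WAITING inside a remote `next` call, which is the call to look for on the region server that hosts .META.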
>>>> On Tue, Mar 1, 2011 at 5:22 PM, Nanheng Wu <nanhengwu@gmail.com> wrote:
>>>>> Thanks man, I'll try that and post back when I find something. BTW, I
>>>>> ran the script to set the memstore flush size on .META., now I am
>>>>> seeing a lot less writing to HDFS from the .META. RS and less
>>>>> compaction, unfortunately it's still slow. :(
>>>>> On Tue, Mar 1, 2011 at 5:15 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>>>>> In that specific jstack it's doing nothing at all, but keep in mind
>>>>>> that it's only a snapshot of a precise moment in time. Try jstack'ing
>>>>>> a few times and at some point you should see the threads named like
>>>>>> "IPC Server handler xx on 60020" (where xx is a number) showing bigger
>>>>>> stack traces with HRegionServer doing stuff like get, next, put, etc.
>>>>>> You should also try scanning '.META.' from the shell and, if it's slow,
>>>>>> do the jstack'ing at the same time.
>>>>>> J-D
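The repeated-jstack advice above can be scripted. The following is a minimal Python sketch under stated assumptions: the helper names (`split_threads`, `busy_handlers`, `sample`) are invented for illustration, the parser expects the usual `jstack <pid>` text layout (thread name in double quotes on the header line, frames prefixed with `at`), and the handler name pattern uses the port 60020 mentioned above.

```python
import re
import subprocess
import time

# jstack starts each thread with its quoted name, e.g.
# "IPC Server handler 3 on 60020" daemon prio=10 tid=... nid=... runnable [...]
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')
HANDLER_NAME = re.compile(r'^IPC Server handler \d+ on 60020$')


def split_threads(dump):
    """Split jstack text into {thread name: [frames without the 'at ' prefix]}."""
    threads = {}
    current = None
    for line in dump.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            current = threads.setdefault(match.group("name"), [])
        elif current is not None and line.strip().startswith("at "):
            current.append(line.strip()[3:])
    return threads


def busy_handlers(dump):
    """Return handler threads whose stack contains an HRegionServer frame."""
    busy = {}
    for name, frames in split_threads(dump).items():
        if not HANDLER_NAME.match(name):
            continue
        calls = [frame for frame in frames if ".HRegionServer." in frame]
        if calls:
            busy[name] = calls
    return busy


def sample(pid, times=5, delay=2.0):
    """Run `jstack <pid>` several times and yield the busy handlers each time."""
    for _ in range(times):
        out = subprocess.run(["jstack", str(pid)], capture_output=True,
                             text=True, check=True).stdout
        yield busy_handlers(out)
        time.sleep(delay)
```

Running `sample(pid)` against the region server process while scanning '.META.' from the shell gives several snapshots instead of one, so a handler caught in get, next, or put is less likely to be missed.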
>>>>>> On Tue, Mar 1, 2011 at 5:07 PM, Nanheng Wu <nanhengwu@gmail.com> wrote:
>>>>>>> My cluster (10 nodes, hbase-0.20.6 + hadoop 0.20.2) is very very slow
>>>>>>> for any operation like disable table or delete. Master's thread dump
>>>>>>> says they are blocked by the metaScanner thread. When I looked at the
>>>>>>> log file on the .META. RS there are no outputs at all! (INFO debug
>>>>>>> level). J-D has been helping me on this, we pretty much figured out
>>>>>>> that RegionManager.metaScanner is the culprit, because it's taking
>>>>>>> around 25 minutes to scan 8K rows. What I don't get is what the
>>>>>>> server is actually doing during this time. There's no request at all
>>>>>>> on the cluster, no RS splits either because we just use a MR job to
>>>>>>> output HFiles and never write again.
>>>>>>> J-D has been really really helpful, but I feel like I took too much of
>>>>>>> his time. Below is the thread dump of the .META. RS during the time
>>>>>>> when disable commands are blocked on meta scanner, can someone help me
>>>>>>> figure out what the server is doing, is it running any thread at all?
>>>>>>> Thank you!
>>>>>>> http://pastebin.com/CZQAywq3
