hbase-user mailing list archives

From Nanheng Wu <nanhen...@gmail.com>
Subject Re: What's the region server doing?
Date Wed, 02 Mar 2011 01:41:20 GMT
I just took stack traces of both the master and the .META. RS. The
master is still waiting on that thread which called "next", but no IPC
Server handler on the RS has that call. Is that possible? Or have I
just stared at this thing for too long?

On Tue, Mar 1, 2011 at 5:32 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> Yes, and on the other side (which is the region server that hosts
> .META.) you should be able to see that call. Well, not that specific
> one, but one of them :)
> J-D
> On Tue, Mar 1, 2011 at 5:30 PM, Nanheng Wu <nanhengwu@gmail.com> wrote:
>> You said "next". I don't know if this is related at all, but in the
>> master's thread dump the disable is blocked by the thread below,
>> and it's calling next:
>> Thread 27 (RegionManager.metaScanner):
>>  State: WAITING
>>  Blocked count: 69503
>>  Waited count: 68805
>>  Waiting on org.apache.hadoop.hbase.ipc.HBaseClient$Call@42fcac6
>>  Stack:
>>    java.lang.Object.wait(Native Method)
>>    java.lang.Object.wait(Object.java:485)
>>    org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:722)
>>    org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:333)
>>    $Proxy1.next(Unknown Source)
>>    org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:179)
>>    org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
>>    org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
>>    org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:153)
>>    org.apache.hadoop.hbase.Chore.run(Chore.java:68)
>> On Tue, Mar 1, 2011 at 5:22 PM, Nanheng Wu <nanhengwu@gmail.com> wrote:
>>> Thanks man, I'll try that and post back when I find something. BTW, I
>>> ran the script to set the memstore flush size on .META., and now I am
>>> seeing a lot less writing to HDFS from the .META. RS and less
>>> compaction. Unfortunately it's still slow. :(
>>> On Tue, Mar 1, 2011 at 5:15 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>>> In that specific jstack it's doing nothing at all, but keep in mind
>>>> that it's only a snapshot of a precise moment in time. Try jstack'ing
>>>> a few times and at some point you should see the threads named like
>>>> "IPC Server handler xx on 60020" (where xx is a number) showing bigger
>>>> stack traces with HRegionServer doing stuff like get, next, put, etc.
>>>> You should also try scanning '.META.' from the shell and if it's slow,
>>>> do the jstack'ing at the same time.
>>>> J-D
>>>> On Tue, Mar 1, 2011 at 5:07 PM, Nanheng Wu <nanhengwu@gmail.com> wrote:
>>>>> My cluster (10 nodes, hbase-0.20.6 + hadoop 0.20.2) is very, very slow
>>>>> for any operation like disabling or deleting a table. The master's
>>>>> thread dump says they are blocked by the metaScanner thread. When I
>>>>> looked at the log file on the .META. RS there was no output at all
>>>>> (at INFO log level). J-D has been helping me on this, and we pretty
>>>>> much figured out that RegionManager.metaScanner is the culprit,
>>>>> because it's taking around 25 minutes to scan 8K rows. What I don't
>>>>> get is what the region server is actually doing during this time.
>>>>> There are no requests at all on the cluster, and no RS splits either,
>>>>> because we just use an MR job to output HFiles and never write again.
>>>>> J-D has been really, really helpful, but I feel like I've taken too
>>>>> much of his time. Below is the thread dump of the .META. RS taken
>>>>> while the disable commands were blocked on the meta scanner. Can
>>>>> someone help me figure out what the server is doing? Is it running
>>>>> any threads at all?
>>>>> Thank you!
>>>>> http://pastebin.com/CZQAywq3
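
The repeated jstack sampling J-D describes in this thread could be scripted roughly as follows. This is only a sketch, not something posted by anyone in the thread. It assumes the standard `jstack <pid>` text output, where each thread starts with a quoted name line and threads are separated by blank lines. The function names `busy_handlers` and `sample_handlers`, the `marker` parameter, and the sample count and interval are all made up for illustration.

```python
import re
import subprocess
import time

# Matches the header line of an RPC handler thread in jstack output,
# e.g. "IPC Server handler 3 on 60020" daemon prio=10 ...
HANDLER_NAME = re.compile(r'^"(IPC Server handler \d+ on \d+)"')


def busy_handlers(dump_text, marker="HRegionServer."):
    """Return {thread name: [frames]} for IPC handler threads whose
    stack contains `marker` (hypothetical helper, assumed dump format)."""
    busy = {}
    # Threads in a jstack dump are separated by blank lines.
    for block in re.split(r"\n\s*\n", dump_text):
        lines = block.strip().splitlines()
        if not lines:
            continue
        match = HANDLER_NAME.match(lines[0])
        if not match:
            continue  # not an IPC Server handler thread
        # Stack frames are the lines that start with "at ".
        frames = [ln.strip()[3:] for ln in lines[1:]
                  if ln.strip().startswith("at ")]
        if any(marker in frame for frame in frames):
            busy[match.group(1)] = frames
    return busy


def sample_handlers(pid, times=5, interval=2.0, marker="HRegionServer."):
    """Run jstack against `pid` several times and print handlers that
    are inside region server code in each snapshot."""
    for i in range(times):
        dump = subprocess.run(["jstack", str(pid)], capture_output=True,
                              text=True, check=True).stdout
        busy = busy_handlers(dump, marker)
        print("sample %d: %d busy handler(s)" % (i + 1, len(busy)))
        for name, frames in busy.items():
            print("  %s -> %s" % (name, frames[0]))
        if i < times - 1:
            time.sleep(interval)
```

One would call `sample_handlers(<pid of the .META. region server>)` while a slow `scan '.META.'` is running in the shell. Passing `marker="HRegionServer.next"` narrows the output to handlers serving a scanner `next` call, which is the call the master's metaScanner thread is waiting on.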
