hbase-user mailing list archives

From Randy Fox <r...@connexity.com>
Subject Re: Slow region moves
Date Thu, 15 Oct 2015 18:11:15 GMT

"StoreFileCloserThread-L-1" prio=10 tid=0x00000000027ec800 nid=0xad84 runnable [0x00007fbcc0c65000]
   java.lang.Thread.State: RUNNABLE
        at java.util.LinkedList.indexOf(LinkedList.java:602)
        at java.util.LinkedList.contains(LinkedList.java:315)
        at org.apache.hadoop.hbase.io.hfile.bucket.BucketAllocator$BucketSizeInfo.freeBlock(BucketAllocator.java:247)
        at org.apache.hadoop.hbase.io.hfile.bucket.BucketAllocator.freeBlock(BucketAllocator.java:449)
        - locked <0x000000041b0887a8> (a org.apache.hadoop.hbase.io.hfile.bucket.BucketAllocator)
        at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.evictBlock(BucketCache.java:459)
        at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.evictBlocksByHfileName(BucketCache.java:1036)
        at org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.evictBlocksByHfileName(CombinedBlockCache.java:90)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.close(HFileReaderV2.java:516)
        at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.close(StoreFile.java:1143)
        at org.apache.hadoop.hbase.regionserver.StoreFile.closeReader(StoreFile.java:503)
        - locked <0x00000004944ff2d8> (a org.apache.hadoop.hbase.regionserver.StoreFile)
        at org.apache.hadoop.hbase.regionserver.HStore$2.call(HStore.java:873)
        at org.apache.hadoop.hbase.regionserver.HStore$2.call(HStore.java:870)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"StoreCloserThread-Wildfire_graph3,\x00\x04lK\x1B\xFC\x10\xD2,1402949830657.afb6a1720d936a83d73022aeb9ddbb6c.-1" prio=10 tid=0x0000000003508800 nid=0xad83 waiting on condition [0x00007fbcc5dcc000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000534e90a80> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:193)
        at org.apache.hadoop.hbase.regionserver.HStore.close(HStore.java:883)
        at org.apache.hadoop.hbase.regionserver.HStore.close(HStore.java:126)
        at org.apache.hadoop.hbase.regionserver.HRegion$2.call(HRegion.java:1378)
        at org.apache.hadoop.hbase.regionserver.HRegion$2.call(HRegion.java:1375)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)


"RS_CLOSE_REGION-hb20:60020-0" prio=10 tid=0x00007fcec0142000 nid=0x3056 waiting on condition [0x00007fbcc2d87000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000534e61360> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:193)
        at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1385)
        at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1280)
        - locked <0x000000042230fa68> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:138)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)


I attached the full thread dump as well.
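
The first trace shows BucketSizeInfo.freeBlock calling LinkedList.contains while the BucketAllocator lock is held. LinkedList.contains is a linear scan, so if it runs once per evicted block against a long list, the total work grows roughly quadratically. The sketch below is not HBase code: EvictionCostSketch, Bucket, and lookupCost are hypothetical names used only to illustrate that cost pattern by counting equals() calls. Whether this dominates the close time on this cluster is not established by the trace alone.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;

public class EvictionCostSketch {

    /** Hypothetical stand-in for a cache bucket; counts equals() calls made by lookups. */
    static final class Bucket {
        static long equalsCalls = 0;
        final int id;

        Bucket(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Bucket && ((Bucket) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    /**
     * Fills the collection with n buckets, then looks each one up once with
     * contains() and returns how many equals() calls those lookups needed.
     */
    static long lookupCost(Collection<Bucket> buckets, int n) {
        Bucket[] all = new Bucket[n];
        for (int i = 0; i < n; i++) {
            all[i] = new Bucket(i);
            buckets.add(all[i]);
        }
        Bucket.equalsCalls = 0; // count only the lookups, not the inserts
        for (Bucket b : all) {
            if (!buckets.contains(b)) {
                throw new IllegalStateException("missing bucket " + b.id);
            }
        }
        return Bucket.equalsCalls;
    }

    public static void main(String[] args) {
        int n = 2000;
        System.out.println("LinkedList equals() calls: " + lookupCost(new LinkedList<>(), n));
        System.out.println("HashSet equals() calls:    " + lookupCost(new HashSet<>(), n));
    }
}
```

With a LinkedList, looking up the element at position i costs i + 1 comparisons, so n lookups cost n(n + 1)/2 in total. A hash-based collection avoids the scan, which is why the count stays small in the second case.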

-r


On 10/15/15, 10:39 AM, "Ted Yu" <yuzhihong@gmail.com> wrote:

>Can you give a bit more detail on why block eviction was the cause of the slow region movement?
>
>Did you happen to take stack traces?
>
>Thanks
>
>> On Oct 15, 2015, at 10:32 AM, Randy Fox <rfox@connexity.com> wrote:
>> 
>> Hi,
>> 
>> We just upgraded from 0.94 to 1.0.0 and have noticed that region moves are super slow (on the order of minutes), whereas previously they were in the seconds range.  After looking at the code, I think the time is spent waiting for the blocks to be evicted from the block cache.
>> 
>> I wanted to verify that this theory is correct and see if there is anything that can be done to speed up the moves.
>> 
>> This is particularly painful as we are trying to get our configs tuned to the new SW and need to do rolling restarts, which are taking almost 24 hours on our cluster.  We also do our own manual rebalancing of regions across RSs, and that task is also now painful.
>> 
>> 
>> Thanks,
>> 
>> Randy