hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Heng Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15900) RS stuck in get lock of HStore
Date Tue, 31 May 2016 08:43:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307450#comment-15307450
] 

Heng Chen commented on HBASE-15900:
-----------------------------------

We could see in 0d32a6bab354e6cc170cd59a2d485797.rs.log,  the region could not be closed due
to there is one compaction left
{code}
2016-05-31 14:18:18,414 INFO  [PriorityRpcServer.handler=19,queue=1,port=16020] regionserver.RSRpcServices:
Close 0d32a6bab354e6cc170cd59a2d485797, moving to dx-pipe-regionserver8-online,16020,1464675204532
2016-05-31 14:18:18,443 DEBUG [RS_CLOSE_REGION-dx-pipe-regionserver20-online:16020-1] handler.CloseRegionHandler:
Processing close of frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.
2016-05-31 14:18:18,443 DEBUG [RS_CLOSE_REGION-dx-pipe-regionserver20-online:16020-1] regionserver.HRegion:
Closing frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.:
disabling compactions & flushes
2016-05-31 14:18:18,443 DEBUG [RS_CLOSE_REGION-dx-pipe-regionserver20-online:16020-1] regionserver.HRegion:
waiting for 1 compactions to complete for region frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.
{code}

The left compaction is one  major compaction run in longCompaction thread,  the progress of
it reached 81.23%, then it has no response after that.
{code}
2016-05-31 05:28:29,476 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
regionserver.HStore: 0d32a6bab354e6cc170cd59a2d485797 - info: Initiating major compaction
(all files)
2016-05-31 05:28:29,476 INFO  [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
regionserver.HRegion: Starting compaction on info in region frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.
2016-05-31 05:28:29,476 INFO  [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
regionserver.HStore: Starting compaction of 9 file(s) in info of frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.
into tmpdir=hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/.tmp,
totalSize=5.7 G
2016-05-31 05:28:29,476 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compacting hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/c31e84594c444d3cb8de00029e24aa34,
keycount=78408609, bloomtype=ROW, size=4.9 G, encoding=NONE, seqNum=77065948, earliestPutTs=1453899210588
2016-05-31 05:28:29,476 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compacting hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/d3310d547e494e7db2b2e952e5fb412e,
keycount=6356215, bloomtype=ROW, size=428.0 M, encoding=NONE, seqNum=78270525, earliestPutTs=1463613139775
2016-05-31 05:28:29,477 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compacting hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/8b11d48ac6a241f1b3ace1a822f087b0,
keycount=1850675, bloomtype=ROW, size=124.6 M, encoding=NONE, seqNum=78624661, earliestPutTs=1464048401491
2016-05-31 05:28:29,477 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compacting hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/514ea98b95eb4b5fa4e122ebdd32d004,
keycount=443135, bloomtype=ROW, size=29.9 M, encoding=NONE, seqNum=78710458, earliestPutTs=1464306574727
2016-05-31 05:28:29,478 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compacting hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/bf4877dbdf804492878750c30f3cfd8e,
keycount=780345, bloomtype=ROW, size=52.6 M, encoding=NONE, seqNum=78852324, earliestPutTs=1464307875726
2016-05-31 05:28:29,478 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compacting hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/79ae0770fb58484e866c7b48f84e16a4,
keycount=357040, bloomtype=ROW, size=24.1 M, encoding=NONE, seqNum=78916832, earliestPutTs=1464391849632
2016-05-31 05:28:29,478 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compacting hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/e7708a72734448c5a86305a0224af683,
keycount=1725115, bloomtype=ROW, size=116.1 M, encoding=NONE, seqNum=79190322, earliestPutTs=1464393070342
2016-05-31 05:28:29,479 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compacting hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/1dcd54598541492bb20c279716488b5a,
keycount=869015, bloomtype=ROW, size=58.6 M, encoding=NONE, seqNum=79334888, earliestPutTs=1464588026997
2016-05-31 05:28:29,479 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compacting hdfs://f04/hbase/data/default/frog_stastic/0d32a6bab354e6cc170cd59a2d485797/info/0a76eb5184dd4d42bdd16a588d295d0e,
keycount=53330, bloomtype=ROW, size=3.7 M, encoding=NONE, seqNum=79346806, earliestPutTs=1464591201410
2016-05-31 05:29:29,684 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compaction progress: frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
14047066/90843479 (15.46%), rate=14910.65 kB/sec, throughputController is NoLimitCompactionThroughputController
2016-05-31 05:30:29,684 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compaction progress: frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
23487276/90843479 (25.85%), rate=10021.16 kB/sec, throughputController is NoLimitCompactionThroughputController
2016-05-31 05:31:29,684 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compaction progress: frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
39884696/90843479 (43.90%), rate=17406.01 kB/sec, throughputController is NoLimitCompactionThroughputController
2016-05-31 05:32:29,686 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compaction progress: frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
45639621/90843479 (50.24%), rate=6103.74 kB/sec, throughputController is NoLimitCompactionThroughputController
2016-05-31 05:33:30,012 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compaction progress: frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
61288969/90843479 (67.47%), rate=16531.36 kB/sec, throughputController is NoLimitCompactionThroughputController
2016-05-31 05:34:30,225 DEBUG [regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827]
compactions.Compactor: Compaction progress: frog_stastic,$+\xD6\x05_{211}_1454335203,1462760599763.0d32a6bab354e6cc170cd59a2d485797.#info
73794589/90843479 (81.23%), rate=13231.85 kB/sec, throughputController is NoLimitCompactionThroughputController
{code}

And in jstack,  we could see the longCompaction thread is in RUNNABLE state, but it seems
stuck.
{code}
"regionserver/dx-pipe-regionserver20-online/10.11.52.80:16020-longCompactions-1464599181827"
#727 daemon prio=5 os_prio=0 tid=0x00007f5324412000 nid=0x3796 runnable [0x00007f52fc5a8000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
	at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
	at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
	- locked <0x00000006b5900d70> (a sun.nio.ch.Util$2)
	- locked <0x00000006b58fa2a8> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000006b58d4400> (a sun.nio.ch.EPollSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
	at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.readChannelFully(PacketReceiver.java:258)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:209)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:171)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:102)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:186)
	at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:146)
	- locked <0x00000006b597e8f8> (a org.apache.hadoop.hdfs.RemoteBlockReader2)
	at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:686)
	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:742)
	- eliminated <0x000000062d8aeed8> (a org.apache.hadoop.hdfs.DFSInputStream)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:799)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:840)
	- locked <0x000000062d8aeed8> (a org.apache.hadoop.hdfs.DFSInputStream)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:678)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1372)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1591)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470)
	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437)
	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:708)
	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.isNextBlock(HFileReaderV2.java:833)
	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.positionForNextBlock(HFileReaderV2.java:828)
	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2._next(HFileReaderV2.java:845)
	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:865)
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:139)
	at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:596)
	at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:255)
	at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:106)
	at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:112)
	at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1212)
	at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1798)
	at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:519)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
	- <0x000000062afa5528> (a java.util.concurrent.ThreadPoolExecutor$Worker)
	- <0x000000062f7ee850> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	- <0x00000006b5985ec8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
{code}





> RS stuck in get lock of HStore
> ------------------------------
>
>                 Key: HBASE-15900
>                 URL: https://issues.apache.org/jira/browse/HBASE-15900
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.1.1, 1.3.0
>            Reporter: Heng Chen
>         Attachments: 0d32a6bab354e6cc170cd59a2d485797.jstack.txt, 0d32a6bab354e6cc170cd59a2d485797.rs.log,
9fe15a52_9fe15a52_save, c91324eb_81194e359707acadee2906ffe36ab130.log, dump.txt
>
>
> It happens on my production cluster when i run MR job.  I save the dump.txt from this
RS webUI.
> Many threads stuck here:
> {code}
> Thread 133 (B.defaultRpcServer.handler=94,queue=4,port=16020):
>    32   State: WAITING
>    31   Blocked count: 477816
>    30   Waited count: 535255
>    29   Waiting on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@6447ba67
>    28   Stack:
>    27     sun.misc.Unsafe.park(Native Method)
>    26     java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>    25     java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>    24     java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
>    23     java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
>    22     java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
>    21     org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:666)
>    20     org.apache.hadoop.hbase.regionserver.HRegion.applyFamilyMapToMemstore(HRegion.java:3621)
>    19     org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3038)
>    18     org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2793)
>    17     org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2735)
>    16     org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:692)
>    15     org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:654)
>    14     org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2029)
>    13     org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213)
>    12     org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2112)
>    11     org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
>    10     org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>     9     org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>     8     java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message