hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
Date Thu, 06 Aug 2015 05:41:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659524#comment-14659524
] 

Duo Zhang commented on HBASE-14178:
-----------------------------------

Yes, we doing more round of check with lock because maybe another thread has already cache
the block for us. Things happen here is we disable BC for the given family, so it is impossible
that another thread will do the work for us, so we just read from HDFS and bypass the second
checking BC round.

And as I mentioned above, there are lots of configurations for BC, and {{family.isBlockCacheEnabled()}}
is treated as {{cacheDataOnRead}} (You can see the code pasted by [~chenheng], maybe it is
a mistake but it is not important for this issue I think, we could open another issue for
it). So the safe way to determine if we need the second 'read BC with lock' round is to check
if we will put the block back to BC after we read it from HDFS. This is why we introduce a
{{shouldLockOnCacheMiss}} method here. Maybe we cound change the name to {{shouldReadAgainWithLockOnCacheMiss}}?

Thanks.

> regionserver blocks because of waiting for offsetLock
> -----------------------------------------------------
>
>                 Key: HBASE-14178
>                 URL: https://issues.apache.org/jira/browse/HBASE-14178
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.98.6
>            Reporter: Heng Chen
>            Priority: Critical
>             Fix For: 0.98.6
>
>         Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch,
HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, HBASE-14178_v5.patch, HBASE-14178_v6.patch,
jstack
>
>
> My regionserver blocks, and all client rpc timeout. 
> I print the regionserver's jstack,  it seems a lot of threads were blocked for waiting
offsetLock, detail infomation belows:
> PS:  my table's block cache is off
> {code}
> "B.DefaultRpcServer.handler=2,queue=2,port=60020" #82 daemon prio=5 os_prio=0 tid=0x0000000001827000
nid=0x2cdc in Object.wait() [0x00007f3831b72000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:79)
>         - locked <0x0000000773af7c18> (a org.apache.hadoop.hbase.util.IdLock$Entry)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352)
>         at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:253)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:524)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:572)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:257)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:173)
>         at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:313)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:269)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:695)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:683)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:533)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:140)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3889)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3969)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3847)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3820)
>         - locked <0x00000005e5c55ad0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3807)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4779)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4753)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2916)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29583)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>         at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>         - <0x00000005e5c55c08> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message