Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 25E6A18F53 for ; Tue, 4 Aug 2015 10:23:05 +0000 (UTC)
Received: (qmail 11938 invoked by uid 500); 4 Aug 2015 10:23:05 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 11879 invoked by uid 500); 4 Aug 2015 10:23:05 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 11855 invoked by uid 99); 4 Aug 2015 10:23:04 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 04 Aug 2015 10:23:04 +0000
Date: Tue, 4 Aug 2015 10:23:04 +0000 (UTC)
From: "Duo Zhang (JIRA)"
To: issues@hbase.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653415#comment-14653415 ]

Duo Zhang commented on HBASE-14178:
-----------------------------------

Yes, the problem here is the lock, not when to read from the cache. So if we can make sure the block will not be put into the cache after we fetch it from HDFS, then we can bypass the locking step.
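To make the idea concrete, here is a minimal sketch of bypassing the per-offset lock when the block is not going to be cached. It is not the attached patch and not the real HFileReaderV2 code: the class OffsetLockSketch, its fields, the readFromDisk stand-in for the HDFS read, and the cacheBlock parameter are all hypothetical names used only for illustration, and the lock table is a plain map of ReentrantLocks rather than IdLock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: skip the per-offset lock when the block will not be cached.
class OffsetLockSketch {
    // One lock per block offset (assumption: a simplified stand-in for IdLock).
    private final ConcurrentHashMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();
    // Stand-in for the block cache, keyed by block offset.
    private final ConcurrentHashMap<Long, byte[]> cache = new ConcurrentHashMap<>();

    // Counters exposed only so the behaviour can be observed.
    public final AtomicInteger lockAcquisitions = new AtomicInteger();
    public final AtomicInteger diskReads = new AtomicInteger();

    // Stand-in for fetching the block from HDFS.
    private byte[] readFromDisk(long offset) {
        diskReads.incrementAndGet();
        return new byte[] {(byte) offset};
    }

    public byte[] readBlock(long offset, boolean cacheBlock) {
        if (!cacheBlock) {
            // The block will never be put into the cache, so there is nothing
            // for concurrent readers of the same offset to wait for.
            return readFromDisk(offset);
        }
        byte[] cached = cache.get(offset);
        if (cached != null) {
            return cached;
        }
        // Serialize readers of the same offset so only one of them loads the block.
        ReentrantLock lock = locks.computeIfAbsent(offset, k -> new ReentrantLock());
        lock.lock();
        lockAcquisitions.incrementAndGet();
        try {
            // Re-check: another thread may have cached the block while we waited.
            cached = cache.get(offset);
            if (cached != null) {
                return cached;
            }
            byte[] block = readFromDisk(offset);
            cache.put(offset, block);
            return block;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        OffsetLockSketch reader = new OffsetLockSketch();
        reader.readBlock(42L, false); // uncached read, no lock taken
        reader.readBlock(42L, true);  // cached read, lock taken once
        System.out.println("lock acquisitions: " + reader.lockAcquisitions.get());
        System.out.println("disk reads: " + reader.diskReads.get());
    }
}
```

The trade-off in this sketch is that concurrent uncached reads of the same offset each go to disk, instead of queuing behind one another on the lock.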
> regionserver blocks because of waiting for offsetLock
> -----------------------------------------------------
>
> Key: HBASE-14178
> URL: https://issues.apache.org/jira/browse/HBASE-14178
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.98.6
> Reporter: Heng Chen
> Priority: Critical
> Fix For: 0.98.6
>
> Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, jstack
>
>
> My regionserver blocks, and all client RPCs time out.
> I printed the regionserver's jstack; it seems a lot of threads were blocked waiting for offsetLock. Detailed information is below:
> PS: my table's block cache is off
> {code}
> "B.DefaultRpcServer.handler=2,queue=2,port=60020" #82 daemon prio=5 os_prio=0 tid=0x0000000001827000 nid=0x2cdc in Object.wait() [0x00007f3831b72000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:79)
>         - locked <0x0000000773af7c18> (a org.apache.hadoop.hbase.util.IdLock$Entry)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352)
>         at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:253)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:524)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:572)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:257)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:173)
>         at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:313)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:269)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:695)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:683)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:533)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:140)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3889)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3969)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3847)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3820)
>         - locked <0x00000005e5c55ad0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3807)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4779)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4753)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2916)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29583)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>         at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>         - <0x00000005e5c55c08> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)