Date: Wed, 15 May 2013 14:11:19 +0000 (UTC)
From: "Hudson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-8547) Fix java.lang.RuntimeException: Cached an already cached block

    [ https://issues.apache.org/jira/browse/HBASE-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658384#comment-13658384 ]

Hudson commented on HBASE-8547:
-------------------------------

Integrated in hbase-0.95-on-hadoop2 #100 (See [https://builds.apache.org/job/hbase-0.95-on-hadoop2/100/])
HBASE-8547. Fix java.lang.RuntimeException: Cached an already cached block (Addendum to add better log) (Revision 1482706)
HBASE-8547 Fix java.lang.RuntimeException: Cached an already cached block (Revision 1482698)

     Result = FAILURE
enis :
Files :
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java

enis :
Files :
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestIdLock.java


> Fix java.lang.RuntimeException: Cached an already cached block
> --------------------------------------------------------------
>
>                 Key: HBASE-8547
>                 URL: https://issues.apache.org/jira/browse/HBASE-8547
>             Project: HBase
>          Issue Type: Bug
>          Components: io, regionserver
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 0.98.0, 0.94.8, 0.95.1
>
>         Attachments: hbase-8547_v1-0.94.patch, hbase-8547_v1-0.94.patch, hbase-8547_v1.patch, hbase-8547_v2-0.94-reduced.patch, hbase-8547_v2-trunk.patch
>
>
> In one test, one of the region servers received the following on 0.94.
> Note HalfStoreFileReader in the stack trace. I think the root cause is that after the region is split, the mid point can be in the middle of the block (for store files that the mid point is not chosen from). Each half store file tries to load the half block and put it in the block cache.
> Since IdLock is instantiated per store file reader, the two readers do not share the same IdLock instance, and thus do not lock against each other effectively.
> {code}
> 2013-05-12 01:30:37,733 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> java.lang.RuntimeException: Cached an already cached block
>   at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:279)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:353)
>   at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:480)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:501)
>   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:237)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:351)
>   at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:354)
>   at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:312)
>   at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:277)
>   at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:543)
>   at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:411)
>   at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:143)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3829)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3896)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3778)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3770)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2643)
>   at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:308)
>   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> {code}
> I can see two possible fixes:
>  # Allow this kind of rare case in LruBlockCache by not throwing an exception.
>  # Move the lock instances to an upper layer (possibly in CacheConfig), and let half hfile readers share the same IdLock implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
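
As a rough illustration of the first fix proposed in the issue (tolerating the duplicate insert instead of throwing), here is a minimal Java sketch. It is not the LruBlockCache code and not the committed patch: the class TolerantBlockCacheSketch, its cacheBlock signature, and the string cache key are hypothetical stand-ins. The sketch assumes that two readers racing on the same block produce byte-identical data, so a duplicate with equal contents is treated as harmless, while a duplicate with different contents is still reported as an error.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a block cache; NOT the real LruBlockCache. */
final class TolerantBlockCacheSketch {
    // Hypothetical key type: the real cache uses its own key class, not a String.
    private final ConcurrentHashMap<String, byte[]> blocks = new ConcurrentHashMap<>();

    /**
     * Caches a block.
     * Returns true if this call inserted the block, false if an identical
     * block was already present (the benign race described in the issue).
     * Throws only if the key is already mapped to different contents.
     */
    boolean cacheBlock(String key, byte[] data) {
        // putIfAbsent is atomic, so two racing readers cannot both insert.
        byte[] existing = blocks.putIfAbsent(key, data);
        if (existing == null) {
            return true;
        }
        if (!Arrays.equals(existing, data)) {
            throw new IllegalStateException("Cached block contents differ for key " + key);
        }
        return false; // same block cached twice: tolerate instead of throwing
    }

    int size() {
        return blocks.size();
    }

    public static void main(String[] args) {
        TolerantBlockCacheSketch cache = new TolerantBlockCacheSketch();
        byte[] block = {1, 2, 3};
        // Two half store file readers load the same underlying block
        // and both try to cache it under the same key.
        System.out.println(cache.cacheBlock("storefile@offset0", block));
        System.out.println(cache.cacheBlock("storefile@offset0", block.clone()));
    }
}
```

The second proposed fix (sharing one IdLock between the half readers) would instead stop the second load from happening at all. The sketch above only shows why accepting an identical duplicate is safe for callers.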