Date: Tue, 14 May 2013 18:13:16 +0000 (UTC)
From: "Enis Soztutar (JIRA)"
To: dev@hbase.apache.org
Reply-To: dev@hbase.apache.org
Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Subject: [jira] [Created] (HBASE-8547) Fix java.lang.RuntimeException: Cached an already cached block

Enis Soztutar created HBASE-8547:
------------------------------------

             Summary: Fix java.lang.RuntimeException: Cached an already cached block
                 Key: HBASE-8547
                 URL: https://issues.apache.org/jira/browse/HBASE-8547
             Project: HBase
          Issue Type: Bug
          Components: io, regionserver
            Reporter: Enis Soztutar
            Assignee: Enis Soztutar
             Fix For: 0.98.0, 0.94.8, 0.95.1

In one test, one of the region servers received the following on 0.94. Note HalfStoreFileReader in the stack trace. I think the root cause is that after the region is split, the mid point can be in the middle of the block (for store files that the mid point is not chosen from).
Each half store file tries to load the half block and put it in the block cache. Since IdLock is instantiated per store file reader, the two readers do not share the same IdLock instance and thus do not lock against each other effectively.

{code}
2013-05-12 01:30:37,733 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
java.lang.RuntimeException: Cached an already cached block
  at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:279)
  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:353)
  at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254)
  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:480)
  at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:501)
  at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:237)
  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226)
  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145)
  at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:351)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:354)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:312)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:277)
  at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:543)
  at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:411)
  at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:143)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3829)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3896)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3778)
  at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3770)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2643)
  at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:308)
  at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
{code}

I can see two possible fixes:
 # Allow this kind of rare case in LruBlockCache by not throwing an exception.
 # Move the lock instances to an upper layer (possibly in CacheConfig), and let half hfile readers share the same IdLock instance.
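
To make the second option concrete, here is a rough, self-contained sketch of the idea, not HBase code. All class names in it (StrictBlockCache, OffsetLocks, HalfReader, SplitReadDemo) are hypothetical. OffsetLocks is only a simplified stand-in for IdLock, and StrictBlockCache mimics just the "throw on double insert" behaviour seen in the trace. The assumption being illustrated is that both half readers are handed the same lock table, so the check-then-cache sequence for a given block offset is serialized across readers.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Toy cache that rejects a second insert for the same block, like the trace. */
final class StrictBlockCache {
    private final ConcurrentHashMap<Long, byte[]> blocks = new ConcurrentHashMap<>();

    byte[] get(long offset) {
        return blocks.get(offset);
    }

    void cacheBlock(long offset, byte[] block) {
        if (blocks.putIfAbsent(offset, block) != null) {
            throw new RuntimeException("Cached an already cached block");
        }
    }
}

/** Hypothetical per-offset lock table; a simplified stand-in for IdLock. */
final class OffsetLocks {
    private final ConcurrentHashMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();

    ReentrantLock lockFor(long offset) {
        // Locks are never evicted here; acceptable for a sketch only.
        return locks.computeIfAbsent(offset, k -> new ReentrantLock());
    }
}

/** Toy reader for one half of a split file; both halves see the same blocks. */
final class HalfReader {
    private final StrictBlockCache cache;
    private final OffsetLocks locks;      // shared between the two halves
    private final AtomicInteger diskReads;

    HalfReader(StrictBlockCache cache, OffsetLocks locks, AtomicInteger diskReads) {
        this.cache = cache;
        this.locks = locks;
        this.diskReads = diskReads;
    }

    byte[] readBlock(long offset) {
        ReentrantLock lock = locks.lockFor(offset);
        lock.lock();
        try {
            byte[] cached = cache.get(offset);
            if (cached != null) {
                return cached;            // the other half already loaded it
            }
            diskReads.incrementAndGet();  // pretend to read the block from disk
            byte[] block = new byte[] {(byte) offset};
            cache.cacheBlock(offset, block);
            return block;
        } finally {
            lock.unlock();
        }
    }
}

final class SplitReadDemo {
    /** Reads one block from both halves concurrently; returns the disk read count. */
    static int run() {
        StrictBlockCache cache = new StrictBlockCache();
        OffsetLocks shared = new OffsetLocks();
        AtomicInteger diskReads = new AtomicInteger();
        HalfReader bottom = new HalfReader(cache, shared, diskReads);
        HalfReader top = new HalfReader(cache, shared, diskReads);

        Thread a = new Thread(() -> bottom.readBlock(4096L));
        Thread b = new Thread(() -> top.readBlock(4096L));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return diskReads.get();
    }
}
```

If each HalfReader were instead given its own OffsetLocks, both threads could pass the cache.get check before either inserts, and the second cacheBlock call would throw as in the trace. The first option would correspond to having cacheBlock ignore the duplicate instead of throwing, at the cost of an occasional redundant disk read.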