Message-ID: <1367468191.1245166867444.JavaMail.jira@brutus>
Date: Tue, 16 Jun 2009 08:41:07 -0700 (PDT)
From: "Jonathan Gray (JIRA)"
To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-1207) Fix locking in memcache flush
In-Reply-To: <581178188.1235105042843.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/HBASE-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720194#action_12720194 ]

Jonathan Gray commented on HBASE-1207:
--------------------------------------

I have started on a patch to do what you're describing: reusing the open scanners and essentially diffing what has changed rather than rebuilding. It's certainly possible, but it requires a number of changes, from the ChangedReadersObserver interface to additions in KeyValueScanner. The actual modification of the heap should probably happen in the KeyValueHeap. The set of parameters passed through the changed-readers path to the KVHeap is (List deleted, StoreFile added), where added always exists but the deleted list is sometimes null.

I'm happy to finish the patch, but I wanted to know whether you still think it's a good idea given the complexity. It would be far easier to just rebuild the heap, and that wouldn't require changing much.

Another approach might be to do an actual "diff", comparing the current storefiles against what's in the heap. That would avoid changing the ChangedReadersObserver interface, but we would basically be recalculating what we already know at the time we make the call.

If no one comments, I will try to get a patch up later today.
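To make the shape of the in-place update concrete, here is a rough standalone sketch. The names here (SimpleScanner, ScannerHeap, sourceName, updateReaders) are illustrative stand-ins, not the actual KeyValueScanner/KeyValueHeap API; it only shows the idea of passing (deleted list, added file) down to the heap instead of rebuilding it.

{code}
// Illustrative sketch only -- SimpleScanner and ScannerHeap are hypothetical
// stand-ins, not the real KeyValueScanner / KeyValueHeap classes.
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

interface SimpleScanner {
  String peek();        // next key this scanner would return, or null when exhausted
  String sourceName();  // identifies the backing store file or memcache snapshot
}

class ScannerHeap {
  private final PriorityQueue<SimpleScanner> heap = new PriorityQueue<>(
      Comparator.comparing(SimpleScanner::peek,
          Comparator.nullsLast(Comparator.naturalOrder())));

  ScannerHeap(List<SimpleScanner> scanners) {
    heap.addAll(scanners);
  }

  // In-place update instead of a full rebuild: drop scanners whose backing
  // files went away, add a scanner over the newly flushed file. Mirrors the
  // (List deleted, StoreFile added) parameters described above -- added is
  // always present, deleted may be null.
  void updateReaders(List<String> deletedSources, SimpleScanner added) {
    if (deletedSources != null) {
      heap.removeIf(s -> deletedSources.contains(s.sourceName()));
    }
    heap.add(added);
    // Existing scanners keep their current positions, so nothing has to be
    // re-opened and re-seeked the way a full heap rebuild would require.
  }
}
{code}

The intrusive part is not the heap change itself but threading those two parameters from ChangedReadersObserver through KeyValueScanner down to the heap, which is where the complexity mentioned above comes from.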
> Fix locking in memcache flush
> -----------------------------
>
>                 Key: HBASE-1207
>                 URL: https://issues.apache.org/jira/browse/HBASE-1207
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.19.0
>            Reporter: Ben Maurer
>            Assignee: Jonathan Gray
>             Fix For: 0.20.0
>
>
> memcache flushing holds a write lock while it reopens StoreFileScanners. I had a case where this process timed out and caused an exception to be thrown, which made the region server believe it had been unable to flush its cache, so it shut itself down.
> Stack trace is:
> "regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher" daemon prio=10 tid=0x00000000562df400 nid=0x15d1 runnable [0x000000004108b000..0x000000004108bd90]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.zip.CRC32.updateBytes(Native Method)
>         at java.util.zip.CRC32.update(CRC32.java:45)
>         at org.apache.hadoop.util.DataChecksum.update(DataChecksum.java:223)
>         at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)
>         at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:177)
>         at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:194)
>         at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:159)
>         - locked <0x00002aaaec1bd2d8> (a org.apache.hadoop.hdfs.DFSClient$BlockReader)
>         at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1061)
>         - locked <0x00002aaaec1bd2d8> (a org.apache.hadoop.hdfs.DFSClient$BlockReader)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1616)
>         - locked <0x00002aaad1239000> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1666)
>         - locked <0x00002aaad1239000> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1593)
>         - locked <0x00002aaad1239000> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
>         at java.io.DataInputStream.readInt(DataInputStream.java:371)
>         at org.apache.hadoop.hbase.io.SequenceFile$Reader.next(SequenceFile.java:1943)
>         - locked <0x00002aaad1238c38> (a org.apache.hadoop.hbase.io.SequenceFile$Reader)
>         at org.apache.hadoop.hbase.io.SequenceFile$Reader.next(SequenceFile.java:1844)
>         - locked <0x00002aaad1238c38> (a org.apache.hadoop.hbase.io.SequenceFile$Reader)
>         at org.apache.hadoop.hbase.io.SequenceFile$Reader.next(SequenceFile.java:1890)
>         - locked <0x00002aaad1238c38> (a org.apache.hadoop.hbase.io.SequenceFile$Reader)
>         at org.apache.hadoop.hbase.io.MapFile$Reader.next(MapFile.java:525)
>         - locked <0x00002aaad1238b80> (a org.apache.hadoop.hbase.io.HalfMapFileReader)
>         at org.apache.hadoop.hbase.io.HalfMapFileReader.next(HalfMapFileReader.java:192)
>         - locked <0x00002aaad1238b80> (a org.apache.hadoop.hbase.io.HalfMapFileReader)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.getNext(StoreFileScanner.java:312)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.openReaders(StoreFileScanner.java:110)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.updateReaders(StoreFileScanner.java:378)
>         at org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:737)
>         at org.apache.hadoop.hbase.regionserver.HStore.updateReaders(HStore.java:725)
>         at org.apache.hadoop.hbase.regionserver.HStore.internalFlushCache(HStore.java:694)
>         - locked <0x00002aaab7b41d30> (a java.lang.Integer)
>         at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:630)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:881)
>         at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:789)
>         at org.apache.hadoop.hbase.regionserver.MemcacheFlusher.flushRegion(MemcacheFlusher.java:227)
>         at org.apache.hadoop.hbase.regionserver.MemcacheFlusher.run(MemcacheFlusher.java:137)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.