Date: Thu, 25 Aug 2011 10:46:30 +0000 (UTC)
From: "jiraposter@reviews.apache.org (JIRA)"
To: issues@hbase.apache.org
Message-ID: <554715563.13137.1314269190836.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1082745424.4271.1314074129347.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-4241) Optimize flushing of the Store cache for max versions and (new) min versions

[ 
https://issues.apache.org/jira/browse/HBASE-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090921#comment-13090921 ]

jiraposter@reviews.apache.org commented on HBASE-4241:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1650/#review1636
-----------------------------------------------------------

http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java

    The more we put into the finally block, the higher the chance that the final close() is skipped because an earlier call to the writer (lines 518 and 520) throws an exception. Ideally, the calls to the writer should be enclosed in try/catch blocks.

- Ted

On 2011-08-25 05:38:13, Lars Hofhansl wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1650/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-08-25 05:38:13)
bq.
bq. Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.
bq. Summary
bq. -------
bq.
bq. This avoids flushing row versions to disk that are known to be GC'd by the next compaction anyway.
bq. This covers two scenarios:
bq. 1. maxVersions=N and we find at least N versions in the memstore. We can safely avoid flushing any further versions to disk.
bq. 2. Similarly, minVersions=N and we find at least N versions in the memstore. Now we can safely avoid flushing any further *expired* versions to disk.
bq.
bq. This changes the Store flush to use the same mechanism that is used for compactions.
bq. I borrowed some code from the tests and refactored the test code to use a new utility class that wraps a sorted collection and then behaves like a KeyValueScanner.
The same class is used to create a scanner over the memstore's snapshot.
bq.
bq. This addresses bug HBASE-4241.
bq. https://issues.apache.org/jira/browse/HBASE-4241
bq.
bq. Diffs
bq. -----
bq.
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1161347
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CollectionBackedScanner.java PRE-CREATION
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/KeyValueScanFixture.java 1161347
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueHeap.java 1161347
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueScanFixture.java 1161347
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1161347
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1161347
bq.
bq. Diff: https://reviews.apache.org/r/1650/diff
bq.
bq. Testing
bq. -------
bq.
bq. Ran all tests. TestHTablePool and TestDistributedLogSplitting error out (with or without my change).
bq. I had to change three tests that incorrectly relied on old rows hanging around after a flush (or were otherwise incorrect).
bq.
bq. No new test, as this should cause no functional change.
bq.
bq. Thanks,
bq.
bq. Lars
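
Ted's review comment about the finally block can be illustrated with a minimal sketch. It is not the Store.java code under review: `SketchWriter`, `appendMetadata()` and `finish()` are hypothetical names invented here, and the sketch only assumes that a writer call made before close() can throw an IOException. The idea is to catch that exception so that close() is still attempted, and to rethrow the first failure afterwards.

```java
import java.io.Closeable;
import java.io.IOException;

public class GuardedClose {
    /** Hypothetical stand-in for a store file writer; NOT the HBase API. */
    public interface SketchWriter extends Closeable {
        void appendMetadata() throws IOException;
    }

    /** Calls the writer, then always attempts close(); rethrows the first failure. */
    public static void finish(SketchWriter writer) throws IOException {
        IOException first = null;
        try {
            writer.appendMetadata();
        } catch (IOException e) {
            first = e; // remember the failure, but still fall through to close()
        } finally {
            try {
                writer.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e; // report a close() failure only if nothing failed earlier
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) {
        final boolean[] closed = {false};
        SketchWriter failing = new SketchWriter() {
            @Override public void appendMetadata() throws IOException {
                throw new IOException("metadata failed");
            }
            @Override public void close() {
                closed[0] = true;
            }
        };
        try {
            finish(failing);
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage() + ", closed=" + closed[0]);
        }
    }
}
```

Keeping close() in its own inner try means that a failure in an earlier writer call cannot prevent the close attempt, which is the risk the comment describes.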
> Optimize flushing of the Store cache for max versions and (new) min versions
> ----------------------------------------------------------------------------
>
> Key: HBASE-4241
> URL: https://issues.apache.org/jira/browse/HBASE-4241
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Affects Versions: 0.92.0
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Attachments: 4241-v2.txt, 4241.txt
>
>
> As discussed with Jon, there is room for improvement in how the memstore is flushed to disk.
> Currently only expired KVs are pruned before flushing, but we can also prune versions if we find at least maxVersions versions in the memstore.
> The same holds for the new minversion feature: If we find at least minVersion versions in the store we can remove all further versions that are expired.
> Generally we should use the same mechanism here that is used for compaction, i.e. StoreScanner. We only need to add a scanner to Memstore that can scan along the current snapshot.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
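
The pruning rules in the issue description can be sketched for a single row and column as follows. This is an illustration only, not the StoreScanner logic from the patch: `versionsToFlush` and its parameters are hypothetical names, versions are reduced to bare timestamps sorted newest first, and a version is assumed to be expired once its age exceeds the TTL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlushPruneSketch {
    /**
     * Returns the timestamps that would still be flushed for one row/column.
     * Hypothetical helper: input must be sorted newest first.
     */
    public static List<Long> versionsToFlush(List<Long> timestampsNewestFirst,
            int maxVersions, int minVersions, long ttlMillis, long nowMillis) {
        List<Long> kept = new ArrayList<>();
        for (long ts : timestampsNewestFirst) {
            if (kept.size() >= maxVersions) {
                break; // rule 1: maxVersions newer versions already kept
            }
            boolean expired = nowMillis - ts > ttlMillis;
            if (expired && kept.size() >= minVersions) {
                continue; // rule 2: expired, and minVersions newer versions already kept
            }
            kept.add(ts);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Long> timestamps = Arrays.asList(100L, 20L, 10L);
        // With ttl=50 and now=110, the versions at 20 and 10 are expired.
        System.out.println(versionsToFlush(timestamps, 3, 2, 50L, 110L));
        System.out.println(versionsToFlush(timestamps, 3, 0, 50L, 110L));
    }
}
```

With minVersions=0 the second rule drops every expired version, which matches the behaviour the description calls current; a positive minVersions keeps expired versions until that many versions have been retained.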