hbase-issues mailing list archives

From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4241) Optimize flushing of the Store cache for max versions and (new) min versions
Date Thu, 25 Aug 2011 04:51:30 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090773#comment-13090773 ]

jiraposter@reviews.apache.org commented on HBASE-4241:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1650/
-----------------------------------------------------------

(Updated 2011-08-25 04:49:36.340305)


Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.


Changes
-------

Added linebreak.
Added comment that deletes need to be retained.


Summary
-------

This avoids flushing row versions to disk that are known to be GC'd by the next compaction
anyway.
This covers two scenarios:
1. maxVersions=N and we find at least N versions in the memstore. We can safely avoid flushing
any further versions to disk.
2. similarly minVersions=N and we find at least N versions in the memstore. Now we can safely
avoid flushing any further *expired* versions to disk.
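
The two scenarios above can be sketched roughly as follows. This is only an
illustration, not the code in the patch: CellVersion and FlushPruner are
hypothetical stand-ins (the real change goes through StoreScanner and KeyValue),
the input is assumed to be sorted by column with the newest version first, and
delete markers, which need to be retained, are left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for one cell version (not HBase's KeyValue). */
final class CellVersion {
    final String column;
    final long timestamp;

    CellVersion(String column, long timestamp) {
        this.column = column;
        this.timestamp = timestamp;
    }
}

/** Hypothetical helper that mimics the flush-time pruning rules described above. */
final class FlushPruner {
    /**
     * Returns the versions that would still be written to disk.
     * Assumes the input is sorted by column, newest timestamp first.
     */
    static List<CellVersion> prune(List<CellVersion> sorted, int maxVersions,
                                   int minVersions, long ttlMillis, long now) {
        List<CellVersion> kept = new ArrayList<>();
        String currentColumn = null;
        int keptForColumn = 0;
        for (CellVersion v : sorted) {
            if (!v.column.equals(currentColumn)) {
                // New column: restart the version count.
                currentColumn = v.column;
                keptForColumn = 0;
            }
            boolean expired = v.timestamp < now - ttlMillis;
            if (keptForColumn >= maxVersions) {
                continue; // Scenario 1: already have maxVersions newer versions.
            }
            if (expired && keptForColumn >= minVersions) {
                continue; // Scenario 2: expired, and minVersions is already satisfied.
            }
            kept.add(v);
            keptForColumn++;
        }
        return kept;
    }
}
```

With minVersions=0 the second rule degenerates to dropping every expired
version, which matches the pruning that was already done before this change.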

This changes the Store flush to use the same mechanism that is used for compactions.
I borrowed some code from the tests and refactored the test code to use a new utility class
that wraps a sorted collection and then behaves like KeyValueScanner. The same class is used
to create a scanner over the memstore's snapshot.
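
To give an idea of what such a wrapper looks like, here is a minimal sketch of
a scanner over a sorted collection. It is not the CollectionBackedScanner from
the diff: the class name SortedCollectionScanner and its generic signature are
made up for this example, and only peek/next/seek are shown.

```java
import java.util.Iterator;
import java.util.SortedSet;

/** Hypothetical scanner over a sorted set; a simplified analogue, not HBase code. */
final class SortedCollectionScanner<T> {
    private final SortedSet<T> data;
    private Iterator<T> iter;
    private T current; // next element to return, or null when exhausted

    SortedCollectionScanner(SortedSet<T> data) {
        this.data = data;
        this.iter = data.iterator();
        advance();
    }

    private void advance() {
        current = iter.hasNext() ? iter.next() : null;
    }

    /** Returns the next element without consuming it, or null at the end. */
    T peek() {
        return current;
    }

    /** Returns the next element and moves forward, or null at the end. */
    T next() {
        T result = current;
        advance();
        return result;
    }

    /** Positions the scanner at the first element >= key; false if there is none. */
    boolean seek(T key) {
        iter = data.tailSet(key).iterator();
        advance();
        return current != null;
    }
}
```

Because the wrapper only needs a sorted collection, the same class can serve
both test fixtures and a snapshot of in-memory data.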


This addresses bug HBASE-4241.
    https://issues.apache.org/jira/browse/HBASE-4241


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
1161347 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CollectionBackedScanner.java
PRE-CREATION 
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/KeyValueScanFixture.java
1161347 
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueHeap.java
1161347 
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueScanFixture.java
1161347 
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java
1161347 
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
1161347 

Diff: https://reviews.apache.org/r/1650/diff


Testing
-------

Ran all tests. TestHTablePool and TestDistributedLogSplitting error out (with or without my
change).
I had to change three tests that incorrectly relied on old rows hanging around after a flush
(or were otherwise incorrect).

No new test, as this should cause no functional change.


Thanks,

Lars



> Optimize flushing of the Store cache for max versions and (new) min versions
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4241
>                 URL: https://issues.apache.org/jira/browse/HBASE-4241
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.92.0
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>         Attachments: 4241-v2.txt, 4241.txt
>
>
> As discussed with Jon, there is room for improvement in how the memstore is flushed
to disk.
> Currently only expired KVs are pruned before flushing, but we can also prune versions
if we find at least maxVersions versions in the memstore.
> The same holds for the new minversion feature: If we find at least minVersion versions
in the store we can remove all further versions that are expired.
> Generally we should use the same mechanism here that is used for Compaction. I.e. StoreScanner.
We only need to add a scanner to Memstore that can scan along the current snapshot.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
