hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3048) unify code for major/minor compactions
Date Sat, 06 Nov 2010 22:52:06 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929261#action_12929261 ]

HBase Review Board commented on HBASE-3048:

Message from: "Jonathan Gray" <jgray@apache.org>

This is an automatically generated e-mail. To reply, visit:

(Updated 2010-11-06 15:50:30.615330)

Review request for hbase, stack, Kannan Muthukkaruppan, and Ryan Rawson.


Minor cleanup of whitespace and stuff.


This is a rebased patch from Amit.  This was internally reviewed but we may want some minor
modifications for 0.90 with the new InternalScanner, if possible.  (suggestion from ryan on

This addresses bug HBASE-3048.

Diffs (updated)

  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1032175 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1032175 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1032175 
  trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1032175 

Diff: http://review.cloudera.org/r/1185/diff


Lots of good new additions by Amit to TestCompaction, which passes.



> unify code for major/minor compactions
> --------------------------------------
>                 Key: HBASE-3048
>                 URL: https://issues.apache.org/jira/browse/HBASE-3048
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Amitanand Aiyer
>         Attachments: HBASE-3048-0.90-v2.patch, unify.patch
> Today minor compactions do not process deletes, purge old versions, etc. Only major compactions
> do.  The rationale was probably to save CPU (?). We should evaluate if major compaction logic
> indeed runs significantly slower.
> Unifying minor compactions to do the same thing as major compactions has other advantages:
> * If the same keys are deleted/updated repeatedly, the fact that deletes/overwrites are
>   not processed during minor compaction makes each subsequent minor compaction more expensive
>   as the total amount of data keeps growing.
> * We'll have fewer bugs if the logic is as symmetric as possible. Any bugs in TTL enforcement,
>   version enforcement, etc. could cause behavior to be different after a major compaction. Keeping
>   the same logic means these bugs will get caught earlier.
> -
> Note: There will still need to be one difference in the two schemes, and that has to
> do with delete markers. Any compaction which doesn't compact all files will still need to
> leave delete markers.
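
For illustration only, the unified scheme described in the issue might be sketched as below. This is a simplified model, not the code in the attached patch: the names CompactionSketch, Cell, Type, compact, maxVersions and compactsAllFiles are all hypothetical, and a delete is assumed to mask every put on the same key with an equal or older timestamp. The only branch that depends on the compaction kind is whether delete markers are written back out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified model of the unified compaction logic.
// None of these names come from HBase or from the HBASE-3048 patch.
class CompactionSketch {
    enum Type { PUT, DELETE }

    // A cell reduced to a single key, a timestamp, and a type.
    static final class Cell {
        final String key;
        final long ts;
        final Type type;

        Cell(String key, long ts, Type type) {
            this.key = key;
            this.ts = ts;
            this.type = type;
        }

        @Override
        public String toString() {
            return key + "@" + ts + "/" + type;
        }
    }

    // Runs the same delete and version handling for every compaction.
    // Assumption: a DELETE at timestamp T masks all PUTs on that key with ts <= T.
    static List<Cell> compact(List<Cell> input, int maxVersions, boolean compactsAllFiles) {
        List<Cell> sorted = new ArrayList<>(input);
        // Order by key, then newest first; at equal timestamps a delete sorts before a put.
        sorted.sort(Comparator.comparing((Cell c) -> c.key)
                .thenComparing(Comparator.comparingLong((Cell c) -> c.ts).reversed())
                .thenComparingInt(c -> c.type == Type.DELETE ? 0 : 1));

        List<Cell> out = new ArrayList<>();
        String currentKey = null;
        boolean sawDelete = false;
        int versions = 0;
        for (Cell c : sorted) {
            if (!c.key.equals(currentKey)) {
                // New key: reset the per-key state.
                currentKey = c.key;
                sawDelete = false;
                versions = 0;
            }
            if (c.type == Type.DELETE) {
                sawDelete = true;
                // The one difference between the two schemes: a compaction that
                // does not cover all files must leave the delete marker in place.
                if (!compactsAllFiles) {
                    out.add(c);
                }
                continue;
            }
            if (sawDelete) {
                continue; // put is masked by a newer (or equal-timestamp) delete
            }
            if (versions < maxVersions) {
                out.add(c);
                versions++;
            } // otherwise an excess old version, purged
        }
        return out;
    }
}
```

With puts at timestamps 1, 2, 4, 5, 6 and a delete at 3 on one key, and maxVersions = 2, the sketch keeps the puts at 6 and 5 in both modes and additionally keeps the delete marker when compactsAllFiles is false.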

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
