hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3048) unify code for major/minor compactions
Date Wed, 29 Sep 2010 03:53:33 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916006#action_12916006 ]

stack commented on HBASE-3048:
------------------------------

This is fine by me.  The one objection I was going to raise was the delete markers story but
you got that in your footnote.

> unify code for major/minor compactions
> --------------------------------------
>
>                 Key: HBASE-3048
>                 URL: https://issues.apache.org/jira/browse/HBASE-3048
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>
> Today minor compactions do not process deletes, purge old versions, etc. Only major compactions
> do.  The rationale was probably to save CPU (?). We should evaluate if major compaction logic
> indeed runs significantly slower.
> Unifying minor compactions to do the same thing as major compactions has other advantages:
> * If the same data is overwritten several times and we are not processing overwrites,
>   each subsequent minor compaction becomes more expensive as the total amount of data grows.
> * We'll have fewer bugs if the logic is as symmetric as possible. Any bugs in TTL enforcement,
>   version enforcement, etc. could cause behavior to be different after a major compaction. Keeping
>   the same logic means these bugs will get caught earlier.
> -
> Note: There will still need to be one difference in the two schemes, and that has to
> do with delete markers. Any compaction which doesn't compact all files will still need to
> leave delete markers.
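
To illustrate the note above, here is a minimal sketch of a unified compaction
routine in Python. It is not HBase code: the Cell type, the compact() function,
and the all_files flag are hypothetical names, and the model is simplified so
that a delete marker masks every put in the same row with an equal or older
timestamp. The same purge logic runs in both modes; the only branch is whether
delete markers may be dropped.

```python
from typing import List, NamedTuple, Optional


class Cell(NamedTuple):
    # Hypothetical, simplified cell: one value per (row, timestamp).
    row: str
    ts: int
    is_delete: bool
    value: Optional[str] = None


def compact(files: List[List[Cell]], all_files: bool,
            max_versions: int = 1) -> List[Cell]:
    """Merge store files with one code path for both compaction kinds.

    Delete markers are dropped only when every file of the store is
    being compacted (all_files=True); otherwise they must be kept.
    """
    # Order by row, newest first; at equal timestamps the delete sorts first.
    cells = sorted((c for f in files for c in f),
                   key=lambda c: (c.row, -c.ts, not c.is_delete))
    out: List[Cell] = []
    current_row = None
    delete_ts = None  # timestamp of the newest delete marker seen in this row
    kept = 0          # number of put versions kept for this row
    for c in cells:
        if c.row != current_row:
            current_row, delete_ts, kept = c.row, None, 0
        if c.is_delete:
            if delete_ts is None:
                delete_ts = c.ts
                if not all_files:
                    # Older puts may live in files outside this compaction,
                    # so the marker has to survive to keep masking them.
                    out.append(c)
            continue  # older markers are covered by the newest one
        if delete_ts is not None and c.ts <= delete_ts:
            continue  # put is masked by a delete marker
        if kept >= max_versions:
            continue  # excess version, purge it
        out.append(c)
        kept += 1
    return out


# Three store files, oldest to newest.
older = [Cell("r1", 1, False, "a"), Cell("r2", 1, False, "x")]
f1 = [Cell("r1", 2, False, "b")]
f2 = [Cell("r1", 3, False, "c"), Cell("r2", 2, True)]

minor = compact([f1, f2], all_files=False)        # 'older' is left out
major = compact([older, f1, f2], all_files=True)  # every file included
```

In the partial run the overwritten value "b" is already purged, but the delete
marker for r2 is kept because the put it masks sits in the file that was not
compacted. In the full run the marker and the masked put both disappear.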

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

