hbase-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7680) implement compaction policy for stripe compactions
Date Wed, 06 Mar 2013 01:25:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594202#comment-13594202 ]

Sergey Shelukhin commented on HBASE-7680:

The structure of the files is as follows.
StripeStoreEngine is the implementation of StoreEngine for stripes; it should be pretty straightforward.
needsCompaction from CompactionPolicy was moved into StoreEngine, and each engine
calls the appropriate method.
StripeCompactor is a placeholder for the compactor; it does nothing for now. Again, due to vast differences
in compactor interfaces, the general Compactor lost its compact methods.
DefaultCompactionPolicy was refactored a bit to extract the application of the ratio algorithm
into a static method for use in stripes. I added the minfiles check right in the method; I'll
check whether the separate minfiles check that the default policy does can be removed.
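
A minimal sketch of what such an extracted static ratio method with a built-in minfiles check could look like. This is not the code from the patch: the class name, method name, and signature are hypothetical, and the ratio rule itself (an older file is skipped while it is larger than the ratio times the combined size of all newer files) is an assumption about the algorithm, which the comment above does not spell out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not the actual DefaultCompactionPolicy code.
final class RatioSelectionSketch {
    private RatioSelectionSketch() {}

    /**
     * Selects a suffix of the candidate files using an assumed ratio rule.
     *
     * @param fileSizes file sizes ordered from oldest to newest
     * @param ratio     assumed rule: skip an older file while size > ratio * (sum of newer files)
     * @param minFiles  minimum number of files required for a selection
     * @return the selected sizes, or an empty list if fewer than minFiles qualify
     */
    static List<Long> applyRatio(List<Long> fileSizes, double ratio, int minFiles) {
        int n = fileSizes.size();
        // suffixSum[i] holds the total size of files i..n-1.
        long[] suffixSum = new long[n + 1];
        for (int i = n - 1; i >= 0; i--) {
            suffixSum[i] = suffixSum[i + 1] + fileSizes.get(i);
        }
        int start = 0;
        // Skip older files that are too large relative to everything newer.
        while (start < n && fileSizes.get(start) > ratio * suffixSum[start + 1]) {
            start++;
        }
        // The minfiles check lives inside the method, as described above.
        if (n - start < minFiles) {
            return Collections.emptyList();
        }
        return new ArrayList<>(fileSizes.subList(start, n));
    }
}
```

Because the method is static and takes only sizes and thresholds, a stripe policy could call it on the files of a single stripe without needing an instance of the default policy.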

Then, StripeStoreConfig is a base class now; it has two nested sub-classes for size-based
and count-based stripes, with different parameters.
StripeCompactionPolicy is the base policy class that has some common methods, the
biggest of which is finding a single-stripe compaction. It's generic on the StripeStoreConfig type.
SizeBasedStripeCompactionPolicy and CountBasedStripeCompactionPolicy are the actual implementations
of the two policies.
Tests for each class are straightforward (in terms of mapping test to class); StripeCompactionPolicyTestBase
is a base class that contains various common methods to create mock state, verify things, etc.
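
The class layout described above can be sketched roughly as follows. Only the top-level class names come from the comment; the nested class names (SizeBased, CountBased), all fields, and all methods are hypothetical placeholders chosen for illustration, not the contents of the patch.

```java
// Base config with two nested subclasses, one per stripe scheme.
// All fields here are hypothetical.
abstract class StripeStoreConfig {
    final int minFiles;

    StripeStoreConfig(int minFiles) {
        this.minFiles = minFiles;
    }

    static final class SizeBased extends StripeStoreConfig {
        final long targetStripeSizeBytes;

        SizeBased(int minFiles, long targetStripeSizeBytes) {
            super(minFiles);
            this.targetStripeSizeBytes = targetStripeSizeBytes;
        }
    }

    static final class CountBased extends StripeStoreConfig {
        final int stripeCount;

        CountBased(int minFiles, int stripeCount) {
            super(minFiles);
            this.stripeCount = stripeCount;
        }
    }
}

// Base policy, generic on the config type so that each subclass
// sees its own parameters without casting.
abstract class StripeCompactionPolicy<C extends StripeStoreConfig> {
    protected final C config;

    StripeCompactionPolicy(C config) {
        this.config = config;
    }

    // Stand-in for the shared logic (e.g. single-stripe selection).
    boolean hasEnoughFiles(int fileCount) {
        return fileCount >= config.minFiles;
    }

    abstract String describe();
}

class SizeBasedStripeCompactionPolicy
        extends StripeCompactionPolicy<StripeStoreConfig.SizeBased> {
    SizeBasedStripeCompactionPolicy(StripeStoreConfig.SizeBased config) {
        super(config);
    }

    @Override
    String describe() {
        return "size-based, target " + config.targetStripeSizeBytes + " bytes";
    }
}

class CountBasedStripeCompactionPolicy
        extends StripeCompactionPolicy<StripeStoreConfig.CountBased> {
    CountBasedStripeCompactionPolicy(StripeStoreConfig.CountBased config) {
        super(config);
    }

    @Override
    String describe() {
        return "count-based, " + config.stripeCount + " stripes";
    }
}
```

The type parameter is what lets the common base class hold shared logic while each concrete policy reads its own config fields directly.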

Also: discussion of drop-deletes logic is in HBASE-7902.
In short, to drop deletes, we need to add L0 files to the compaction so that it covers all store files,
lest we have issues with deletes/puts in the past. Then we can only drop deletes from stripe
files, not L0. Note that we already have this issue because of the memstore, but the
timing window to hit it is relatively small, whereas if we ignore some store files it will
be large.
Therefore, I added a config parameter, "assume ordering", which basically tells us the user
is prepared to tolerate that or doesn't use deletes in the past much (I assume this is the majority
of cases, but it's off by default). In that case we can drop deletes on compactions of
an entire stripe, ignoring L0.
Then, to avoid blowing up the compaction size/rewrites, I added a ratio config, where L0 will
not be added to a compaction to drop deletes unless it's small enough. I will need to think
more about that, as adding a very small L0 to a compaction can result in a different kind of problem:
too many small files.
Needless to say, if there's no L0 (no files), we don't have to add it.
Finally, in order to add L0 files to compaction to drop deletes, we need to know the resulting
stripe boundaries (to split L0), so it's not done for compactions that are based on determining
the boundaries dynamically (e.g. rebalancing, where we determine boundary in compactor based
on data size).
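
The drop-deletes rules above can be condensed into a small decision function. This is a sketch under stated assumptions, not the patch's logic: every name is hypothetical, the whole-stripe precondition is applied to all cases, and "small enough" is assumed to mean that the L0 size is at most a configured ratio of the size of the stripe files being compacted.

```java
// Hypothetical illustration of the drop-deletes decision described above.
final class DropDeletesSketch {
    enum Decision {
        KEEP_DELETES,             // compact, but keep delete markers
        DROP_DELETES_WITHOUT_L0,  // drop deletes, L0 not included
        DROP_DELETES_WITH_L0      // include L0 files, then drop deletes
    }

    private DropDeletesSketch() {}

    static Decision decide(boolean compactsWholeStripe,
                           boolean assumeOrdering,
                           long l0SizeBytes,
                           long stripeSizeBytes,
                           double maxL0Ratio,
                           boolean boundariesKnownUpFront) {
        // Assumption: deletes are only considered when the entire stripe is compacted.
        if (!compactsWholeStripe) {
            return Decision.KEEP_DELETES;
        }
        // No L0 files, or the user opted into "assume ordering": L0 can be ignored.
        if (l0SizeBytes == 0 || assumeOrdering) {
            return Decision.DROP_DELETES_WITHOUT_L0;
        }
        // L0 must be split along the resulting stripe boundaries, so they must be
        // known before the compaction runs (not the case for e.g. rebalancing).
        if (!boundariesKnownUpFront) {
            return Decision.KEEP_DELETES;
        }
        // Assumed meaning of the ratio config: L0 is added only if it is small
        // relative to the stripe files being compacted.
        if (l0SizeBytes > maxL0Ratio * stripeSizeBytes) {
            return Decision.KEEP_DELETES;
        }
        return Decision.DROP_DELETES_WITH_L0;
    }
}
```

For example, with "assume ordering" off, a 10 MB L0, a 100 MB stripe, a ratio of 0.2, and known boundaries, this sketch includes L0 and drops deletes; with the same inputs in a rebalancing compaction, where boundaries are determined dynamically, deletes are kept.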

> implement compaction policy for stripe compactions
> --------------------------------------------------
>                 Key: HBASE-7680
>                 URL: https://issues.apache.org/jira/browse/HBASE-7680
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-7680-minus0.5.patch, HBASE-7680-v0.patch, HBASE-7680-v0-with-7679-and-7935.patch,
HBASE-7680-v-minus1.patch, HBASE-7680-v-minus1.patch

