hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15181) A simple implementation of date based tiered compaction
Date Tue, 02 Feb 2016 01:18:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127380#comment-15127380 ]

Enis Soztutar commented on HBASE-15181:
---------------------------------------

HBASE-7763 is the jira that explains why we need to select a contiguous set of files for
compaction. The main idea is that if two puts happen with the same timestamp, we order
them by sequenceId so that the "latest" one is always returned. This allows the user,
for example, to override a previously set value.
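
To make the tie-breaking concrete, here is a minimal Python sketch. It is not HBase code: the Cell type and the latest() helper are made up for illustration, and it only assumes what is described above, namely that the highest timestamp wins and equal timestamps are resolved by the highest sequenceId.

```python
from typing import List, NamedTuple, Optional


class Cell(NamedTuple):
    # Hypothetical stand-in for a stored cell; not the HBase Cell interface.
    row: str
    ts: int
    seq_id: int
    val: str


def latest(cells: List[Cell], row: str) -> Optional[Cell]:
    """Pick the winning cell for a row: highest ts, ties broken by highest seq_id."""
    candidates = [c for c in cells if c.row == row]
    return max(candidates, key=lambda c: (c.ts, c.seq_id), default=None)


cells = [
    Cell("foo", 100, 10, "v1"),
    Cell("foo", 100, 30, "v3"),  # same timestamp, but written later
]
winner = latest(cells, "foo")
print(winner.val)
```

With per-cell seqIds available, the later put (v3) wins even though both puts carry ts=100.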

The problem with non-contiguous compactions is that we do not keep the seqIds of cells forever.
After some time, we remove the per-cell seqIds and keep only one sequenceId per hfile. Thus, if we
end up with two puts that have the same timestamp but sit in files with different seqIds,
allowing non-contiguous compactions may break the ordering. 

For example: 
{code}
file1: seqId=10, row=foo, val=v1, ts=100
file2: seqId=20, row=bar, val=v2, ts=200
file3: seqId=30, row=foo, val=v3, ts=100
file4: seqId=40, row=bar, val=v4, ts=300
{code}
If I compact file1 and file4 together, the new file takes seqId=40 and will contain
<row=foo, ts=100, val=v1, seqId=40>, which now sorts ahead of <row=foo, ts=100, val=v3, seqId=30>,
although v3 should be the correct answer. The bad thing is that a query reading the value of
row=foo will return a different result depending on whether the compaction has run. 
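
The example can be replayed with a small Python simulation. This is only a sketch under the assumptions stated above (one seqId per file, the compacted file taking the highest seqId of its inputs, reads resolved by timestamp and then by file seqId); StoreFile, read() and compact() are hypothetical names, not HBase APIs.

```python
from typing import List, NamedTuple, Optional, Tuple


class StoreFile(NamedTuple):
    # Hypothetical model of an hfile once per-cell seqIds have been dropped.
    name: str
    seq_id: int                            # single sequenceId for the whole file
    cells: Tuple[Tuple[str, int, str], ...]  # (row, ts, val)


def read(files: List[StoreFile], row: str) -> Optional[str]:
    """Return the value with the highest (ts, file seq_id) for the given row."""
    hits = [(ts, f.seq_id, val)
            for f in files
            for (r, ts, val) in f.cells
            if r == row]
    return max(hits)[2] if hits else None


def compact(name: str, selected: List[StoreFile]) -> StoreFile:
    """Merge the selected files; the output keeps only the max seq_id of its inputs."""
    merged = tuple(c for f in selected for c in f.cells)
    return StoreFile(name, max(f.seq_id for f in selected), merged)


file1 = StoreFile("file1", 10, (("foo", 100, "v1"),))
file2 = StoreFile("file2", 20, (("bar", 200, "v2"),))
file3 = StoreFile("file3", 30, (("foo", 100, "v3"),))
file4 = StoreFile("file4", 40, (("bar", 300, "v4"),))

# Before any compaction: file3 (seqId=30) beats file1 (seqId=10) for row=foo.
before = read([file1, file2, file3, file4], "foo")

# Non-contiguous selection: file1 + file4 skips over file2 and file3.
non_contiguous = compact("file1+file4", [file1, file4])
after_non_contiguous = read([file2, file3, non_contiguous], "foo")

# Contiguous selection: file3 + file4 are adjacent in seqId order.
contiguous = compact("file3+file4", [file3, file4])
after_contiguous = read([file1, file2, contiguous], "foo")

print(before, after_non_contiguous, after_contiguous)
```

In this model the non-contiguous compaction flips the answer for row=foo from v3 to v1, while the contiguous one leaves it at v3.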

What I was saying offline is that if we do something like HBASE-9905 and disallow
client-settable timestamps, or something like HBASE-10247 where the table pre-declares
that it won't have same-ts edits, then it should be possible to do non-contiguous compactions. 

It seems that HBASE-3690 introduced a config option to exclude bulk-loaded files
from minor compaction. This is to prevent compaction storms due to bulk loads,
but it is off by default and made configurable. Somewhere down the line the conf option got
removed, but I was not able to trace where. Maybe a bug? 

Some more background is here: 
https://issues.apache.org/jira/browse/HBASE-8770
https://issues.apache.org/jira/browse/HBASE-8721

> A simple implementation of date based tiered compaction
> -------------------------------------------------------
>
>                 Key: HBASE-15181
>                 URL: https://issues.apache.org/jira/browse/HBASE-15181
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15181-v1.patch, HBASE-15181-v2.patch
>
>
> This is a simple implementation of date-based tiered compaction, similar to Cassandra's,
> with the following benefits:
> 1. Improve date-range-based scans by structuring store files in a date-based tiered layout.
> 2. Reduce compaction overhead.
> 3. Improve TTL efficiency.
> It is a good fit for use cases that:
> 1. have mostly date-based data writes and scans, with a focus on the most recent data.
> 2. never or rarely delete data.
> Out-of-order writes are handled gracefully, so the data still gets to the right store
> file for time-range scans, and re-compaction with existing store files in the same time window
> is handled by ExploringCompactionPolicy.
> Time range overlap among store files is tolerated and the performance impact is minimized.
> Configuration can be set in hbase-site or overridden at the per-table or per-column-family
> level via the hbase shell.
> Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
