hbase-issues mailing list archives

From "Mikhail Antonov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15454) Archive store files older than max age
Date Wed, 11 May 2016 19:40:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280671#comment-15280671
] 

Mikhail Antonov commented on HBASE-15454:
-----------------------------------------

From my standpoint, if we release a completely new feature and people start using it and
find some edge case, that's fine and expected :) We'll fix it and release in 1.3.1 or 1.4.0.
As long as the complexity doesn't spread around too much and we keep everything possible private
and mark it as experimental/unstable interface-wise, I'm fine with that. If I have to choose
between a new feature with known limitations and a new feature where those limitations are addressed
(in a way which we may find non-ideal and fix later), I'd go for the latter.

I haven't yet looked at this part of the code, so I don't have an informed opinion. I'll try to
look at it in the next few days.

[~tedyu] [~enis] do you guys have any opinion here?

> Archive store files older than max age
> --------------------------------------
>
>                 Key: HBASE-15454
>                 URL: https://issues.apache.org/jira/browse/HBASE-15454
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>    Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.20
>
>         Attachments: HBASE-15454-v1.patch, HBASE-15454-v2.patch, HBASE-15454-v3.patch,
HBASE-15454-v4.patch, HBASE-15454-v5.patch, HBASE-15454-v6.patch, HBASE-15454-v7.patch, HBASE-15454.patch
>
>
> In date tiered compaction, the store files older than max age are never touched by minor
compactions. Here we introduce a 'freeze window' operation, which does the following things:
> 1. Find all store files that contain cells whose timestamps are in the given window.
> 2. Compact all these files and output one file for each window that these files covered.
> After the compaction, we will have only one file in the given window, and all cells whose
timestamps are in the given window are in that file. And if you do not write new cells with an older
timestamp in this window, the file will never be changed. This makes it easier to do erasure
coding on the frozen file to reduce redundancy. It also makes it possible to check consistency
between master and peer cluster incrementally.
> And why use the word 'freeze'?
> Because there is already an 'HFileArchiver' class. I want to use a different word to
avoid confusion.
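
The two 'freeze window' steps in the description above can be sketched roughly as follows.
This is a minimal illustration, not the code from the attached patches: it assumes a toy model
in which a store file is a plain list of (timestamp, value) cells and all windows have the same
fixed size, and the names freeze_window and window_start are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical toy model: a cell is (timestamp, value), a store file is a list of cells.
Cell = Tuple[int, str]
StoreFile = List[Cell]


def window_start(ts: int, window_size: int) -> int:
    """Start of the fixed-size window containing ts (assumes non-negative timestamps)."""
    return ts - ts % window_size


def freeze_window(files: List[StoreFile], target_start: int, window_size: int) -> List[StoreFile]:
    """Return the new list of store files after 'freezing' the window
    [target_start, target_start + window_size)."""
    target_end = target_start + window_size

    def touches_window(f: StoreFile) -> bool:
        return any(target_start <= ts < target_end for ts, _ in f)

    # Step 1: find all store files that contain cells in the given window.
    selected = [f for f in files if touches_window(f)]
    untouched = [f for f in files if not touches_window(f)]

    # Step 2: compact the selected files, writing one output file per window they cover.
    by_window: Dict[int, StoreFile] = defaultdict(list)
    for f in selected:
        for cell in f:
            by_window[window_start(cell[0], window_size)].append(cell)
    outputs = [sorted(by_window[w]) for w in sorted(by_window)]

    return untouched + outputs
```

For example, with files [[(5, "a"), (12, "b")], [(14, "c"), (25, "d")], [(31, "e")]], calling
freeze_window(files, 10, 10) rewrites the first two files into one file per covered window, so
exactly one resulting file holds every cell with a timestamp in [10, 20), and the third file is
left alone.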



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
