hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15454) Archive store files older than max age
Date Thu, 14 Apr 2016 06:50:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15240704#comment-15240704 ]

Duo Zhang commented on HBASE-15454:

1. We have to make sure you get all the files. I know you check fileCompacting. But there are
other checks; please refer to the logic when we try to do major compaction.
Yeah, that's why I pass the candidateFiles before any filtering to the tryArchive method. I agree
that there may still be some corner cases, so let me check again.

2. We have to avoid re-compaction. maxAge should be set large enough that we know additional
data will not arrive after this compaction.
Add some comments on the max age and archive file configs? And also a detailed release note?

3. We need to make sure this doesn't get starved by frequent minor compactions.
Oh yeah, this may be a problem if the compaction check is not frequent enough...

4. How do we know a store file is ready to be archived? Do we set a flag after this special
minor compaction?
Major compaction can also output archived files. So I think it is better to change something
in the compactor and multi writer?

I will modify the description to better describe what we are trying to do here.
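
The selection concern in point 1 can be sketched roughly as follows. This is only an illustration, not the code in the attached patches: FileInfo, ArchiveSelector and selectForArchive are hypothetical names, and the all-or-nothing rule (give up for this round if any sufficiently old file is still being compacted) is an assumption about what "get all the files" would mean in practice.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a store file; this is NOT the HBase StoreFile class.
final class FileInfo {
  final String name;
  final long maxTimestampMs; // newest cell timestamp contained in the file

  FileInfo(String name, long maxTimestampMs) {
    this.name = name;
    this.maxTimestampMs = maxTimestampMs;
  }
}

final class ArchiveSelector {
  /**
   * Picks every file whose newest cell is older than maxAgeMs.
   * Assumption: if any such file is currently being compacted, we cannot
   * get all the old files, so nothing is selected in this round.
   */
  static List<FileInfo> selectForArchive(Collection<FileInfo> candidates,
      Set<String> compacting, long nowMs, long maxAgeMs) {
    long cutoff = nowMs - maxAgeMs;
    List<FileInfo> selected = new ArrayList<>();
    for (FileInfo f : candidates) {
      if (f.maxTimestampMs >= cutoff) {
        continue; // still within max age, leave it to normal compaction
      }
      if (compacting.contains(f.name)) {
        return new ArrayList<>(); // an old file is busy; retry on a later check
      }
      selected.add(f);
    }
    return selected;
  }
}
```

Taking the unfiltered candidate list as input mirrors the idea of passing candidateFiles to tryArchive before any filtering; the other checks mentioned for major compaction are not modeled here.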


> Archive store files older than max age
> --------------------------------------
>                 Key: HBASE-15454
>                 URL: https://issues.apache.org/jira/browse/HBASE-15454
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>    Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0, 1.3.0, 0.98.19, 1.4.0
>         Attachments: HBASE-15454-v1.patch, HBASE-15454-v2.patch, HBASE-15454.patch
> Sometimes the old data is rarely touched but we cannot remove it. So archive it to several
big files (by year or something) and use EC (erasure coding) to reduce the redundancy.
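
As a rough sketch of the "by year" idea in the description, the snippet below buckets timestamps by UTC calendar year, so that each bucket could become one archived output file. YearWindows, yearOf and groupByYear are hypothetical names, and the choice of UTC calendar years as the boundary is an assumption; the issue does not specify how the windows are cut.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; not part of HBase.
final class YearWindows {
  /** Returns the UTC calendar year of a millisecond epoch timestamp. */
  static int yearOf(long timestampMs) {
    return Instant.ofEpochMilli(timestampMs).atZone(ZoneOffset.UTC).getYear();
  }

  /** Groups timestamps by UTC year; a TreeMap keeps the years in ascending order. */
  static Map<Integer, List<Long>> groupByYear(List<Long> timestampsMs) {
    Map<Integer, List<Long>> byYear = new TreeMap<>();
    for (long ts : timestampsMs) {
      byYear.computeIfAbsent(yearOf(ts), y -> new ArrayList<>()).add(ts);
    }
    return byYear;
  }
}
```

A multi-output compaction writer could use such a bucket key to decide which output file a cell goes to, which is one way to read the "compactor and multi writer" remark in the comment.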
