hbase-issues mailing list archives

From "Dave Latham (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15454) Archive store files older than max age
Date Wed, 13 Apr 2016 16:17:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239522#comment-15239522 ]

Dave Latham commented on HBASE-15454:

Please forgive me - I still don't understand, and I want to.  It would be really helpful for
me if you can try to answer these questions:
* What does it actually mean to "archive" a store file? Is there a definition, or set of properties or guarantees?
** Are archived files excluded from major compaction? Or minor compactions? Or from region split size calculation?
** Are archived files guaranteed to have no timestamp overlap with other HFiles? Or just other archived HFiles?
** Or does it just refer to any files with max timestamp older than maxAge?

Without understanding how archived files are different from other HFiles I don't see why it
needs separate logic beyond purely having a pluggable window factory (which is nice to see
in the v2 patch).

"Currently we do have a config for store files that is no longer eligible for minor compaction, which is max age"

Yikes.  I thought max age was purely part of the exponential tiered windowing schedule, which
stopped the growth of tiers past a certain point.  Under common write patterns those files
would then never need minor compactions again, but if there were actually several files in
such a window I wouldn't want to explicitly prevent compaction of them.
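
To make that reading of max age concrete, here is a minimal sketch of an exponential
tiered window schedule in which max age only stops window growth and says nothing about
compaction eligibility. It is a simplified model under stated assumptions, not the HBase
implementation: the class, method, and parameter names are hypothetical, ages are measured
from "now" rather than aligned to epoch boundaries, and each tier is assumed to hold exactly
windowsPerTier windows.

```java
// Hypothetical illustration only; not HBase's date tiered compaction code.
final class TieredWindowSketch {

    // Returns the size of the window that a cell of the given age falls into.
    // Tier 0 holds windowsPerTier windows of baseWindow each; every older tier
    // holds windowsPerTier windows that are windowsPerTier times larger.
    // Ages beyond maxAge are clamped, so windows stop growing past that point.
    static long windowSizeForAge(long age, long baseWindow, int windowsPerTier, long maxAge) {
        if (baseWindow <= 0 || windowsPerTier < 2 || age < 0 || maxAge < 0) {
            throw new IllegalArgumentException("invalid window parameters");
        }
        long cappedAge = Math.min(age, maxAge);
        long size = baseWindow;
        long covered = size * windowsPerTier; // total age span covered by the tiers so far
        while (cappedAge >= covered) {
            size *= windowsPerTier;           // move to the next, larger tier
            covered += size * windowsPerTier;
        }
        return size;
    }

    // Under this reading, age plays no part in eligibility: a window with
    // enough files is a compaction candidate even if it is older than maxAge.
    static boolean isCompactionCandidate(int filesInWindow, int minFilesToCompact) {
        return filesInWindow >= minFilesToCompact;
    }

    public static void main(String[] args) {
        long[] ages = {0, 3, 4, 19, 20, 84, 100, 10_000};
        for (long age : ages) {
            System.out.println("age=" + age + " windowSize="
                    + windowSizeForAge(age, 1, 4, 100));
        }
    }
}
```

In this model, files past max age simply land in windows of a fixed largest size. With
typical time-ordered writes each such window would settle at one file, but nothing in the
schedule itself forbids compacting a window that happens to hold several.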

> Archive store files older than max age
> --------------------------------------
>                 Key: HBASE-15454
>                 URL: https://issues.apache.org/jira/browse/HBASE-15454
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>    Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0, 1.3.0, 0.98.19, 1.4.0
>         Attachments: HBASE-15454-v1.patch, HBASE-15454-v2.patch, HBASE-15454.patch
> Sometimes the old data is rarely touched but we cannot remove it. So archive it to several
> big files (by year or something) and use EC to reduce the redundancy.

This message was sent by Atlassian JIRA
