hbase-issues mailing list archives

From "Clara Xiong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15454) Archive store files older than max age
Date Tue, 12 Apr 2016 05:03:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236596#comment-15236596 ]

Clara Xiong commented on HBASE-15454:
-------------------------------------

To be specific, it seems very inefficient to need one routine that slices the data along
the exponential windows for minor/major compaction and another, concurrent routine that slices
the data along the calendar windows for archiving. A user should only need one layout,
not both. Either layout satisfies time-range scan efficiency and archive/TTL efficiency. This
is the same idea as Dave's pluggable window algorithm.
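
A minimal sketch of what a pluggable window layout could look like, so that one
slicing routine serves both compaction and archiving. Everything here is hypothetical:
WindowFactory, YearlyWindows and ExponentialWindows are illustration names, not HBase
classes, and the exponential layout is a simplified age-based version, not the actual
date tiered compaction windowing.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

final class WindowSketch {
    /** Hypothetical plug-in point: returns the [start, end) window (epoch millis) holding ts. */
    interface WindowFactory {
        long[] windowFor(long ts, long now);
    }

    /** Calendar layout: one window per UTC calendar year. */
    static final class YearlyWindows implements WindowFactory {
        @Override
        public long[] windowFor(long ts, long now) {
            int year = Instant.ofEpochMilli(ts).atZone(ZoneOffset.UTC).getYear();
            long start = ZonedDateTime.of(year, 1, 1, 0, 0, 0, 0, ZoneOffset.UTC)
                    .toInstant().toEpochMilli();
            long end = ZonedDateTime.of(year + 1, 1, 1, 0, 0, 0, 0, ZoneOffset.UTC)
                    .toInstant().toEpochMilli();
            return new long[] {start, end};
        }
    }

    /** Simplified exponential layout: age buckets [0, base), [base, base*factor), ... */
    static final class ExponentialWindows implements WindowFactory {
        private final long base;
        private final int factor;

        ExponentialWindows(long base, int factor) {
            if (base <= 0 || factor < 2) {
                throw new IllegalArgumentException("base must be > 0 and factor >= 2");
            }
            this.base = base;
            this.factor = factor;
        }

        @Override
        public long[] windowFor(long ts, long now) {
            long age = Math.max(0, now - ts);
            long lo = 0;
            long hi = base;
            // Grow the bucket until it covers the age of this timestamp.
            while (age >= hi) {
                lo = hi;
                hi = hi * factor;
            }
            // Ages in [lo, hi) correspond to timestamps in (now - hi, now - lo].
            return new long[] {now - hi + 1, now - lo + 1};
        }
    }
}
```

With such an interface, the compaction and archive code paths would both ask the
configured factory for a window and never need to know which layout is in use.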

And please add the EC manager code and make it work with both types of windows. To answer
your question about the order of archiving differing from that of compaction: the EC logic
should scan each store file's time range to pick the files to archive. It can share the TTL logic.
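
As a rough illustration of sharing the TTL logic, the sketch below picks archive
candidates from each file's time range with the same "wholly older than a cutoff"
check used for TTL expiry. FileInfo, selectExpired and selectForArchive are
hypothetical names for this example, not HBase APIs, and the patch may do this differently.

```java
import java.util.ArrayList;
import java.util.List;

final class ArchiveSelectionSketch {
    /** Hypothetical stand-in for a store file; only its newest cell timestamp is used. */
    static final class FileInfo {
        final String name;
        final long maxTimestamp;

        FileInfo(String name, long maxTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /** Shared check: every cell in the file is older than the cutoff. */
    static boolean whollyOlderThan(FileInfo file, long cutoff) {
        return file.maxTimestamp < cutoff;
    }

    /** TTL path: files whose newest cell is already past the TTL. */
    static List<FileInfo> selectExpired(List<FileInfo> files, long now, long ttl) {
        List<FileInfo> expired = new ArrayList<>();
        for (FileInfo f : files) {
            if (whollyOlderThan(f, now - ttl)) {
                expired.add(f);
            }
        }
        return expired;
    }

    /** Archive path: files older than maxAge that have not yet expired under the TTL. */
    static List<FileInfo> selectForArchive(List<FileInfo> files, long now, long maxAge, long ttl) {
        List<FileInfo> toArchive = new ArrayList<>();
        for (FileInfo f : files) {
            if (whollyOlderThan(f, now - maxAge) && !whollyOlderThan(f, now - ttl)) {
                toArchive.add(f);
            }
        }
        return toArchive;
    }
}
```

Because selection depends only on each file's time range, it does not depend on the
order in which compaction produced the files.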

> Archive store files older than max age
> --------------------------------------
>
>                 Key: HBASE-15454
>                 URL: https://issues.apache.org/jira/browse/HBASE-15454
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>    Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0, 1.3.0, 0.98.19, 1.4.0
>
>         Attachments: HBASE-15454-v1.patch, HBASE-15454.patch
>
>
> Sometimes old data is rarely touched but we cannot remove it. So archive it into several
> big files (by year or something) and use EC to reduce the redundancy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
