Date: Tue, 12 Apr 2016 05:03:25 +0000 (UTC)
From: "Clara Xiong (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-15454) Archive store files older than max age


    [ https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236596#comment-15236596 ]

Clara Xiong commented on HBASE-15454:
-------------------------------------

To be specific, it seems very inefficient to need one routine that slices the data along the exponential windows for minor/major compaction and another, concurrent routine that slices the data along the calendar windows for archiving. A user should need only one of the two layouts, not both. Either layout satisfies time-range scan efficiency and archive/TTL efficiency.
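As a rough illustration of why either layout can serve archiving, here is a minimal sketch of picking archive candidates purely from each file's time range, regardless of which window scheme produced the file. The names `ArchiveSelector`, `FileInfo`, and `selectForArchive` are hypothetical and are not taken from HBase or from the attached patches; the cutoff comparison is assumed to be the same kind of check a TTL expiry test would make.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ArchiveSelector {
    /** Hypothetical stand-in for a store file; only its time range matters here. */
    static final class FileInfo {
        final String name;
        final long minTimestamp;
        final long maxTimestamp;

        FileInfo(String name, long minTimestamp, long maxTimestamp) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /**
     * Returns the files whose newest cell is older than (now - maxAgeMillis).
     * The window layout (exponential or calendar) is never consulted:
     * only the per-file maximum timestamp is compared against the cutoff.
     */
    static List<FileInfo> selectForArchive(List<FileInfo> files, long now, long maxAgeMillis) {
        long cutoff = now - maxAgeMillis;
        List<FileInfo> candidates = new ArrayList<>();
        for (FileInfo f : files) {
            if (f.maxTimestamp < cutoff) {
                candidates.add(f);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<FileInfo> files = Arrays.asList(
            new FileInfo("a", 0L, 100L),
            new FileInfo("b", 200L, 500L),
            new FileInfo("c", 600L, 900L));
        // With now = 1000 and maxAge = 400, the cutoff is 600.
        for (FileInfo f : selectForArchive(files, 1000L, 400L)) {
            System.out.println(f.name);
        }
    }
}
```

A file that straddles the cutoff is left alone in this sketch; whether such a file should be split or simply wait until it ages out is a policy choice the sketch does not settle.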
Letting the user choose a single layout is the same idea as Dave's pluggable window algorithm. Please also add the EC manager code and make it work with both types of windows. As for your question about the order of archiving differing from that of compaction: the EC logic should scan the store files' time ranges to pick the files to archive. It can share the TTL logic.

> Archive store files older than max age
> --------------------------------------
>
>                 Key: HBASE-15454
>                 URL: https://issues.apache.org/jira/browse/HBASE-15454
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>    Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0, 1.3.0, 0.98.19, 1.4.0
>
>         Attachments: HBASE-15454-v1.patch, HBASE-15454.patch
>
>
> Sometimes the old data is rarely touched but we cannot remove it. So archive it to several big files (by year or something) and use EC to reduce the redundancy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)