Date: Thu, 28 Jan 2016 18:53:40 +0000 (UTC)
From: "Clara Xiong (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction

     [ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Description:
This is a simple implementation of date-based tiered compaction, similar to Cassandra's, with the following benefits:
1. Improve date-range-based scans by structuring store files in a date-based tiered layout.
2. Reduce compaction overhead.
3. Improve TTL efficiency.

It is a good fit for use cases that:
1. have mostly date-based data writes and scans, with a focus on the most recent data.
2. never or rarely delete data.

Out-of-order writes are handled gracefully, so the data still gets to the right store file for time-range scans. Re-compaction with existing store files in the same time window is handled by ExploringCompactionPolicy.

Time range overlapping among store files is tolerated and the performance impact is minimized.

Configuration can be set in hbase-site.xml or overridden at the per-table or per-column-family level through the hbase shell.

Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing

  was:
This is a simple implementation of date-based tiered compaction, similar to Cassandra's, with the following benefits:
1. Improve date-range-based scans by structuring store files in a date-based tiered layout.
2. Reduce compaction overhead.
3. Improve TTL efficiency.

It is a good fit for use cases that:
1. have mostly date-based data writes and scans, with a focus on the most recent data.
2. never or rarely delete data.

Out-of-order writes are handled gracefully, so the data still gets to the right store file for time-range scans. Re-compaction with existing store files in the same time window is handled by ExploringCompactionPolicy.

Time range overlapping among store files is tolerated and the performance impact is minimized.


> A simple implementation of date based tiered compaction
> -------------------------------------------------------
>
>                 Key: HBASE-15181
>                 URL: https://issues.apache.org/jira/browse/HBASE-15181
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15181-v1.patch
>
>
> This is a simple implementation of date-based tiered compaction, similar to Cassandra's, with the following benefits:
> 1. Improve date-range-based scans by structuring store files in a date-based tiered layout.
> 2. Reduce compaction overhead.
> 3. Improve TTL efficiency.
> It is a good fit for use cases that:
> 1. have mostly date-based data writes and scans, with a focus on the most recent data.
> 2. never or rarely delete data.
> Out-of-order writes are handled gracefully, so the data still gets to the right store file for time-range scans. Re-compaction with existing store files in the same time window is handled by ExploringCompactionPolicy.
> Time range overlapping among store files is tolerated and the performance impact is minimized.
> Configuration can be set in hbase-site.xml or overridden at the per-table or per-column-family level through the hbase shell.
> Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
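
To illustrate the "date-based tiered layout" mentioned in the description, here is a minimal Java sketch of how time windows might grow from small, recent windows to larger, older ones. It is not the code from HBASE-15181-v1.patch. The class name `TieredWindowSketch`, the methods `windows` and `windowFor`, and the parameters `baseWindowMillis` and `windowsPerTier` are all hypothetical names chosen for illustration, and the scheme assumes non-negative millisecond timestamps.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of tiered time windows; not taken from the HBase patch.
class TieredWindowSketch {

    /** Half-open time window [start, end) in milliseconds. */
    static final class Window {
        final long start;
        final long end;

        Window(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /**
     * Lists windows from newest to oldest, beginning with the base-size window
     * that holds `now` and stopping once a window reaches back to `oldestTs`.
     * Each older tier uses windows that are `windowsPerTier` times larger.
     */
    static List<Window> windows(long now, long baseWindowMillis,
                                int windowsPerTier, long oldestTs) {
        if (now < 0 || oldestTs < 0 || baseWindowMillis <= 0 || windowsPerTier < 1) {
            throw new IllegalArgumentException("arguments out of range");
        }
        List<Window> result = new ArrayList<>();
        long size = baseWindowMillis;
        long pos = now / size; // index of the current window at this size
        while (true) {
            Window w = new Window(pos * size, (pos + 1) * size);
            result.add(w);
            if (w.start <= oldestTs) {
                break; // this window already covers the oldest timestamp
            }
            if (pos % windowsPerTier > 0) {
                pos = pos - 1;                  // step back within the same tier
            } else {
                pos = pos / windowsPerTier - 1; // tier is full: move to larger windows
                size = size * windowsPerTier;
            }
        }
        return result;
    }

    /**
     * Returns the first (newest) window whose start is at or before `ts`.
     * Timestamps newer than the current window therefore land in the newest
     * window; returns null if `ts` is older than every window in the list.
     */
    static Window windowFor(List<Window> newestFirst, long ts) {
        for (Window w : newestFirst) {
            if (ts >= w.start) {
                return w;
            }
        }
        return null;
    }
}
```

Under these assumptions, a late-arriving cell with an old timestamp maps to an older, larger window rather than the newest one, which is one way to picture how out-of-order writes can still end up alongside data from the same time range.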