Date: Mon, 1 Feb 2016 19:45:39 +0000 (UTC)
From: "Vladimir Rodionov (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-15181) A simple implementation of date based tiered compaction

    [ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126893#comment-15126893 ]

Vladimir Rodionov commented on HBASE-15181:
-------------------------------------------

Thanks for the patch, [~claraxiong]

You still rely on default major compaction for bulk loaded files. Since periodic major compactions are disabled, the only way to compact bulk loaded files is to force a major compaction manually.
For many applications the new compaction policy won't give much benefit: they do periodic batch loads and will have to run a major compaction afterwards on a daily basis. Have you thought about that?

> A simple implementation of date based tiered compaction
> -------------------------------------------------------
>
>                 Key: HBASE-15181
>                 URL: https://issues.apache.org/jira/browse/HBASE-15181
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15181-v1.patch, HBASE-15181-v2.patch
>
>
> This is a simple implementation of date-based tiered compaction, similar to Cassandra's, with the following benefits:
> 1. Improve date-range-based scans by structuring store files in a date-based tiered layout.
> 2. Reduce compaction overhead.
> 3. Improve TTL efficiency.
> It is a good fit for use cases that:
> 1. have mostly date-based data writes and scans, with a focus on the most recent data.
> 2. never or rarely delete data.
> Out-of-order writes are handled gracefully, so the data will still get to the right store file for time-range scans, and re-compaction with existing store files in the same time window is handled by ExploringCompactionPolicy.
> Time range overlap among store files is tolerated and the performance impact is minimized.
> Configuration can be set at hbase-site or overridden at the per-table or per-column-family level by hbase shell.
> Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)