From: "Todd Lipcon (JIRA)"
To: issues@hbase.apache.org
Date: Wed, 6 Apr 2011 20:58:06 +0000 (UTC)
Message-ID: <1181758226.38724.1302123486127.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Created] (HBASE-3745) Add the ability to restrict major-compactible files by timestamp

Add the ability to restrict major-compactible files by timestamp
----------------------------------------------------------------

Key: HBASE-3745
URL: https://issues.apache.org/jira/browse/HBASE-3745
Project: HBase
Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Todd Lipcon

In some applications, a common access pattern is to frequently scan tables with a time range predicate restricted to a fairly recent time window. For example, you may want to do an incremental aggregation or indexing step only on rows that have changed in the last hour. We do this efficiently by tracking the min and max timestamp at the HFile level, so that old HFiles don't have to be read.

After a major compaction, however, all of the data sits in a single HFile whose timestamp range spans the whole dataset, so the entire dataset will need to be read, which can hurt performance of this access pattern.

We should add a column family attribute that can specify a policy like: "When major compacting, never include an HFile that contains data with a timestamp in the last 4 hours." This way, recently flushed HFiles will always remain uncompacted and provide the good scan performance required for these applications.
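
To make the proposed policy concrete, here is a minimal sketch of the file-selection step. It is not HBase code: FileInfo, MajorCompactionFilter, and eligibleForMajorCompaction are hypothetical names invented for illustration, and the sketch assumes each file exposes the max timestamp that is already tracked per HFile. A file is skipped whenever its newest timestamp falls inside the exclusion window.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for per-HFile metadata; not an actual HBase class. */
final class FileInfo {
    final String name;
    final long maxTimestampMs; // newest cell timestamp tracked for this file

    FileInfo(String name, long maxTimestampMs) {
        this.name = name;
        this.maxTimestampMs = maxTimestampMs;
    }
}

/** Hypothetical helper illustrating the policy described in this issue. */
final class MajorCompactionFilter {
    /**
     * Returns the files that may take part in a major compaction: those whose
     * newest timestamp is strictly older than (nowMs - excludeWindowMs).
     * Files holding any data inside the window are left out.
     */
    static List<FileInfo> eligibleForMajorCompaction(
            List<FileInfo> files, long nowMs, long excludeWindowMs) {
        long cutoff = nowMs - excludeWindowMs;
        List<FileInfo> eligible = new ArrayList<>();
        for (FileInfo f : files) {
            if (f.maxTimestampMs < cutoff) {
                eligible.add(f);
            }
        }
        return eligible;
    }
}
```

With a 4 hour window, a file whose newest cell is 5 hours old would be selected, while one flushed an hour ago would stay out of the compaction and keep its narrow timestamp range for recent-window scans.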