Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id D3B6F200AED for ; Wed, 4 May 2016 00:05:33 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id D23B31609F6; Wed, 4 May 2016 00:05:33 +0200 (CEST) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id E923E1609F5 for ; Wed, 4 May 2016 00:05:32 +0200 (CEST) Received: (qmail 35012 invoked by uid 500); 3 May 2016 22:05:32 -0000 Mailing-List: contact commits-help@hbase.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hbase.apache.org Delivered-To: mailing list commits@hbase.apache.org Received: (qmail 35003 invoked by uid 99); 3 May 2016 22:05:32 -0000 Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 May 2016 22:05:32 +0000 Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id DCB29DFD9F; Tue, 3 May 2016 22:05:31 +0000 (UTC) Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: enis@apache.org To: commits@hbase.apache.org Message-Id: <3bdd54d2029e4576aafac5b4c32fa63c@git.apache.org> X-Mailer: ASF-Git Admin Mailer Subject: hbase git commit: HBASE-15337 Document Date Tiered Compaction in the book Date: Tue, 3 May 2016 22:05:31 +0000 (UTC) archived-at: Tue, 03 May 2016 22:05:34 -0000 Repository: hbase Updated Branches: refs/heads/master f76ffb7f3 -> 66213c9f2 HBASE-15337 Document Date Tiered Compaction in the book Signed-off-by: Enis Soztutar Project: http://git-wip-us.apache.org/repos/asf/hbase/repo Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/66213c9f Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/66213c9f Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/66213c9f Branch: refs/heads/master Commit: 66213c9f28643cfc2ec53d3f683c65a7598b8ace Parents: f76ffb7 Author: Clara Authored: Thu Apr 28 23:31:25 2016 -0700 Committer: Enis Soztutar Committed: Tue May 3 15:05:00 2016 -0700 ---------------------------------------------------------------------- src/main/asciidoc/_chapters/architecture.adoc | 101 +++++++++++++++++++++ 1 file changed, 101 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/hbase/blob/66213c9f/src/main/asciidoc/_chapters/architecture.adoc ---------------------------------------------------------------------- diff --git a/src/main/asciidoc/_chapters/architecture.adoc b/src/main/asciidoc/_chapters/architecture.adoc index 7cc20e5..faa1230 100644 --- a/src/main/asciidoc/_chapters/architecture.adoc +++ b/src/main/asciidoc/_chapters/architecture.adoc @@ -2060,6 +2060,107 @@ Why? NOTE: This information is now included in the configuration parameter table in <>. +[[ops.date.tiered]] +===== Date Tiered Compaction + +Date tiered compaction is a date-aware store file compaction strategy that is beneficial for time-range scans for time-series data. + +[[ops.date.tiered.when]] +===== When To Use Date Tiered Compactions + +Consider using Date Tiered Compaction for reads for limited time ranges, especially scans of recent data + +Don't use it for + +* random gets without a limited time range +* frequent deletes and updates +* Frequent out of order data writes creating long tails, especially writes with future timestamps +* frequent bulk loads with heavily overlapping time ranges + +.Performance Improvements +Performance testing has shown that the performance of time-range scans improve greatly for limited time ranges, especially scans of recent data. + +[[ops.date.tiered.enable]] +====== Enabling Date Tiered Compaction + +You can enable Date Tiered compaction for a table or a column family, by setting its `hbase.hstore.engine.class` to `org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine`. + +You also need to set `hbase.hstore.blockingStoreFiles` to a high number, such as 60, if using all default settings, rather than the default value of 12). Use 1.5~2 x projected file count if changing the parameters, Projected file count = windows per tier x tier count + incoming window min + files older than max age + +You also need to set `hbase.hstore.compaction.max` to the same value as `hbase.hstore.blockingStoreFiles` to unblock major compaction. + +.Procedure: Enable Date Tiered Compaction +. Run one of following commands in the HBase shell. + Replace the table name `orders_table` with the name of your table. ++ +[source,sql] +---- +alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine', 'hbase.hstore.blockingStoreFiles' => '60', 'hbase.hstore.compaction.min'=>'2', 'hbase.hstore.compaction.max'=>'60'} +alter 'orders_table', {NAME => 'blobs_cf', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine', 'hbase.hstore.blockingStoreFiles' => '60', 'hbase.hstore.compaction.min'=>'2', 'hbase.hstore.compaction.max'=>'60'}} +create 'orders_table', 'blobs_cf', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine', 'hbase.hstore.blockingStoreFiles' => '60', 'hbase.hstore.compaction.min'=>'2', 'hbase.hstore.compaction.max'=>'60'} +---- + +. Configure other options if needed. + See <> for more information. + +.Procedure: Disable Date Tiered Compaction +. Set the `hbase.hstore.engine.class` option to either nil or `org.apache.hadoop.hbase.regionserver.DefaultStoreEngine`. + Either option has the same effect. + Make sure you set the other options you changed to the original settings too. ++ +[source,sql] +---- +alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.DefaultStoreEngine', 'hbase.hstore.blockingStoreFiles' => '12', 'hbase.hstore.compaction.min'=>'6', 'hbase.hstore.compaction.max'=>'12'}} +---- + +When you change the store engine either way, a major compaction will likely be performed on most regions. +This is not necessary on new tables. + +[[ops.date.tiered.config]] +====== Configuring Date Tiered Compaction + +Each of the settings for date tiered compaction should be configured at the table or column family, after disabling the table. +If you use HBase shell, the general command pattern is as follows: + +[source,sql] +---- +alter 'orders_table', CONFIGURATION => {'key' => 'value', ..., 'key' => 'value'}} +---- + +[[ops.date.tiered.config.parameters]] +.Tier Parameters + +You can configure your date tiers by changing the settings for the following parameters: + +.Date Tier Parameters +[cols="1,1a", frame="all", options="header"] +|=== +| Setting +| Notes + +|`hbase.hstore.compaction.date.tiered.max.storefile.age.millis` +|Files with max-timestamp smaller than this will no longer be compacted.Default at Long.MAX_VALUE. + +| `hbase.hstore.compaction.date.tiered.base.window.millis` +| Base window size in milliseconds. Default at 6 hours. + +| `hbase.hstore.compaction.date.tiered.windows.per.tier` +| Number of windows per tier. Default at 4. + +| `hbase.hstore.compaction.date.tiered.incoming.window.min` +| Minimal number of files to compact in the incoming window. Set it to expected number of files in the window to avoid wasteful compaction. Default at 6. + +| `hbase.hstore.compaction.date.tiered.window.policy.class` +| The policy to select store files within the same time window. It doesn’t apply to the incoming window. Default at exploring compaction. This is to avoid wasteful compaction. +|=== + +[[ops.date.tiered.config.compaction.throttler]] +.Compaction Throttler + +With tiered compaction all servers in the cluster will promote windows to higher tier at the same time, so using a compaction throttle is recommended: +Set `hbase.regionserver.throughput.controller` to `org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController`. + +NOTE: For more information about date tiered compaction, please refer to the design specification at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8 [[ops.stripe]] ===== Experimental: Stripe Compactions