Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id AB1BD17EE9 for ; Wed, 9 Sep 2015 00:14:46 +0000 (UTC)
Received: (qmail 57422 invoked by uid 500); 9 Sep 2015 00:14:46 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 57332 invoked by uid 500); 9 Sep 2015 00:14:46 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 57154 invoked by uid 99); 9 Sep 2015 00:14:46 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 09 Sep 2015 00:14:46 +0000
Date: Wed, 9 Sep 2015 00:14:46 +0000 (UTC)
From: "Vladimir Rodionov (JIRA)"
To: issues@hbase.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Updated] (HBASE-14383) Compaction improvements
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

[ https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladimir Rodionov updated HBASE-14383:
--------------------------------------

Description:

Compactions are still a major issue in many production environments. The general recommendation is to disable region splitting and major compactions to reduce unpredictable IO/CPU spikes, especially during peak times, and to run them manually during off-peak times. This still does not resolve the issues completely.

h3. Flush storms

* rolling WAL events across a cluster can be highly correlated, hence flushing memstores, hence triggering minor compactions that can be promoted to major ones.
These events are highly correlated in time if there is a balanced write load on the regions in a table.
* the same is true for memstore flushes triggered by the periodic memstore flusher.

Both of the above may produce *flush storms*, which are as bad as *compaction storms*.

What can be done here? We can spread these events over time by randomizing (with jitter) several config options:
# hbase.regionserver.optionalcacheflushinterval
# hbase.regionserver.flush.per.changes
# hbase.regionserver.maxlogs

h3. ExploringCompactionPolicy max compaction size

One more optimization can be added to ExploringCompactionPolicy. To limit the size of a compaction, there is a config parameter one could use: hbase.hstore.compaction.max.size. It would be nice to have two separate limits: one for peak and one for off-peak hours.

h3. ExploringCompactionPolicy selection evaluation algorithm

The current algorithm just seems too simple: the selection with more files always wins; if the number of files is the same, the selection of smaller total size wins.
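The jitter idea above can be sketched as follows. This is a minimal illustration, not HBase code: the class, method, and parameter names are hypothetical, and only the comment mirrors one of the config options listed.

```java
import java.util.concurrent.ThreadLocalRandom;

public class FlushJitter {
    // Spread scheduled flushes over time by randomizing the configured
    // interval. baseIntervalMs would come from a setting such as
    // hbase.regionserver.optionalcacheflushinterval; jitterFraction is the
    // maximum relative deviation (e.g. 0.2 means +/-20%).
    static long jitteredInterval(long baseIntervalMs, double jitterFraction) {
        // Uniform random value in [-jitterFraction, +jitterFraction)
        double delta = (ThreadLocalRandom.current().nextDouble() * 2 - 1) * jitterFraction;
        return Math.round(baseIntervalMs * (1 + delta));
    }

    public static void main(String[] args) {
        long base = 3_600_000L; // e.g. a one-hour flush interval
        // Each region (or region server) computes its own jittered interval,
        // so flushes no longer fire in lockstep across the cluster.
        for (int i = 0; i < 3; i++) {
            System.out.println(jitteredInterval(base, 0.2));
        }
    }
}
```

With 20% jitter, flushes that would otherwise fire simultaneously are spread over a 24-minute window around the one-hour mark, which also decorrelates the minor compactions they trigger.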
> Compaction improvements
> -----------------------
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
> Issue Type: Improvement
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
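The two-limit proposal in the description could look like the following sketch. The class and field names are hypothetical (this is not an existing HBase API); it only illustrates choosing between a peak and an off-peak cap by hour of day.

```java
import java.time.LocalTime;

public class CompactionSizeLimit {
    // Hypothetical split of hbase.hstore.compaction.max.size into two
    // limits: a tighter cap during peak hours, a looser one off peak.
    final long peakMaxSizeBytes;
    final long offPeakMaxSizeBytes;
    final int peakStartHour;
    final int peakEndHour; // exclusive

    CompactionSizeLimit(long peakMaxSizeBytes, long offPeakMaxSizeBytes,
                        int peakStartHour, int peakEndHour) {
        this.peakMaxSizeBytes = peakMaxSizeBytes;
        this.offPeakMaxSizeBytes = offPeakMaxSizeBytes;
        this.peakStartHour = peakStartHour;
        this.peakEndHour = peakEndHour;
    }

    boolean isPeak(int hour) {
        // Handles peak windows that wrap past midnight, e.g. 22..6.
        return peakStartHour <= peakEndHour
                ? hour >= peakStartHour && hour < peakEndHour
                : hour >= peakStartHour || hour < peakEndHour;
    }

    long maxCompactionSize(int hour) {
        return isPeak(hour) ? peakMaxSizeBytes : offPeakMaxSizeBytes;
    }

    public static void main(String[] args) {
        // 512 MB cap during 08:00-20:00, 4 GB cap otherwise.
        CompactionSizeLimit limit =
                new CompactionSizeLimit(512L << 20, 4L << 30, 8, 20);
        System.out.println(limit.maxCompactionSize(LocalTime.now().getHour()));
    }
}
```

The compaction policy would then consult maxCompactionSize() with the current hour instead of reading a single static limit, allowing large compactions to proceed only when the cluster is lightly loaded.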
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
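For reference, the selection evaluation rule the description calls too simple ("more files always wins; smaller size breaks ties") can be written down as a small comparison sketch. The names are hypothetical; this is not the actual ExploringCompactionPolicy code, only the rule as stated.

```java
import java.util.List;

public class SelectionCompare {
    // The rule as described: a candidate selection with more files always
    // wins; if file counts are equal, the smaller total size wins.
    // Each list holds the sizes (in bytes) of the store files in a
    // candidate selection. Returns true if candidate a beats candidate b.
    static boolean isBetter(List<Long> a, List<Long> b) {
        if (a.size() != b.size()) {
            return a.size() > b.size();
        }
        long sizeA = a.stream().mapToLong(Long::longValue).sum();
        long sizeB = b.stream().mapToLong(Long::longValue).sum();
        return sizeA < sizeB;
    }

    public static void main(String[] args) {
        // Three tiny files beat two huge ones under this rule, even though
        // compacting the two large files might reduce IO amplification more.
        System.out.println(isBetter(List.of(1L, 2L, 3L), List.of(10_000L, 20_000L)));
    }
}
```

Writing the rule out makes the criticism concrete: file count dominates unconditionally, so total IO cost, file ages, and peak/off-peak state never influence which selection wins.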