Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Sun, 26 Nov 2017 08:47:00 +0000 (UTC)
From: "Eshcar Hillel (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.13083414.1498730051000.319212.1511686020280@Atlassian.JIRA>
In-Reply-To: <JIRA.13083414.1498730051000@Atlassian.JIRA>
References: <JIRA.13083414.1498730051000@Atlassian.JIRA> <JIRA.13083414.1498730051371@jira-lw-us.apache.org>
Subject: [jira] [Commented] (HBASE-18294) Reduce global heap pressure: flush
 based on heap occupancy
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Sun, 26 Nov 2017 08:47:05 -0000


    [ https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265945#comment-16265945 ] 

Eshcar Hillel commented on HBASE-18294:
---------------------------------------

Ram and Anoop, the reason we see so much global heap pressure is that the regions themselves are not conservative enough to make flush decisions early on. *Changing default values is not a way to fix this inherent problem*:
(1) Reducing the threshold may solve the problem for some settings but will not solve it for others. For example, in the same experiment, if we set the threshold to 64MB but have twice as many regions, we will see the same effect.
(2) There are arguments for reducing the memstore size, such as reducing GC, but there are also arguments for increasing it to reduce the number of flushes, the number of compactions, and write amplification.
(3) In addition, even if we change the default values, the system should have optimal performance with the values set by the admin, which can be any number.
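
To illustrate point (1), here is a minimal sketch of the arithmetic. The region counts are hypothetical and are not the numbers from the experiment; {{worstCaseBytes}} is an illustrative helper, not an HBase method. It only shows that halving the per-region threshold while doubling the number of regions leaves the worst-case aggregate unchanged.

```java
public class GlobalPressureSketch {

    /**
     * Worst-case aggregate memstore data if every region fills up to its
     * flush threshold before flushing. Illustrative helper, not HBase code.
     */
    static long worstCaseBytes(int regions, long perRegionThresholdBytes) {
        return regions * perRegionThresholdBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;

        // Hypothetical setups: 100 regions at 128MB vs. 200 regions at 64MB.
        long defaultThreshold = worstCaseBytes(100, 128 * mb);
        long halvedThreshold = worstCaseBytes(200, 64 * mb);

        System.out.println("100 regions x 128MB = " + defaultThreshold / mb + "MB");
        System.out.println("200 regions x  64MB = " + halvedThreshold / mb + "MB");
        System.out.println("same aggregate: " + (defaultThreshold == halvedThreshold));
    }
}
```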

The core changes in this patch focus on the mechanism and decision making for region-level flushes, namely evaluating the total heap size instead of the data size only.
The changes at the region server accounting level are mainly cosmetic changes, to make on-heap and off-heap symmetric (why should we ignore the CCM index when it is allocated off-heap, even if it is small, if we can count it the same way we count the CAM index for on-heap?).
And I think the changes are not that dramatic: about 20 lines of code in {{RegionServerAccounting}}, and they do not complicate things much.
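
The difference between the two region-level checks can be sketched as follows. This is only an illustration under stated assumptions: {{RegionMemory}}, {{shouldFlushByDataSize}} and {{shouldFlushByHeapSize}} are hypothetical names, not the classes or methods touched by the patch, and the sizes are made up.

```java
public class FlushDecisionSketch {

    /** Hypothetical per-region accounting; not an HBase class. */
    static final class RegionMemory {
        final long dataBytes;     // key-value payload only
        final long overheadBytes; // index and per-cell metadata

        RegionMemory(long dataBytes, long overheadBytes) {
            this.dataBytes = dataBytes;
            this.overheadBytes = overheadBytes;
        }

        /** Total heap occupancy: data plus index and metadata. */
        long heapBytes() {
            return dataBytes + overheadBytes;
        }
    }

    /** Check that looks at data size only. */
    static boolean shouldFlushByDataSize(RegionMemory m, long thresholdBytes) {
        return m.dataBytes > thresholdBytes;
    }

    /** Check that looks at total heap size. */
    static boolean shouldFlushByHeapSize(RegionMemory m, long thresholdBytes) {
        return m.heapBytes() > thresholdBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long threshold = 128 * mb;

        // Made-up region: 100MB of cell data plus 40MB of index/metadata.
        RegionMemory region = new RegionMemory(100 * mb, 40 * mb);

        System.out.println("flush by data size: " + shouldFlushByDataSize(region, threshold));
        System.out.println("flush by heap size: " + shouldFlushByHeapSize(region, threshold));
    }
}
```

With these made-up numbers the region occupies 140MB of heap but holds only 100MB of data, so the data-size check does not trigger a flush while the heap-size check does.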

Can we, in parallel to the discussion here, continue with concrete comments on the code in RB so we can converge towards a commit?
Thanks

> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>
>                 Key: HBASE-18294
>                 URL: https://issues.apache.org/jira/browse/HBASE-18294
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>         Attachments: HBASE-18294.01.patch, HBASE-18294.02.patch, HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, HBASE-18294.06.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the store to another threshold (that can be configured with hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size (key-value only) to the threshold, whereas it should compare the heap size (which includes index size and metadata).

