Date: Sun, 26 Nov 2017 08:47:00 +0000 (UTC)
From: "Eshcar Hillel (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-18294) Reduce global heap pressure: flush based on heap occupancy

    [ https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265945#comment-16265945 ]

Eshcar Hillel commented on HBASE-18294:
---------------------------------------

Ram and Anoop, the reason we see so much global heap pressure is that the regions themselves are not conservative enough to make flush decisions early on. *Changing default values is not a way to fix this inherent problem*:
(1) Reducing the threshold may solve the problem for some settings but will not solve it for others. For example, in the same experiment, if we set the threshold to 64MB but have twice as many regions, we will see the same effect.
(2) There are arguments for reducing the memstore size, such as reducing GC, but there are also arguments for increasing it: fewer flushes, fewer compactions, and less write amplification.
(3) In addition, even if we change the default values, the system should still have optimal performance with the values set by the admin, which can be any number.

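To make the arithmetic behind point (1) concrete, here is a minimal sketch. It assumes, for simplicity, that every region fills up to its flush threshold before it flushes. The class and method names are illustrative only and are not HBase APIs, and the region counts used with it are hypothetical, not the ones from the experiment.

```java
/** Illustrative only: hypothetical helper, not part of HBase. */
final class FlushThresholdArithmetic {
    /** One megabyte in bytes. */
    static final long MB = 1024L * 1024L;

    private FlushThresholdArithmetic() {
    }

    /**
     * Worst-case amount of memstore data held across all regions, assuming
     * each region accumulates data right up to its flush threshold.
     */
    static long worstCaseAggregateBytes(int regionCount, long flushThresholdBytes) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping.
        return Math.multiplyExact((long) regionCount, flushThresholdBytes);
    }
}
```

Under this assumption, a hypothetical 100 regions at 128MB and 200 regions at 64MB give the same worst-case aggregate, which is why lowering the default alone does not bound the global pressure.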
The core changes in this patch focus on the mechanism and decision making for region-level flushes, namely evaluating total heap size instead of data size only. The changes at the region server accounting level are mainly cosmetic, to make on-heap and off-heap accounting symmetric (why should we ignore the CCM index when it is allocated off-heap, even if it is small, if we can count it the same way we count the CAM index for on-heap?). I also think the changes are not that dramatic: about 20 lines of code in {{RegionServerAccounting}}, and they do not complicate things much.

Can we, in parallel to the discussion here, continue with concrete comments on the code in RB so we can converge towards a commit?

Thanks

> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>
>                 Key: HBASE-18294
>                 URL: https://issues.apache.org/jira/browse/HBASE-18294
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>         Attachments: HBASE-18294.01.patch, HBASE-18294.02.patch, HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, HBASE-18294.06.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the store to another threshold (that can be configured with hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size (key-value only) to the threshold, whereas it should compare the heap size (which includes index size and metadata).
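As a rough illustration of the difference described in the issue, the sketch below contrasts a flush check based on data size with one based on heap size. All class, method, and field names are hypothetical; this is not the HBase API or the code in the attached patches. It also assumes the on-heap case, where the heap footprint is the key-value data plus index and metadata overhead.

```java
/** Illustrative only: hypothetical names, not the HBase API or the HBASE-18294 patch. */
final class FlushDecisionSketch {
    private FlushDecisionSketch() {
    }

    /** Sizes tracked for one memory component, in bytes. */
    static final class MemStoreSizes {
        final long dataBytes;     // key-value data only
        final long overheadBytes; // index and metadata kept on the heap

        MemStoreSizes(long dataBytes, long overheadBytes) {
            this.dataBytes = dataBytes;
            this.overheadBytes = overheadBytes;
        }

        /** Heap footprint under the on-heap assumption: data plus overhead. */
        long heapBytes() {
            return dataBytes + overheadBytes;
        }
    }

    /** Behavior described as current in the issue: compare key-value data only. */
    static boolean shouldFlushByDataSize(MemStoreSizes sizes, long thresholdBytes) {
        return sizes.dataBytes > thresholdBytes;
    }

    /** Behavior the issue asks for: compare the full heap footprint. */
    static boolean shouldFlushByHeapSize(MemStoreSizes sizes, long thresholdBytes) {
        return sizes.heapBytes() > thresholdBytes;
    }
}
```

For example, with the 128MB default threshold, a memory component holding 100MB of data and a made-up 40MB of index and metadata would not be flushed by the data-size check but would be flushed by the heap-size check.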