Date: Fri, 24 Feb 2017 09:25:44 +0000 (UTC)
From: "Anoop Sam John (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17338) Treat Cell data size under global memstore heap size only when that Cell can not be copied to MSLAB


    [ https://issues.apache.org/jira/browse/HBASE-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882303#comment-15882303 ]

Anoop Sam John commented on HBASE-17338:
----------------------------------------

Thanks for the reviews, Stack and Ram. I have to fix the failed tests, as they asserted on the heapOverhead size after a cell is added to the memstore. We now track heapSize, not overhead. I will add more comments as Stack asked.

Regarding Ram's comment, I have a different opinion. I don't think we need to change it this way. We track dataSize irrespective of whether the cell data is in the on-heap or the off-heap area. This dataSize is used at the Segment level for in-memory flush decisions, at the Region level for on-disk flushes, and globally to force-flush some regions. At the first two levels, there is no doubt that we have to track all the cell data size together. Ram's point is this: when off heap is configured and the max global off-heap size is, say, 12 GB, we force-flush some regions once the global data size reaches that level.
So his point is that, for this global tracking, we should consider only off-heap Cells; the data size of on-heap Cells should not be accounted in the data size but only in the heapSize. (That applies at the global level only. At the region and segment levels the on-heap cell data size still has to be counted.)

There are 2 reasons why I am not in favor of this:

1. It makes the implementation complex. We need to add isOffheap checks down the layers. Also, at two layers we have to count the on-heap cell data size and at one level we do not.
2. When off heap is enabled (the off-heap MSLAB pool is in place), we will end up with on-heap Cells when the pool runs out of BBs, because we then create on-demand LABs on heap. If we don't count those cells' data size at the global level, we may reach the forced-flush level a bit late. That is the only gain. But on-demand LAB creation is a sign that the write load is very high, and delaying the flush adds more pressure on the MSLAB, so more and more on-demand BBs (2 MB each) need to be created. One aim of the off-heap work is to reduce the max heap space the servers need. So let's keep counting the cell data size globally as well (as we do now) and trigger global flushes on it.

Even if MSLAB is used, the append/increment use cases won't use MSLAB, so those cells will be on heap. But for such a use case, enabling MSLAB (and the pool) is a total waste. That is a misconfiguration.

More and more on-demand BB creation when an MSLAB pool is in place is a bad sign. As of now we log a WARN for this just once. Maybe we should repeat this log at certain intervals (say, for every 100th pool miss or so). We should also be able to turn MSLAB usage ON/OFF per table. Is this possible now? I am not sure. These two things need to be checked and done in other JIRAs, IMO.
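
The idea of repeating the WARN at intervals could be sketched roughly as follows. This is only an illustration: the class name PoolMissWarner, the method recordMiss, and the interval parameter are hypothetical and not part of HBase, and the actual logging call is left to the caller.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not an HBase class): decides when a pool-miss
 * WARN should be repeated, instead of logging it only once.
 */
final class PoolMissWarner {
    // Assumed tunable: log on every interval-th miss (e.g. 100).
    private final long interval;
    private final AtomicLong misses = new AtomicLong();

    PoolMissWarner(long interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
    }

    /**
     * Records one pool miss (an on-demand chunk had to be created).
     * Returns true on the first miss and on every interval-th miss,
     * which is when the caller would emit the WARN.
     */
    boolean recordMiss() {
        long n = misses.incrementAndGet();
        return n == 1 || n % interval == 0;
    }

    /** Total number of pool misses recorded so far. */
    long missCount() {
        return misses.get();
    }
}
```

A caller would check the return value of recordMiss() at the point where an on-demand chunk is created and log only when it is true, so a sustained run of misses produces a periodic reminder rather than a single line or a flood.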
> Treat Cell data size under global memstore heap size only when that Cell can not be copied to MSLAB
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-17338
>                 URL: https://issues.apache.org/jira/browse/HBASE-17338
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>    Affects Versions: 2.0.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>         Attachments: HBASE-17338.patch, HBASE-17338_V2.patch, HBASE-17338_V2.patch, HBASE-17338_V4.patch
>
>
> Only data size and heap overhead are tracked globally. The off-heap memstore works with an off-heap backed MSLAB pool. But a cell, when added to the memstore, is not always copied to MSLAB. Append/Increment ops, which do an upsert, don't use MSLAB. Also, based on the Cell size, we sometimes skip the MSLAB copy. But currently we track the data size of these cells too under the global memstore data size, which indicates the off-heap size in the case of an off-heap memstore. For the global flush checks (against the lower/upper watermark levels), we check this size against the max off-heap memstore size. We do check heap overhead against the global heap memstore size (defaults to 40% of Xmx). But for such cells the data size should also be accounted under the heap overhead.
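
The accounting discussed in this issue can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the HBase implementation: the class GlobalMemStoreAccounting and all of its method and parameter names are hypothetical. It assumes that dataSize counts every cell's data regardless of where it lives, and that heapSize counts the per-cell heap overhead plus the cell data only when the cell was not copied into an off-heap MSLAB chunk.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch of global memstore accounting (names are
 * illustrative, not the HBase API).
 */
final class GlobalMemStoreAccounting {
    // Cell data bytes, counted whether the cell is on heap or off heap.
    private final AtomicLong dataSize = new AtomicLong();
    // Bytes assumed to occupy Java heap: overhead, plus data of on-heap cells.
    private final AtomicLong heapSize = new AtomicLong();

    /**
     * @param cellDataBytes        bytes of the cell's own data
     * @param heapOverheadBytes    assumed on-heap overhead for this cell
     * @param copiedToOffHeapMslab true if the data was copied to an off-heap chunk
     */
    void add(long cellDataBytes, long heapOverheadBytes, boolean copiedToOffHeapMslab) {
        dataSize.addAndGet(cellDataBytes);
        long onHeapBytes = heapOverheadBytes + (copiedToOffHeapMslab ? 0 : cellDataBytes);
        heapSize.addAndGet(onHeapBytes);
    }

    /** Global check against an assumed max data size limit. */
    boolean aboveDataLimit(long maxGlobalDataBytes) {
        return dataSize.get() >= maxGlobalDataBytes;
    }

    /** Global check against an assumed max heap size limit. */
    boolean aboveHeapLimit(long maxGlobalHeapBytes) {
        return heapSize.get() >= maxGlobalHeapBytes;
    }

    long dataSize() {
        return dataSize.get();
    }

    long heapSize() {
        return heapSize.get();
    }
}
```

With this shape, a cell that could not be copied to MSLAB (for example one written by an append/increment upsert) raises both counters, so either global limit can trigger a forced flush, and no isOffheap check is needed on the dataSize side.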