Date: Thu, 27 Oct 2016 15:51:59 +0000 (UTC)
From: "Anoop Sam John (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions

[ https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15612299#comment-15612299 ]

Anoop Sam John commented on HBASE-16417:
----------------------------------------

In-memory index merge helps in the scan case; I agree with your argument.
I am not saying we should never do it. Yes, let us experiment with different numbers, as in your current tests. My point was that flushing only the tail assumes that we merge eagerly, so that most of the memstore size sits in the tail of the pipeline. But when we experiment with a different number of segments for merge (say 4), we will end up with a much smaller segment in the tail, and so a smaller flush. Strictly speaking, flush-all is not directly related to merge. So my point is: let us have such a mechanism, so that we can better test the impact of merge alone. What happens now is that we avoid frequent merges (by raising the number of segments for merge) and so reduce the merge cost, but that also affects the flush size. If we can avoid that, we can test the merge cost much more in isolation.

bq. We just disagree on marking the opposite case (many duplicates) as special and going down a different code path there - because if we leave it to the admin as non-default we know what'll happen.

That is the data compaction path, right? That is config-driven and is already in. Are you saying that you will auto-tune that as well? That would be super cool if done. I am speaking about what is available as of today.

> In-Memory MemStore Policy for Flattening and Compactions
> --------------------------------------------------------
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
> Issue Type: Sub-task
> Reporter: Anastasia Braginsky
> Assignee: Anastasia Braginsky
> Fix For: 2.0.0
>
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
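
As a rough illustration of the flush-size point made in the comment above, the toy model below shows how the merge threshold changes the size of the tail segment at the moment a disk flush is triggered. It is only a sketch under stated assumptions and is not HBase code: the class `FlushTailModel`, the method `tailSizeAtFlush`, the fixed segment size of 32, the flush trigger of 128, and the rule "check the flush trigger before merging" are all hypothetical choices made for illustration. A merge is modelled as a pure index merge, so sizes simply add up and no duplicates are removed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: NOT the HBase implementation. All names and numbers are assumptions.
final class FlushTailModel {

    /**
     * Pushes fixed-size segments into a pipeline until the total size reaches
     * flushTrigger, then returns the size of the oldest (tail) segment.
     * Whenever the pipeline holds at least mergeThreshold segments, they are
     * merged into a single segment whose size is the sum of the inputs.
     */
    static long tailSizeAtFlush(long segmentSize, long flushTrigger, int mergeThreshold) {
        if (segmentSize <= 0) {
            throw new IllegalArgumentException("segmentSize must be positive");
        }
        Deque<Long> pipeline = new ArrayDeque<>(); // first = newest, last = oldest (tail)
        long total = 0;
        while (true) {
            pipeline.addFirst(segmentSize);
            total += segmentSize;
            if (total >= flushTrigger) {
                // Flush-tail would write only this segment to disk.
                return pipeline.peekLast();
            }
            if (pipeline.size() >= mergeThreshold) {
                long merged = 0;
                for (long size : pipeline) {
                    merged += size;
                }
                pipeline.clear();
                pipeline.addFirst(merged);
            }
        }
    }

    public static void main(String[] args) {
        // Eager merge (threshold 2) versus lazier merge (threshold 4).
        System.out.println("threshold 2 -> tail " + tailSizeAtFlush(32, 128, 2));
        System.out.println("threshold 4 -> tail " + tailSizeAtFlush(32, 128, 4));
    }
}
```

Under these assumptions the tail holds 96 of the 128 units when merging eagerly (threshold 2) but only 32 units with a threshold of 4, which is the smaller flush the comment describes. A flush-all policy would write all 128 units in both cases, so the merge threshold could then be varied without also changing the flush size.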