Date: Mon, 28 Nov 2016 10:56:59 +0000 (UTC)
From: "Anastasia Braginsky (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17081) Flush the entire CompactingMemStore content to disk

    [ https://issues.apache.org/jira/browse/HBASE-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15701629#comment-15701629 ]

Anastasia Braginsky commented on HBASE-17081:
---------------------------------------------

Thank you for your insights [~ram_krish]!

bq. What I found was that with only flushing the tail anything more than 6

Do you mean with merges? Merging every 6 segments in the pipeline and flushing the tail? It is reasonable that you got "too many store files" then. It should not happen with a composite snapshot. On average, every 4 in-memory flushes there needs to be one flush-to-disk. Thus, if THRESHOLD_PIPELINE_SEGMENTS is higher than 5, merges should be rare unless the entire system is under stress.

bq. The one thing that could be a problem is that when we have scans then we need to scan 10 segments

This JIRA is intended to provide a *mechanism of composite snapshot* without *optimizing THRESHOLD_PIPELINE_SEGMENTS*. Under HBASE-16417, Eshcar is running experiments with an infinite THRESHOLD_PIPELINE_SEGMENTS. We want to set THRESHOLD_PIPELINE_SEGMENTS to infinite here if it doesn't cause any performance degradation. Then, under HBASE-16417, we should come up with a truly optimal policy, which will play with all the parameters.

bq. What prompted you to ensure that flushing the entire pipeline is better than flushing only the tail as you were doing earlier? I think our concern was more on flushing the tail only will create a lot of small files mainly. Do you observe any other thing when flushing only the tail?

Initially, with flattening only, we had too many open files, as you saw yourself. When we introduced merge, you reported some GC problems due to too many small indexes floating around. Additionally, without a composite snapshot the CompactingMemStore is never fully cleared by a single flush-to-disk, unless its active segment has been empty since the previous flush-to-disk. Note that without a composite snapshot, upon a flush-to-disk request you push the active segment into the pipeline and flush only the pipeline's tail. So the active segment is not flushed unless it is empty. Thus, in order to flush the entire CompactingMemStore to disk you need multiple flushes, resulting in multiple files on disk, which is not desirable.
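To make the two behaviors concrete, here is a toy sketch in plain Java (illustrative class and method names only, not HBase's actual API) contrasting the old tail-only flush with the composite snapshot proposed in this JIRA:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of a compacting memstore: an active segment plus a pipeline
// of immutable segments. A "segment" is just a list of cell values here.
public class CompositeSnapshotSketch {
    private final Deque<List<String>> pipeline = new ArrayDeque<>(); // head = oldest segment (the "tail" to flush)
    private List<String> active = new ArrayList<>();

    public void add(String cell) { active.add(cell); }

    // In-memory flush: push the active segment into the pipeline.
    public void inMemoryFlush() {
        pipeline.addLast(active);
        active = new ArrayList<>();
    }

    // Old behavior: push active, then snapshot ONLY the oldest segment;
    // everything else stays in memory, so emptying the store takes many flushes.
    public List<String> tailOnlySnapshot() {
        inMemoryFlush();
        return pipeline.pollFirst();
    }

    // HBASE-17081 behavior: push active, then snapshot the WHOLE pipeline,
    // leaving the store truly empty after a single flush-to-disk.
    public List<String> compositeSnapshot() {
        inMemoryFlush();
        List<String> snapshot = new ArrayList<>();
        while (!pipeline.isEmpty()) {
            snapshot.addAll(pipeline.pollFirst());
        }
        return snapshot;
    }

    public int segmentsLeftInMemory() {
        return pipeline.size() + (active.isEmpty() ? 0 : 1);
    }
}
```

With three segments' worth of data, the tail-only variant leaves two segments behind in memory after a flush, while the composite variant leaves zero, which is exactly the point of this change.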
So indeed the idea of truly emptying the store upon flush-to-disk looks good to us.


> Flush the entire CompactingMemStore content to disk
> ---------------------------------------------------
>
>          Key: HBASE-17081
>          URL: https://issues.apache.org/jira/browse/HBASE-17081
>      Project: HBase
>   Issue Type: Sub-task
>     Reporter: Anastasia Braginsky
>     Assignee: Anastasia Braginsky
>  Attachments: HBASE-17081-V01.patch, HBASE-17081-V02.patch, HBASE-17081-V03.patch, Pipelinememstore_fortrunk_3.patch
>
>
> Part of CompactingMemStore's memory is held by an active segment, and another part is divided between immutable segments in the compacting pipeline. Upon flush-to-disk request we want to flush all of it to disk, in contrast to flushing only the tail of the compacting pipeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)