From: "Anoop Sam John (JIRA)"
To: issues@hbase.apache.org
Date: Mon, 4 Jul 2016 09:37:11 +0000 (UTC)
Subject: [jira] [Commented] (HBASE-16162) Compacting Memstore : unnecessary push of active segments to pipeline

    [ https://issues.apache.org/jira/browse/HBASE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15361092#comment-15361092 ]

Anoop Sam John commented on HBASE-16162:
----------------------------------------

Please refer to patch xxx_V4.patch.
That is the numbering scheme we usually follow :-) This patch is a bit different from what you pasted above. The inMemoryFlushInProgress CAS happens in shouldFlushInMemory() only. Within flushInMemory() the atomic boolean is just set to true (for the test-only path), and that is done first, not after the push to pipeline, because the variable's responsibility is not just compaction now. Also, the reset to false happens in a finally block, not in MemstoreCompactor, as you suggested.

> Compacting Memstore : unnecessary push of active segments to pipeline
> ---------------------------------------------------------------------
>
>                 Key: HBASE-16162
>                 URL: https://issues.apache.org/jira/browse/HBASE-16162
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Critical
>         Attachments: HBASE-16162.patch, HBASE-16162_V2.patch, HBASE-16162_V3.patch, HBASE-16162_V4.patch
>
>
> We have a flow like this:
> {code}
> protected void checkActiveSize() {
>   if (shouldFlushInMemory()) {
>     InMemoryFlushRunnable runnable = new InMemoryFlushRunnable();
>     getPool().execute(runnable);
>   }
> }
>
> private boolean shouldFlushInMemory() {
>   if (getActive().getSize() > inmemoryFlushSize) {
>     // size above flush threshold
>     return (allowCompaction.get() && !inMemoryFlushInProgress.get());
>   }
>   return false;
> }
>
> void flushInMemory() throws IOException {
>   // Phase I: Update the pipeline
>   getRegionServices().blockUpdates();
>   try {
>     MutableSegment active = getActive();
>     pushActiveToPipeline(active);
>   } finally {
>     getRegionServices().unblockUpdates();
>   }
>   // Phase II: Compact the pipeline
>   try {
>     if (allowCompaction.get() && inMemoryFlushInProgress.compareAndSet(false, true)) {
>       // setting the inMemoryFlushInProgress flag again for the case this method is invoked
>       // directly (only in tests) in the common path setting from true to true is idempotent
>       // Speculative compaction execution, may be interrupted if flush is forced while
>       // compaction is in progress
>       compactor.startCompaction();
>     }
> {code}
> So every write of a cell triggers checkActiveSize(). When we are at the border of an in-memory flush, many threads writing to this memstore can get checkActiveSize() to pass. Yes, the AtomicBoolean is still false at that point; it is turned ON only some time later, once the new thread has started running and pushed the active segment to the pipeline.
> In the new in-memory flush thread's code, we don't have any size check. It just takes the active segment and pushes it to the pipeline. Yes, we don't allow any new writes to the memstore at that time. But before that write lock on the region is taken, other handler threads might also have added entries to this thread pool. When the first flush finishes, it releases the lock on the region, and handler threads waiting to write to the memstore may get the lock and add some data. Now the second in-memory flush thread may get its chance, take the lock, grab the current (nearly empty) active segment, and flush that in memory! This pushes very small segments to the pipeline.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
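To make the proposed fix concrete, here is a minimal, hypothetical sketch of the scheme the comment describes: the compareAndSet on inMemoryFlushInProgress is done inside shouldFlushInMemory(), so only the first writer over the threshold can schedule an in-memory flush, and the flag is reset in a finally block. Class and method bodies are made up for illustration (the real HBase code hands the flush to a thread pool, blocks region updates, and runs the compactor); the sketch is single-threaded so its behavior is deterministic.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for CompactingMemstore illustrating the CAS gating.
public class FlushGateSketch {
    private final AtomicBoolean inMemoryFlushInProgress = new AtomicBoolean(false);
    private final long inmemoryFlushSize = 100; // flush threshold (bytes, illustrative)
    private long activeSize = 0;                // stand-in for active segment size
    public final AtomicInteger flushesStarted = new AtomicInteger(0);

    public void write(long cellSize) {
        activeSize += cellSize;
        if (shouldFlushInMemory()) {
            flushInMemory(); // HBase hands this to a thread pool instead
        }
    }

    private boolean shouldFlushInMemory() {
        if (activeSize > inmemoryFlushSize) {
            // CAS here: of all threads seeing the size over the threshold,
            // exactly one wins and schedules the in-memory flush.
            return inMemoryFlushInProgress.compareAndSet(false, true);
        }
        return false;
    }

    private void flushInMemory() {
        try {
            flushesStarted.incrementAndGet();
            activeSize = 0; // stand-in for pushing the active segment to the pipeline
        } finally {
            // reset in finally, per the comment, so a failure never wedges the gate
            inMemoryFlushInProgress.set(false);
        }
    }

    public static void main(String[] args) {
        FlushGateSketch m = new FlushGateSketch();
        for (int i = 0; i < 50; i++) {
            m.write(10); // crosses the 100-byte threshold every 11 writes
        }
        System.out.println(m.flushesStarted.get()); // prints: 4
    }
}
```

Because the losing threads get false back from the CAS, they simply continue their write; no second InMemoryFlushRunnable is queued, so the tiny-segment race described in the issue cannot occur.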