Date: Mon, 26 Oct 2015 15:27:27 +0000 (UTC)
From: "Eshcar Hillel (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction

    [ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974383#comment-14974383 ]

Eshcar Hillel commented on HBASE-13408:
---------------------------------------

Following a comment in the Jira, the compacted memstore configuration is now decoupled from the in-memory column family configuration setting; instead, it can be set at the region server level by setting the memstore class name attribute.
We are open to suggestions on how best to set the memstore for each region, and specifically to adding an additional column family attribute. In parallel, we should discuss the best way to push this branch into trunk after we've handled all major concerns that were raised so far.

> HBase In-Memory Memstore Compaction
> -----------------------------------
>
>                 Key: HBASE-13408
>                 URL: https://issues.apache.org/jira/browse/HBASE-13408
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>             Fix For: 2.0.0
>
>         Attachments: HBASE-13408-trunk-v01.patch, HBASE-13408-trunk-v02.patch, HBASE-13408-trunk-v03.patch, HBASE-13408-trunk-v04.patch, HBASE-13408-trunk-v05.patch, HBASE-13408-trunk-v06.patch, HBASE-13408-trunk-v07.patch, HBaseIn-MemoryMemstoreCompactionDesignDocument-ver02.pdf, HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf, InMemoryMemstoreCompactionEvaluationResults.pdf, InMemoryMemstoreCompactionMasterEvaluationResults.pdf, InMemoryMemstoreCompactionScansEvaluationResults.pdf, StoreSegmentandStoreSegmentScannerClassHierarchies.pdf
>
>
> A store unit holds a column family in a region, where the memstore is its in-memory component. The memstore absorbs all updates to the store; from time to time these updates are flushed to a file on disk, where they are compacted. Unlike disk components, the memstore is not compacted until it is written to the filesystem and optionally to block-cache. This may result in underutilization of the memory due to duplicate entries per row, for example, when hot data is continuously updated.
> Generally, the faster data accumulates in memory, the more flushes are triggered and the more frequently data sinks to disk, slowing down retrieval of even very recent data.
> In high-churn workloads, compacting the memstore can help maintain the data in memory, and thereby speed up data retrieval.
> We suggest a new compacted memstore with the following principles:
> 1. The data is kept in memory for as long as possible.
> 2. Memstore data is either compacted or in the process of being compacted.
> 3. Allow a panic mode, which may interrupt an in-progress compaction and force a flush of part of the memstore.
> We suggest applying this optimization only to in-memory column families.
> A design document is attached.
> This feature was previously discussed in HBASE-5311.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)