Date: Thu, 29 Jan 2015 08:50:34 +0000 (UTC)
From: "Jingcheng Du (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-11861) Native MOB Compaction mechanisms.

    [ https://issues.apache.org/jira/browse/HBASE-11861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296596#comment-14296596 ]

Jingcheng Du commented on HBASE-11861:
--------------------------------------

I have been thinking about the thread pool in the mob file compaction. As you know, we have to divide the mob files into batches if there are too many candidates in the mob file compaction (native compaction); the batch limit is 100 by default. If we have multiple threads doing the compaction in the chore, we have to reduce the batch limit, for instance to 10 (from 100). So the mob compaction becomes less efficient once the thread pool is used.
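The effect of the batch limit on the number of merged output files can be illustrated with a minimal sketch. It is not the code from the attached patches: the class name `MobBatchSketch`, the method `partition`, the file names, and the figure of 1000 candidate files are all hypothetical, and each batch is assumed to be merged into exactly one file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the HBASE-11861 patches.
public class MobBatchSketch {

    // Splits the candidate list into consecutive batches of at most batchSize items.
    public static <T> List<List<T>> partition(List<T> candidates, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < candidates.size(); start += batchSize) {
            int end = Math.min(start + batchSize, candidates.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(candidates.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Assumed workload: 1000 candidate mob files (made-up number).
        List<String> candidates = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            candidates.add("mobfile-" + i);
        }
        // Each batch is assumed to merge into one output file.
        System.out.println(partition(candidates, 100).size()); // prints 10
        System.out.println(partition(candidates, 10).size());  // prints 100
    }
}
```

Under these assumptions, lowering the limit from 100 to 10 leaves ten times as many files after one compaction pass.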
Previously, with one thread, 100 files were merged into 1; once the pool is used, 10 files are merged into 1. Maybe we could do the compaction in parallel in the future (by dispatching the compaction to the HRS). But for now we could have one thread handle it. Please advise. Thanks.

> Native MOB Compaction mechanisms.
> ---------------------------------
>
>                 Key: HBASE-11861
>                 URL: https://issues.apache.org/jira/browse/HBASE-11861
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>    Affects Versions: 2.0.0
>            Reporter: Jonathan Hsieh
>            Assignee: Jingcheng Du
>         Attachments: 141030-mob-compaction.pdf, HBASE-11861-V1.diff, HBASE-11861-V2.diff, HBASE-11861.diff, mob compaction-out-of-region.pdf, mob compaction.pdf
>
>
> Currently, the first cut of mob will have external processes to age off old mob data (the ttl cleaner), and to compact away deleted or overwritten data (the sweep tool).
> From an operational point of view, having two external tools, especially one that relies on MapReduce, is undesirable. In this issue we'll tackle integrating these into hbase without requiring external processes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)