Return-Path:
X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io
Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io
Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 68F0E200BCB for ; Thu, 24 Nov 2016 09:31:01 +0100 (CET)
Received: by cust-asf.ponee.io (Postfix) id 671B8160B20; Thu, 24 Nov 2016 08:31:01 +0000 (UTC)
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id B2219160B11 for ; Thu, 24 Nov 2016 09:31:00 +0100 (CET)
Received: (qmail 95312 invoked by uid 500); 24 Nov 2016 08:30:59 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 95274 invoked by uid 99); 24 Nov 2016 08:30:59 -0000
Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 24 Nov 2016 08:30:59 +0000
Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id A88352C03DE for ; Thu, 24 Nov 2016 08:30:59 +0000 (UTC)
Date: Thu, 24 Nov 2016 08:30:59 +0000 (UTC)
From: "Jingcheng Du (JIRA)"
To: issues@hbase.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Comment Edited] (HBASE-17172) Optimize major mob compaction with _del files
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
archived-at: Thu, 24 Nov 2016 08:31:01 -0000

    [ https://issues.apache.org/jira/browse/HBASE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15692632#comment-15692632 ]

Jingcheng Du edited comment on HBASE-17172 at 11/24/16 8:29 AM:
----------------------------------------------------------------

Thanks Huaxiang!

bq.
users may still choose to disable mob compaction chore and run mob compaction manually at scheduled maintenance.

Right, how about running minor compaction instead? It doesn't make sense to run major mob compaction periodically. Mob is designed to reduce the IO amplification during compaction. Major compaction will break this.

bq. To keep delete marker in hbase files in mob-enabled cf is one way to avoid .del files, the concern is that it is inconsistent with non-mob cfs (maybe this can be provided as option through config?).

Hmm. If the .del files are not a performance killer, we don't need this. I reviewed the code, and I think the .del files are not the reason for the slow compaction; major compaction itself is.

bq. With the current major mob compaction, these .del files will be included in compacting of files for other regions which is not necessary.

Right, it is not necessary. Splitting them by region is a good choice. But is this necessary if the .del files don't badly impact compaction performance?


was (Author: jingcheng.du@intel.com):
Thanks Huaxiang!

bq. users may still choose to disable mob compaction chore and run mob compaction manually at scheduled maintenance.

Right, how about to run minor compaction instead? It doesn't make sense to run major mob compaction periodically. Mob is designed to reduce the IO amplification during compaction. Major compaction will break this.

bq. To keep delete marker in hbase files in mob-enabled cf is one way to avoid .del files, the concern is that it is inconsistent with non-mob cfs (maybe this can be provided as option through config?).

Hmm. If the .del is not a performance killer, we don't need this. I reviewed the code, I think the .del files is not the reason of the slow compaction, major compaction itself is.

bq. With the current major mob compaction, these .del files will be included in compacting of files for other regions which is not necessary.

Right, it is not necessary. To split them by regions is an good choice.
But is this necessary if .del file didn't impact the compaction performance badly?


> Optimize major mob compaction with _del files
> ---------------------------------------------
>
>                 Key: HBASE-17172
>                 URL: https://issues.apache.org/jira/browse/HBASE-17172
>             Project: HBase
>          Issue Type: Improvement
>          Components: mob
>    Affects Versions: 2.0.0
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>
> Today, when there is a _del file in mobdir, major mob compaction recompacts every mob file. This causes lots of IO and slows down major mob compaction (it may take months to finish). This needs to be improved. A few ideas are:
> 1) Do not compact all _del files into one; instead, compact them in groups keyed by startKey. Then use the firstKey/startKey of each mob file to check whether the _del file needs to be included for this partition.
> 2) Based on the timerange of the _del file, compaction for files after that timerange does not need to include the _del file, as these are newer files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
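
The two ideas in the issue description (grouping _del files by startKey, and skipping _del files that are older than a mob file's data) can be illustrated with a small language-neutral sketch. This is not HBase code: the `DelFile` and `MobFile` types, their fields, and both functions are hypothetical names introduced only for illustration, and the sketch assumes that each file's start key and timestamp bounds are already known and that a delete marker only affects cells with a timestamp at or below its own.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DelFile:
    # Hypothetical stand-in for a _del file.
    name: str
    start_key: bytes  # start key of the partition the delete markers belong to
    max_ts: int       # newest delete-marker timestamp in the file


@dataclass(frozen=True)
class MobFile:
    # Hypothetical stand-in for a mob file.
    name: str
    start_key: bytes  # start key of the partition the mob file belongs to
    min_ts: int       # oldest cell timestamp in the file


def group_del_files(del_files):
    """Idea 1: keep _del files in groups keyed by startKey instead of merging them into one."""
    groups = defaultdict(list)
    for del_file in del_files:
        groups[del_file.start_key].append(del_file)
    return groups


def del_files_for(mob_file, groups):
    """Return only the _del files that could affect this mob file.

    Idea 1: look only at the group with the same startKey.
    Idea 2: drop _del files whose newest marker is older than every cell
    in the mob file, since those markers cannot mask any of its cells.
    """
    candidates = groups.get(mob_file.start_key, [])
    return [d for d in candidates if d.max_ts >= mob_file.min_ts]
```

Under these assumptions, a mob file whose partition has no _del files, or whose cells are all newer than the delete markers, gets an empty list and would not need to be recompacted because of them.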