Date: Thu, 16 Feb 2017 20:41:41 +0000 (UTC)
From: "huaxiang sun (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17172) Optimize mob compaction with _del files

    [ https://issues.apache.org/jira/browse/HBASE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870654#comment-15870654 ]

huaxiang sun commented on HBASE-17172:
--------------------------------------

V2 patch addressed Ted's comments and warnings.

> Optimize mob compaction with _del files
> ---------------------------------------
>
>                 Key: HBASE-17172
>                 URL: https://issues.apache.org/jira/browse/HBASE-17172
>             Project: HBase
>          Issue Type: Improvement
>          Components: mob
>    Affects Versions: 2.0.0
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>         Attachments: HBASE-17172-master-001.patch, HBASE-17172.master.001.patch, HBASE-17172.master.002.patch
>
>
> Today, when there is a _del file in mobdir, a major mob compaction recompacts every mob file. This causes a lot of IO and slows down major mob compaction (it may take months to finish). This needs to be improved. A few ideas are:
> 1) Do not compact all _del files into one. Instead, compact them in groups, with startKey as the key. Then use the firstKey/startKey of each mob file to decide whether the _del file needs to be included for this partition.
> 2) Based on the timerange of the _del file, compaction for files after that timerange does not need to include the _del file, as these are newer files.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
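
The two ideas in the issue description can be sketched roughly as below. This is only an illustration, not the code in the attached patches: `DelFileSelectionSketch`, `FileInfo`, and every method name are hypothetical, row keys are modeled as `String` rather than byte arrays for brevity, and it assumes each file's key range and timestamp range are available as metadata.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types or methods are HBase classes.
final class DelFileSelectionSketch {

  /** Key range and timestamp range covered by one file (assumed metadata). */
  static final class FileInfo {
    final String name;
    final String startKey;    // smallest row key in the file (String for brevity)
    final String endKey;      // largest row key in the file
    final long minTimestamp;  // oldest cell timestamp in the file
    final long maxTimestamp;  // newest cell timestamp in the file

    FileInfo(String name, String startKey, String endKey,
             long minTimestamp, long maxTimestamp) {
      this.name = name;
      this.startKey = startKey;
      this.endKey = endKey;
      this.minTimestamp = minTimestamp;
      this.maxTimestamp = maxTimestamp;
    }
  }

  /** Idea 1: a _del file matters only if its key range overlaps the mob file's. */
  static boolean keyRangesOverlap(FileInfo a, FileInfo b) {
    return a.startKey.compareTo(b.endKey) <= 0
        && b.startKey.compareTo(a.endKey) <= 0;
  }

  /** Idea 2: a mob file written entirely after the _del file's timerange can skip it. */
  static boolean isNewerThan(FileInfo mobFile, FileInfo delFile) {
    return mobFile.minTimestamp > delFile.maxTimestamp;
  }

  /** Returns only the _del files that this mob file's compaction would need. */
  static List<FileInfo> selectDelFiles(FileInfo mobFile, List<FileInfo> delFiles) {
    List<FileInfo> selected = new ArrayList<>();
    for (FileInfo del : delFiles) {
      if (!isNewerThan(mobFile, del) && keyRangesOverlap(mobFile, del)) {
        selected.add(del);
      }
    }
    return selected;
  }
}
```

With a filter like this, a mob file for which `selectDelFiles` returns an empty list would not need to be recompacted because of _del files at all, which is where the IO saving described in the issue would come from.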