Date: Tue, 19 Jul 2016 20:57:21 +0000 (UTC)
From: "Ted Yu (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-16052) Improve HBaseFsck Scalability

     [ https://issues.apache.org/jira/browse/HBASE-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-16052:
---------------------------
    Fix Version/s: 0.98.21

> Improve HBaseFsck Scalability
> -----------------------------
>
>                 Key: HBASE-16052
>                 URL: https://issues.apache.org/jira/browse/HBASE-16052
>             Project: HBase
>          Issue Type: Improvement
>          Components: hbck
>            Reporter: Ben Lau
>            Assignee: Ben Lau
>             Fix For: 2.0.0, 1.4.0, 0.98.21
>
>         Attachments: HBASE-16052-0.98.v3.patch, HBASE-16052-master.patch, HBASE-16052-v3-0.98.patch, HBASE-16052-v3-branch-1.patch, HBASE-16052-v3-master.patch
>
>
> There are some problems with HBaseFsck that make it unnecessarily slow, especially for large tables or clusters with many regions.
> This patch tries to fix the biggest bottlenecks and also includes a couple of bug fixes for race conditions caused by gathering and holding state about a live cluster that is no longer true by the time that state is used in Fsck processing. These race conditions cause Fsck to crash and become unusable on large clusters with lots of region splits/merges.
> Here are some scalability/performance problems in HBaseFsck and the changes the patch makes:
> - Unnecessary I/O and RPCs caused by fetching an array of FileStatuses and then discarding everything but the Paths, then passing the Paths to a PathFilter, and then having the filter look up the (previously discarded) FileStatuses of the paths again. This is actually worse than double I/O because the first lookup obtains a batch of FileStatuses while all the other lookups are individual RPCs performed sequentially.
> -- Avoid this by adding a FileStatusFilter so that filtering can happen directly on FileStatuses.
> -- This performance bug affects more than Fsck; to some extent it also affects things like snapshots, hfile archival, etc. I didn't have time to look too deeply into the other things affected and didn't want to increase the scope of this ticket, so I focus mostly on Fsck and make only a few improvements to other code paths. The changes in this patch, though, should make it fairly easy to fix other code paths in later jiras if we feel there are other features strongly impacted by this problem.
> - offlineReferenceFileRepair is the most expensive part of Fsck (often 50% of Fsck runtime) and its running time scales with the number of store files, yet the function is completely serial.
> -- Make offlineReferenceFileRepair multithreaded.
> - loadHdfsRegionDirs() uses table-level concurrency, which is a big bottleneck if you have 1 large cluster with 1 very large table that has nearly all the regions.
> -- Change loadHdfsRegionDirs() to use region-level parallelism instead of table-level parallelism.
> The changes benefit all clusters but are especially noticeable for large clusters with a few very large tables. On our version of 0.98 with the original patch, we had a moderately sized production cluster with 2 (user) tables and ~160k regions where HBaseFsck went from taking 18 minutes to 5 minutes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
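
For readers following the first bottleneck in the issue description (statuses fetched in a batch, reduced to paths, then looked up again one by one inside a path filter), the toy model below may help make the cost concrete. It is only a sketch and is not taken from the patch: `FakeFs`, `Status`, the sample paths, and the method names are all hypothetical stand-ins, not Hadoop or HBase classes, and the "RPC" is just a counter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model of the double-lookup problem. None of these types are Hadoop or HBase classes. */
class FileStatusFilterSketch {

    /** Hypothetical stand-in for a file status: a path plus an is-directory flag. */
    static final class Status {
        final String path;
        final boolean isDir;
        Status(String path, boolean isDir) { this.path = path; this.isDir = isDir; }
    }

    /** Hypothetical in-memory file system that counts simulated RPCs. */
    static final class FakeFs {
        private final Map<String, Status> entries = new LinkedHashMap<>();
        int rpcCount = 0;

        void add(String path, boolean isDir) { entries.put(path, new Status(path, isDir)); }

        /** A single batched call returns every status. */
        List<Status> listStatus() {
            rpcCount++;
            return new ArrayList<>(entries.values());
        }

        /** One call per path. */
        Status getStatus(String path) {
            rpcCount++;
            return entries.get(path);
        }
    }

    /** Old pattern: keep only the path, then re-fetch each status inside the filter step. */
    static List<String> listDirsViaPathFilter(FakeFs fs) {
        List<String> dirs = new ArrayList<>();
        for (Status s : fs.listStatus()) {
            String path = s.path;               // the status is discarded here
            if (fs.getStatus(path).isDir) {     // and looked up again, one call per path
                dirs.add(path);
            }
        }
        return dirs;
    }

    /** New pattern: filter directly on the statuses the batched call already returned. */
    static List<String> listDirsViaStatusFilter(FakeFs fs, Predicate<Status> filter) {
        List<String> dirs = new ArrayList<>();
        for (Status s : fs.listStatus()) {
            if (filter.test(s)) {
                dirs.add(s.path);
            }
        }
        return dirs;
    }

    /** Builds a tiny sample tree; the paths are made up for illustration. */
    static FakeFs sampleFs() {
        FakeFs fs = new FakeFs();
        fs.add("/data/t1/region-a", true);
        fs.add("/data/t1/region-b", true);
        fs.add("/data/t1/some-file", false);
        return fs;
    }

    public static void main(String[] args) {
        FakeFs a = sampleFs();
        List<String> viaPath = listDirsViaPathFilter(a);
        FakeFs b = sampleFs();
        List<String> viaStatus = listDirsViaStatusFilter(b, s -> s.isDir);
        System.out.println("same result: " + viaPath.equals(viaStatus));
        System.out.println("path-filter calls: " + a.rpcCount);
        System.out.println("status-filter calls: " + b.rpcCount);
    }
}
```

In this model the path-based variant costs one batched call plus one call per entry, while the status-based variant costs only the batched call, which is the shape of the saving the description attributes to filtering on FileStatuses directly. Real savings depend on the file system and cluster, which this sketch does not model.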