Date: Thu, 11 Aug 2016 00:54:20 +0000 (UTC)
From: "binlijin (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

    [ https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416302#comment-15416302 ]

binlijin commented on HBASE-16393:
----------------------------------

I know there are other places that can be improved in the same way; as a first step, improve StoreFileInfo#computeHDFSBlocksDistribution.

> Improve computeHDFSBlocksDistribution
> -------------------------------------
>
>                 Key: HBASE-16393
>                 URL: https://issues.apache.org/jira/browse/HBASE-16393
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: binlijin
>         Attachments: HBASE-16393.patch
>
>
> Because our cluster is big, I see that the balancer is slow from time to time. The balancer is also called on master startup, so startup is slow as well.
> The first thing I am considering is whether we can compute the HDFSBlocksDistribution of different regions in parallel.
> The second is that we can speed up computing a single region's HDFSBlocksDistribution.
> To compute a storefile's HDFSBlocksDistribution, we first call FileSystem#getFileStatus(path) and then FileSystem#getFileBlockLocations(status, start, length), so there are two namenode RPC calls for every storefile. Instead, we can use FileSystem#listLocatedStatus to get a LocatedFileStatus with the information we need, which reduces the namenode RPC calls to one.
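
To illustrate the RPC saving described in the issue, here is a minimal toy sketch in plain Java. It is not HBase or Hadoop code and not the attached patch: FakeNameNode, Block, hostWeights, sampleFiles and the host and path names are all hypothetical stand-ins, and each FakeNameNode method call simply counts as one simulated namenode round trip. The three methods loosely model FileSystem#getFileStatus, FileSystem#getFileBlockLocations and the located-status lookup named above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: counts simulated namenode round trips per storefile. */
final class BlockDistributionSketch {

    /** One block: its length in bytes and the hosts that hold a replica. */
    static final class Block {
        final long length;
        final List<String> hosts;
        Block(long length, List<String> hosts) {
            this.length = length;
            this.hosts = hosts;
        }
    }

    /** Hypothetical stand-in for the namenode; every method call is one RPC. */
    static final class FakeNameNode {
        private final Map<String, List<Block>> files;
        int rpcCount = 0;

        FakeNameNode(Map<String, List<Block>> files) {
            this.files = files;
        }

        // Models a status lookup: returns only the file length.
        long getFileLength(String path) {
            rpcCount++;
            long total = 0;
            for (Block b : files.get(path)) {
                total += b.length;
            }
            return total;
        }

        // Models a block-location lookup; the toy ignores start and length.
        List<Block> getBlockLocations(String path, long start, long length) {
            rpcCount++;
            return files.get(path);
        }

        // Models a located-status lookup: status and locations in one reply.
        List<Block> getLocatedStatus(String path) {
            rpcCount++;
            return files.get(path);
        }
    }

    /** Sums, per host, the bytes of every block replica stored on that host. */
    static Map<String, Long> hostWeights(List<Block> blocks) {
        Map<String, Long> weights = new HashMap<>();
        for (Block b : blocks) {
            for (String host : b.hosts) {
                weights.merge(host, b.length, Long::sum);
            }
        }
        return weights;
    }

    /** Two simulated RPCs: status first, then block locations. */
    static Map<String, Long> computeWithTwoCalls(FakeNameNode nn, String path) {
        long length = nn.getFileLength(path);
        return hostWeights(nn.getBlockLocations(path, 0, length));
    }

    /** One simulated RPC: status and block locations together. */
    static Map<String, Long> computeWithOneCall(FakeNameNode nn, String path) {
        return hostWeights(nn.getLocatedStatus(path));
    }

    /** Made-up sample data: one storefile with two blocks. */
    static Map<String, List<Block>> sampleFiles() {
        List<Block> blocks = Arrays.asList(
            new Block(128, Arrays.asList("host-a", "host-b")),
            new Block(64, Arrays.asList("host-b", "host-c")));
        return Collections.singletonMap("/hbase/storefile1", blocks);
    }

    public static void main(String[] args) {
        FakeNameNode twoCallNn = new FakeNameNode(sampleFiles());
        computeWithTwoCalls(twoCallNn, "/hbase/storefile1");
        System.out.println("two-call path RPCs: " + twoCallNn.rpcCount); // prints 2

        FakeNameNode oneCallNn = new FakeNameNode(sampleFiles());
        computeWithOneCall(oneCallNn, "/hbase/storefile1");
        System.out.println("one-call path RPCs: " + oneCallNn.rpcCount); // prints 1
    }
}
```

Both paths produce the same per-host byte weights; only the number of simulated round trips differs. On a real cluster the saving would be one namenode call per storefile, which adds up when the balancer walks every region.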