Mailing-List: contact common-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Tue, 16 May 2017 13:12:04 +0000 (UTC)
From: "Jason Lowe (JIRA)" <jira@apache.org>
To: common-issues@hadoop.apache.org
Message-ID: <JIRA.13071224.1494517942000.220769.1494940324817@Atlassian.JIRA>
In-Reply-To: <JIRA.13071224.1494517942000@Atlassian.JIRA>
References: <JIRA.13071224.1494517942000@Atlassian.JIRA> <JIRA.13071224.1494517942046@jira-lw-us.apache.org>
Subject: [jira] [Updated] (HADOOP-14412) HostsFileReader#getHostDetails is
 very expensive on large clusters
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Tue, 16 May 2017 13:12:11 -0000


     [ https://issues.apache.org/jira/browse/HADOOP-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-14412:
--------------------------------
    Attachment: HADOOP-14412-branch-2.002.patch

Uploading the same branch-2 patch again to trigger a Jenkins run.

> HostsFileReader#getHostDetails is very expensive on large clusters
> ------------------------------------------------------------------
>
>                 Key: HADOOP-14412
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14412
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.8.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: HADOOP-14412.001.patch, HADOOP-14412.002.patch, HADOOP-14412-branch-2.001.patch, HADOOP-14412-branch-2.002.patch, HADOOP-14412-branch-2.002.patch, HADOOP-14412-branch-2.8.002.patch
>
>
> After upgrading one of our large clusters to 2.8, we noticed many IPC server threads of the ResourceManager spending time in NodesListManager#isValidNode, which in turn calls HostsFileReader#getHostDetails. The latter creates complete copies of the include and exclude sets on every node heartbeat, and these sets are not small given the size of the cluster. Each copy causes multiple resizes of the underlying HashSet as it is filled and creates a lot of garbage.
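
For illustration only, here is a minimal Java sketch of one common way to avoid per-call copies: build an immutable snapshot of both sets when the host files are refreshed, and hand out that same snapshot on every lookup. The class HostsSnapshotSketch, its method bodies, and the validity rule in isValidNode are assumptions made for this example. They are not the real HostsFileReader or NodesListManager code, and this is not necessarily what the attached patches do.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; not the real HostsFileReader. */
public class HostsSnapshotSketch {

  /** Immutable pair of include/exclude sets, built once per refresh. */
  public static final class HostDetails {
    private final Set<String> includes;
    private final Set<String> excludes;

    HostDetails(Collection<String> includes, Collection<String> excludes) {
      // Copy once here, then expose read-only views of the copies.
      this.includes = Collections.unmodifiableSet(new HashSet<>(includes));
      this.excludes = Collections.unmodifiableSet(new HashSet<>(excludes));
    }

    public Set<String> getIncludedHosts() { return includes; }
    public Set<String> getExcludedHosts() { return excludes; }
  }

  // Starts with an empty snapshot so readers never see null.
  private final AtomicReference<HostDetails> current =
      new AtomicReference<>(new HostDetails(
          Collections.<String>emptySet(), Collections.<String>emptySet()));

  /** Rare path: build a new snapshot and publish it atomically. */
  public void refresh(Collection<String> includes, Collection<String> excludes) {
    current.set(new HostDetails(includes, excludes));
  }

  /** Hot path: no copying and no locking, just return the current snapshot. */
  public HostDetails getHostDetails() {
    return current.get();
  }

  /** Assumed rule for illustration: an empty include list admits every host. */
  public boolean isValidNode(String host) {
    HostDetails d = getHostDetails();
    return (d.getIncludedHosts().isEmpty() || d.getIncludedHosts().contains(host))
        && !d.getExcludedHosts().contains(host);
  }

  public static void main(String[] args) {
    HostsSnapshotSketch reader = new HostsSnapshotSketch();
    reader.refresh(Arrays.asList("node1", "node2"), Arrays.asList("node2"));
    System.out.println(reader.isValidNode("node1"));
    System.out.println(reader.isValidNode("node2"));
    // Two calls return the same object, so no per-heartbeat allocation.
    System.out.println(reader.getHostDetails() == reader.getHostDetails());
  }
}
```

The trade-off in this sketch is that each refresh allocates two new sets, but refreshes are presumably far rarer than heartbeats, so the lookup path stays free of allocation.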

