hadoop-common-issues mailing list archives

From "Rohith Sharma K S (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14412) HostsFileReader#getHostDetails is very expensive on large clusters
Date Tue, 16 May 2017 04:58:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011741#comment-16011741 ]

Rohith Sharma K S commented on HADOOP-14412:

I will commit the trunk patch later today if there are no more objections.
Jenkins has not been triggered for the branch-2 v2 patch.
The branch-2.8 patch looks good to me.

> HostsFileReader#getHostDetails is very expensive on large clusters
> ------------------------------------------------------------------
>                 Key: HADOOP-14412
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14412
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.8.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: HADOOP-14412.001.patch, HADOOP-14412.002.patch,
>                      HADOOP-14412-branch-2.001.patch, HADOOP-14412-branch-2.002.patch,
>                      HADOOP-14412-branch-2.8.002.patch
>
> After upgrading one of our large clusters to 2.8 we noticed many IPC server threads of
> the resourcemanager spending time in NodesListManager#isValidNode which in turn was calling
> HostsFileReader#getHostDetails.  The latter is creating complete copies of the include and
> exclude sets for every node heartbeat, and these sets are not small due to the size of the
> cluster.  These copies are causing multiple resizes of the underlying HashSets being filled
> and creating lots of garbage.
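
For readers following the thread, one common way to avoid a per-call copy like the one described above is to build an immutable snapshot only when the hosts files are refreshed and hand every caller the same shared instance. The sketch below illustrates that general pattern; it is not the attached patch, and the class names HostSnapshot and SnapshotHostsReader, as well as the rule that an empty include set admits every host, are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Immutable snapshot of the include and exclude host sets (hypothetical name). */
final class HostSnapshot {
    private final Set<String> included;
    private final Set<String> excluded;

    HostSnapshot(Set<String> included, Set<String> excluded) {
        // Copy once, at refresh time, then wrap so readers cannot mutate the sets.
        this.included = Collections.unmodifiableSet(new HashSet<>(included));
        this.excluded = Collections.unmodifiableSet(new HashSet<>(excluded));
    }

    Set<String> getIncluded() { return included; }
    Set<String> getExcluded() { return excluded; }
}

/** Sketch of a reader that publishes snapshots instead of copying on every call. */
final class SnapshotHostsReader {
    private final AtomicReference<HostSnapshot> current =
        new AtomicReference<>(new HostSnapshot(
            Collections.<String>emptySet(), Collections.<String>emptySet()));

    /** Rare path: called when the hosts files are re-read. */
    void refresh(Set<String> included, Set<String> excluded) {
        current.set(new HostSnapshot(included, excluded));
    }

    /** Hot path: returns the shared snapshot without copying the sets. */
    HostSnapshot getHostDetails() {
        return current.get();
    }

    /**
     * Per-heartbeat style check. Assumption for this sketch: an empty
     * include set means every host is admitted unless it is excluded.
     */
    boolean isValidNode(String host) {
        HostSnapshot snapshot = getHostDetails();
        boolean admitted = snapshot.getIncluded().isEmpty()
            || snapshot.getIncluded().contains(host);
        return admitted && !snapshot.getExcluded().contains(host);
    }
}
```

With this shape, the cost of copying the sets is paid once per refresh rather than once per heartbeat, and the unmodifiable wrappers keep callers from changing the shared sets.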

This message was sent by Atlassian JIRA

