hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2314) ContainerManagementProtocolProxy can create thousands of threads for a large cluster
Date Tue, 14 Oct 2014 20:33:35 GMT

    [ https://issues.apache.org/jira/browse/YARN-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171475#comment-14171475 ]

Jason Lowe commented on YARN-2314:
----------------------------------

The only issue I can think of is the idle timeout change that goes along with disabling the
cache.  Since the cache is disabled by default, the CM proxy connection idle timeout is also
set to zero by default.  That means each CM proxy RPC call creates a new connection
to the NM.  That sounds expensive, and it was probably the motivation for creating the
cache, but in practice it doesn't seem to matter (at least for the loads we tested, which didn't
include Tez).  In our case we were comparing 2.x against 0.23, and 0.23 was slightly faster
in the AM scalability test than 2.x, despite 2.x having this cache and 0.23 lacking it.
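
To make the "new connection per call" effect of a zero idle timeout concrete, here is a toy
model in Java.  It is only a sketch and not the Hadoop IPC or ContainerManagementProtocolProxy
code: the class IdleTimeoutSketch, the method connectionsOpened, and the node names are all
hypothetical, and the model assumes a connection can be reused only if it was last used less
than the idle timeout ago.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Hadoop code): counts connections a client opens to its nodes. */
public class IdleTimeoutSketch {

    /**
     * callTimesMs[i] is when call i is issued and nodes[i] is its target node.
     * Assumption: a connection is reusable only if it was last used less than
     * idleTimeoutMs ago, so a timeout of zero never allows reuse.
     */
    public static int connectionsOpened(long[] callTimesMs, String[] nodes, long idleTimeoutMs) {
        Map<String, Long> lastUsed = new HashMap<>();
        int opened = 0;
        for (int i = 0; i < callTimesMs.length; i++) {
            Long last = lastUsed.get(nodes[i]);
            boolean reusable = last != null && callTimesMs[i] - last < idleTimeoutMs;
            if (!reusable) {
                opened++; // no live connection to this node, so open a new one
            }
            lastUsed.put(nodes[i], callTimesMs[i]);
        }
        return opened;
    }

    public static void main(String[] args) {
        long[] times = {0, 100, 200, 300};
        String[] nodes = {"nm1", "nm1", "nm2", "nm1"};
        // Zero idle timeout: every call opens its own connection.
        System.out.println(connectionsOpened(times, nodes, 0));
        // Ten second idle timeout: one connection per node is reused.
        System.out.println(connectionsOpened(times, nodes, 10_000));
    }
}
```

In this model the zero-timeout case opens one connection per call, while the longer timeout
opens one per node.  Whether that extra connection setup matters in practice depends on the
load, as noted above.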

> ContainerManagementProtocolProxy can create thousands of threads for a large cluster
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-2314
>                 URL: https://issues.apache.org/jira/browse/YARN-2314
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.1.0-beta
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: YARN-2314.patch, disable-cm-proxy-cache.patch, nmproxycachefix.prototype.patch
>
>
> ContainerManagementProtocolProxy has a cache of NM proxies, and the size of this cache
> is configurable.  However, the cache can grow far beyond the configured size when running on
> a large cluster and blow past AM address/container limits.  More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
