hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2314) ContainerManagementProtocolProxy can create thousands of threads for a large cluster
Date Wed, 23 Jul 2014 13:58:40 GMT

    [ https://issues.apache.org/jira/browse/YARN-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071726#comment-14071726 ]

Jason Lowe commented on YARN-2314:
----------------------------------

I suppose we could use a wait timeout.  I was just matching the existing behavior when it
tries to refresh the NM token on an in-use proxy, which also waits indefinitely.  What's the
proposed behavior when the timeout expires?  Log a message and then...?  Arguably the timeouts
should be on the RPC calls rather than on the proxy cache: if we're not willing to wait forever
for a proxy to be freed up, we're presumably also not willing to wait forever for a remote
call to complete.
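
To make the trade-off concrete, here is a minimal sketch of what a bounded wait on the
cache could look like.  It is not the code in the attached patch and not the actual
ContainerManagementProtocolProxy API: the class name BoundedProxySlots and its methods are
hypothetical, and a plain java.util.concurrent.Semaphore stands in for the cache's size limit.
On expiry it only reports failure, which leaves the "and then...?" question to the caller.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical illustration only: caps the number of proxies in use at once
 * and bounds how long a caller waits for one to be freed.
 */
class BoundedProxySlots {
    // One permit per proxy the cache is allowed to hold; fair ordering
    // so long-waiting callers are served first.
    private final Semaphore slots;

    BoundedProxySlots(int maxProxies) {
        this.slots = new Semaphore(maxProxies, true);
    }

    /**
     * Waits up to timeoutMs for a free slot.
     * Returns false if the timeout expires or the thread is interrupted;
     * what the caller does next (log, fail the request, retry) is left open.
     */
    boolean tryAcquire(long timeoutMs) {
        try {
            return slots.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt status for code further up the stack.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns a slot to the pool once the caller is done with its proxy. */
    void release() {
        slots.release();
    }

    /** Number of slots currently free. */
    int available() {
        return slots.availablePermits();
    }
}
```

Note that this only bounds the wait for a slot.  A caller that holds a slot while blocked in
a remote call with no RPC timeout would still tie that slot up indefinitely, which is the
reason for suggesting the timeout belongs on the RPC side.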

> ContainerManagementProtocolProxy can create thousands of threads for a large cluster
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-2314
>                 URL: https://issues.apache.org/jira/browse/YARN-2314
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.1.0-beta
>            Reporter: Jason Lowe
>            Priority: Critical
>         Attachments: nmproxycachefix.prototype.patch
>
>
> ContainerManagementProtocolProxy has a cache of NM proxies, and the size of this cache
> is configurable.  However the cache can grow far beyond the configured size when running on
> a large cluster and blow AM address/container limits.  More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
