hadoop-yarn-issues mailing list archives

From "Rohith Sharma K S (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
Date Wed, 26 Apr 2017 13:44:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984830#comment-15984830
] 

Rohith Sharma K S commented on YARN-6523:
-----------------------------------------

+1 for the issue.  I am not sure why the tokens of all apps are sent to each NM rather than
only those of the applications running on that node. The 2nd and 3rd approaches have to deal
with credential renewal: it can never be known whether the credentials have been renewed. The
1st approach, on the other hand, compromises node heartbeat response time.

How about keeping app credentials in RMNodeImpl? The proposal is to let RMNodeImpl maintain a
copy of the credentials, which is then sent in the heartbeat response. This way, RMNodeImpl
holds credentials only for the applications running on that node. On credential renewal, an
event is triggered to RMNodeImpl to update its copy. There would be corner cases, though:
updated credentials may be missed for a couple of heartbeats. cc: [~jlowe]
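
A rough sketch of the idea above, just to make the proposal concrete. NodeCredentialCache and all of its method names are hypothetical stand-ins and are not the real RMNodeImpl API; credentials are shown as plain byte arrays rather than the actual token types.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical per-node credential holder (NOT the real RMNodeImpl).
 * It tracks credentials only for applications running on this node.
 */
class NodeCredentialCache {
    // appId -> serialized credentials for apps currently running on this node
    private final Map<String, byte[]> runningAppCredentials = new HashMap<>();

    /** Called when an application gets its first container on this node. */
    synchronized void appStarted(String appId, byte[] credentials) {
        runningAppCredentials.put(appId, credentials);
    }

    /** Called when the application no longer runs on this node. */
    synchronized void appFinished(String appId) {
        runningAppCredentials.remove(appId);
    }

    /**
     * Renewal event: replace the stored copy, but only if the app is
     * running here. Returns true if this node's copy was updated.
     */
    synchronized boolean credentialsRenewed(String appId, byte[] renewed) {
        return runningAppCredentials.replace(appId, renewed) != null;
    }

    /** Snapshot to attach to the next heartbeat response for this node. */
    synchronized Map<String, byte[]> buildHeartbeatCredentials() {
        return Collections.unmodifiableMap(new HashMap<>(runningAppCredentials));
    }
}
```

With this shape, a heartbeat response carries entries only for the apps on that node instead of every app in the cluster. The corner case remains: a heartbeat built before the renewal event is processed still carries the old credentials.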

> RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6523
>                 URL: https://issues.apache.org/jira/browse/YARN-6523
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: RM
>    Affects Versions: 2.8.0, 2.7.3
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>            Priority: Critical
>
> Currently, as part of the heartbeat response, the RM sets the tokens of all applications, even though not all applications may be active on the node. On top of that, NodeHeartbeatResponsePBImpl converts the tokens for each app into SystemCredentialsForAppsProto. Hence, for each node and each heartbeat, too many SystemCredentialsForAppsProto objects are created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with 8 GB RAM configured for the RM.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


