hadoop-yarn-issues mailing list archives

From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
Date Wed, 03 May 2017 14:46:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994995#comment-15994995 ]

Naganarasimha G R edited comment on YARN-6523 at 5/3/17 2:45 PM:
-----------------------------------------------------------------

Sorry for the delay in responding, [~jlowe].
Thanks for the very detailed response. I agree that the delta approaches mentioned initially
can introduce a certain amount of complexity in the cases you described.
Although the approach you suggested was initially appealing and less complicated, I was thinking
of the following scenarios:
# When there are a large number of small jobs in a large cluster, we end up sending almost all the
tokens almost all the time, because the sequence keeps increasing as more and more jobs get submitted.
# Since we are modifying the interface anyway, it would be better to go for a complete solution
so that it does not have to be revisited and deprecated later.

One other approach I can think of: send all the tokens during node registration
(this avoids most of the corner cases), and as part of each heartbeat send all the app tokens
that have been renewed (which can be done in an event-based model). Further, we can keep a
pre-computed cache of the SystemCredentialsForAppsProto objects that are sent as part of the
heartbeat, so that we reduce the memory footprint. This approach would therefore also handle a
large number of small jobs without an interface change. Thoughts?
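
The registration-plus-renewal idea above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the RM's actual code: TokenDistributionSketch,
CachedCredentials and every method name are hypothetical, CachedCredentials merely stands in for a
pre-computed SystemCredentialsForAppsProto, and any token update is treated as a "renewal" event.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for one pre-computed SystemCredentialsForAppsProto. */
final class CachedCredentials {
    final String appId;
    private final byte[] tokens;

    CachedCredentials(String appId, byte[] tokens) {
        this.appId = appId;
        this.tokens = tokens.clone(); // defensive copy so the cached entry is immutable
    }

    byte[] tokens() {
        return tokens.clone();
    }
}

/** Hypothetical sketch: full token set at registration, only renewed tokens on heartbeat. */
class TokenDistributionSketch {
    // One cached object per app, built once and shared by every node's response.
    private final Map<String, CachedCredentials> cache = new HashMap<>();
    // Per node: apps whose tokens were renewed since that node's last heartbeat.
    private final Map<String, Set<String>> pendingByNode = new HashMap<>();

    /** Renewal event: rebuild the cached entry once and mark it pending for every node. */
    synchronized void onTokensRenewed(String appId, byte[] serializedTokens) {
        cache.put(appId, new CachedCredentials(appId, serializedTokens));
        for (Set<String> pending : pendingByNode.values()) {
            pending.add(appId);
        }
    }

    /** Finished apps are dropped so they are never sent again. */
    synchronized void onAppFinished(String appId) {
        cache.remove(appId);
        for (Set<String> pending : pendingByNode.values()) {
            pending.remove(appId);
        }
    }

    /** Registration: the node receives every cached entry and starts with nothing pending. */
    synchronized Map<String, CachedCredentials> onNodeRegistered(String nodeId) {
        pendingByNode.put(nodeId, new HashSet<>());
        return new HashMap<>(cache);
    }

    /** Heartbeat: the node receives only the entries renewed since its last heartbeat. */
    synchronized Map<String, CachedCredentials> onHeartbeat(String nodeId) {
        Map<String, CachedCredentials> response = new HashMap<>();
        Set<String> pending = pendingByNode.get(nodeId);
        if (pending == null) {
            return response; // unknown node; a real RM would ask it to re-register
        }
        for (String appId : pending) {
            CachedCredentials entry = cache.get(appId);
            if (entry != null) {
                response.put(appId, entry);
            }
        }
        pending.clear();
        return response;
    }
}
```

In this sketch, the number of credential objects alive at any time is bounded by the number of
apps rather than by apps times nodes, because every response map points at the same cached
entries. The per-node pending sets are the price paid for that, and they only hold app ids.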


was (Author: naganarasimha):
Sorry for the delay in responding, [~jlowe].
Thanks for the very detailed response. I agree that the delta approaches mentioned initially
can introduce a certain amount of complexity in the cases you described.
Although the approach you suggested was initially appealing and less complicated, I was thinking
of the following scenarios:
# When there are a large number of small jobs in a large cluster, we almost send the tokens, as
the sequence keeps increasing when more and more jobs get submitted.
# Since we are modifying the interface anyway, it would be better to go for a complete solution
so that it does not have to be revisited and deprecated later.

One other approach I can think of: send all the tokens during node registration
(this avoids most of the corner cases), and as part of each heartbeat send all the app tokens
that have been renewed (which can be done in an event-based model). Further, we can keep a
pre-computed cache of the SystemCredentialsForAppsProto objects that are sent as part of the
heartbeat, so that we reduce the memory footprint. This approach would therefore also handle a
large number of small jobs without an interface change. Thoughts?

> RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6523
>                 URL: https://issues.apache.org/jira/browse/YARN-6523
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: RM
>    Affects Versions: 2.8.0, 2.7.3
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>            Priority: Critical
>
> Currently, as part of the heartbeat response, the RM sets all applications' tokens even though
> not all applications may be active on the node. On top of that, NodeHeartbeatResponsePBImpl
> converts the tokens for each app into SystemCredentialsForAppsProto. Hence, for each node and
> each heartbeat, too many SystemCredentialsForAppsProto objects were being created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with 8 GB of RAM
> configured for the RM.
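
To put the numbers above in perspective, a back-of-the-envelope count helps. The sketch below
assumes one SystemCredentialsForAppsProto is built per app for every node heartbeat, and that every
node heartbeats once per interval; the class and method names are hypothetical, and the count says
nothing about the size of each object.

```java
/** Hypothetical helper: credential protos built in one heartbeat round of the whole cluster. */
class HeartbeatAllocationEstimate {
    // Assumes one proto per app per node heartbeat, as described in the issue.
    static long protosPerHeartbeatRound(long concurrentApps, long nodes) {
        return concurrentApps * nodes;
    }
}
```

For the reported test, HeartbeatAllocationEstimate.protosPerHeartbeatRound(2000, 500) gives
1,000,000 objects per heartbeat round under these assumptions.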



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

