hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
Date Mon, 17 Aug 2015 17:41:54 GMT

    [ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699908#comment-14699908 ]

Wangda Tan commented on YARN-4024:
----------------------------------

Hi [~zhiguohong],

Thanks for working on this.
Regarding your comments:
bq. I think that's too complicated...
Agreed, I have changed my mind; see https://issues.apache.org/jira/browse/YARN-4024?focusedCommentId=14660607&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660607.

The approach in your patch generally looks good to me. A few suggestions:
1) When a node becomes NODE_UNUSABLE/NODE_USABLE, I suggest removing it from the cache to
force an update of its IP, since a node status change will (likely) come with an IP change.
This may require updating the Resolver interface.
2) Rename "seconds" to something like normalizedHostnameCacheTimeout.
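
To make the two suggestions above concrete, here is a minimal sketch of a timeout-based cache with explicit eviction. It is not the code from the attached patch: the class name CachedHostResolver, the method removeFromCache, and the injected lookup and clock are hypothetical names chosen for illustration. Only normalizedHostnameCacheTimeout comes from the suggestion itself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch; not the Resolver interface from the YARN-4024 patch. */
class CachedHostResolver {
    /** A resolved IP together with the time at which it was resolved. */
    private static final class Entry {
        final String ip;
        final long resolvedAtMillis;

        Entry(String ip, long resolvedAtMillis) {
            this.ip = ip;
            this.resolvedAtMillis = resolvedAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, String> lookup; // assumed: wraps the real DNS lookup
    private final LongSupplier clock;              // assumed: injectable for testing
    private final long normalizedHostnameCacheTimeoutMillis;

    CachedHostResolver(Function<String, String> lookup, LongSupplier clock,
                       long normalizedHostnameCacheTimeoutMillis) {
        this.lookup = lookup;
        this.clock = clock;
        this.normalizedHostnameCacheTimeoutMillis = normalizedHostnameCacheTimeoutMillis;
    }

    /** Returns the cached IP, calling the lookup only if the entry is missing or expired. */
    String resolve(String hostName) {
        long now = clock.getAsLong();
        Entry entry = cache.get(hostName);
        if (entry == null
                || now - entry.resolvedAtMillis >= normalizedHostnameCacheTimeoutMillis) {
            entry = new Entry(lookup.apply(hostName), now);
            cache.put(hostName, entry);
        }
        return entry.ip;
    }

    /** Suggestion 1: evict a host on NODE_UNUSABLE/NODE_USABLE to force a fresh lookup. */
    void removeFromCache(String hostName) {
        cache.remove(hostName);
    }
}
```

With something like this, the heartbeat path would call resolve() and only hit DNS on a cache miss or expiry, while the node status change handler would call removeFromCache() for the affected host.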

[~wilsoncraft],
bq. When a nodemanager is decommissioned, is the IP cached for that host flushed out of the
cache? Normally when a host gets a new IP it's because it gets moved or some other deliberate
maintenance which would normally be preceded by a decommission. If the IP is flushed when
decommissioned or an IP is always resolved from the host name when a new or recommissioned
nodemanager is added to the cluster I think that would be adequate IMHO.
I'm not quite sure what you meant. Does my comment below solve the problem you mentioned?
bq. 1) When a node becomes NODE_UNUSABLE/NODE_USABLE, I suggest removing it from the cache to
force an update of its IP, since a node status change will (likely) come with an IP change.
This may require updating the Resolver interface.

bq. Also, it may be worthwhile or adequate to expose the method in a yarn rmadmin command to
force a flush of the IP cache. Is this IP cache the same used for Rack Awareness by the RM?
I'd prefer to keep this as internal behavior. IIUC, this cache won't be used to determine the rack.

Please let me know your thoughts.

Thanks,
Wangda

> YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
> ----------------------------------------------------------------------
>
>                 Key: YARN-4024
>                 URL: https://issues.apache.org/jira/browse/YARN-4024
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Wangda Tan
>            Assignee: Hong Zhiguo
>         Attachments: YARN-4024-draft.patch
>
>
> Currently, the YARN RM NodesListManager resolves the IP address every time a node sends a
> heartbeat. When the DNS server becomes slow, NM heartbeats are blocked and cannot make progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
