hadoop-yarn-issues mailing list archives

From "Weiwei Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8925) Updating distributed node attributes only when necessary
Date Sun, 18 Nov 2018 15:08:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690924#comment-16690924 ]

Weiwei Yang commented on YARN-8925:

Hi [~Tao Yang]

I took some time to test this patch with a trunk build. I used configuration-based node attributes.

Initially I set the node attributes in the configuration, then started the RM and NM.
In the RM log I see:
2018-11-18 14:44:37,365 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService:
NodeManager from node ip-172-31-15-90.us-east-2.compute.internal(cmPort: 37873 httpPort: 8042)
registered with capability: <memory:8192, vCores:8>, assigned nodeId ip-172-31-15-90.us-east-2.compute.internal:37873,
node attributes { [nm.yarn.io/osType(STRING)=redhat, nm.yarn.io/osVersion(STRING)=2.6] }
It seems the node attributes are registered correctly.

Then, in the NM logs, I kept seeing the following error message:
2018-11-18 14:44:37,443 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl:
NM node attributes {[nm.yarn.io/osType(STRING)=redhat, nm.yarn.io/osVersion(STRING)=2.6]}
were not accepted by RM and message from RM : null
Since there is no update, I suppose that in this case the NM should report null to the RM
and the RM should simply ignore the message, right?

Then I updated the node attributes by changing the value of the property.
I expected this update to be reported to the RM and the RM to accept the change;
however, I am still seeing this log:
2018-11-18 14:56:36,771 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl:
NM node attributes {[nm.yarn.io/osType(STRING)=redhat, nm.yarn.io/osVersion(STRING)=2.7]}
were not accepted by RM and message from RM : null
It doesn't look like this is working as expected. Could you please take a look?
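
To make the behaviour I would expect concrete, here is a minimal sketch in Python. It is not the code from the patch and it does not use any YARN API: `AttributeReporter`, `AttributeStore` and `heartbeat` are hypothetical names made up for illustration, and attributes are modelled as plain (name, value) pairs. The assumption is that the NM sends its attributes only when they differ from what the RM last accepted, and that the RM treats "nothing reported" as a successful no-op rather than a rejection.

```python
from typing import FrozenSet, Optional, Tuple

# An attribute is modelled as a (name, value) pair,
# e.g. ("nm.yarn.io/osVersion", "2.6"). This is a simplification.
Attribute = Tuple[str, str]
AttributeSet = FrozenSet[Attribute]


class AttributeReporter:
    """Hypothetical NM-side helper that decides what each heartbeat carries."""

    def __init__(self) -> None:
        # Last set the RM accepted; None means nothing has been accepted yet.
        self._last_accepted: Optional[AttributeSet] = None

    def payload(self, current: AttributeSet) -> Optional[AttributeSet]:
        """Return the attributes to send, or None when nothing changed."""
        if current == self._last_accepted:
            return None
        return current

    def on_accepted(self, sent: Optional[AttributeSet]) -> None:
        """Remember what the RM accepted so the next heartbeat can skip it."""
        if sent is not None:
            self._last_accepted = sent


class AttributeStore:
    """Hypothetical RM-side store; `updates` counts runs of the costly path."""

    def __init__(self) -> None:
        self.attributes: AttributeSet = frozenset()
        self.updates = 0

    def handle(self, reported: Optional[AttributeSet]) -> bool:
        """Return True when the heartbeat's attribute report is accepted."""
        if reported is None:
            # No change reported: nothing to do, and not an error.
            return True
        if reported != self.attributes:
            # Only here would the real system need to take a write lock.
            self.attributes = reported
            self.updates += 1
        return True


def heartbeat(reporter: AttributeReporter, store: AttributeStore,
              current: AttributeSet) -> Optional[AttributeSet]:
    """Run one simulated heartbeat and return what was sent (None if nothing)."""
    sent = reporter.payload(current)
    if store.handle(sent):
        reporter.on_accepted(sent)
    return sent
```

With this model, the sequence from my test (attributes set to 2.6, an unchanged heartbeat, then the value changed to 2.7) would send the full set, then nothing, then the new set, and the update path on the RM side would run twice in total.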


> Updating distributed node attributes only when necessary
> --------------------------------------------------------
>                 Key: YARN-8925
>                 URL: https://issues.apache.org/jira/browse/YARN-8925
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.2.1
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>              Labels: performance
>         Attachments: YARN-8925.001.patch, YARN-8925.002.patch, YARN-8925.003.patch, YARN-8925.004.patch,
> YARN-8925.005.patch, YARN-8925.006.patch, YARN-8925.007.patch
>
> Currently, if distributed node attributes exist, they are updated on every heartbeat between
> NM and RM, even when nothing has changed. The update holds NodeAttributesManagerImpl#writeLock,
> which may affect performance in a large cluster. We have found that the nodes UI of a large
> cluster opens slowly and spends most of its time waiting for the lock in NodeAttributesManagerImpl.
> I think this update should be performed only when necessary, to improve the performance of
> the related processes.

