hadoop-yarn-issues mailing list archives

From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3003) Provide API for client to retrieve label to node mapping
Date Fri, 16 Jan 2015 15:34:35 GMT

[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280396#comment-14280396 ]

Varun Saxena commented on YARN-3003:

[~Naganarasimha Garla],
Regarding your second point, I do not think the memory footprint of storing NodeIds in the
NodeLabel class will be too much. We already have a label to NodeLabel mapping anyway. Let us
assume we have 100 node labels in all (even that is on the higher side, I guess) and 4000
nodes. Even if every node label is attached to every node, we will have a set of 4000 NodeIds
for every node label. A NodeId merely stores a host and a port (let us assume the host is 20
bytes long and the port is 4 bytes). As a String normally allocates more memory than strictly
required, let us assume each NodeId occupies 50 bytes. This adds up to 100*4000*50 bytes, or
around 20 MB, of additional memory. Even if we double it, this won't come to more than 40-50 MB
of additional memory.
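
As a rough check of the estimate above, the arithmetic can be sketched in a few lines of Java. The class and method names are hypothetical and exist only for illustration, and the 50 bytes per NodeId is the assumption made in this comment, not a measured JVM object size.

```java
// Back-of-the-envelope estimate only. The per-NodeId size is an assumed
// figure taken from the discussion, not a measurement of real heap usage.
class LabelMemoryEstimate {

    /** Worst case: every label keeps a set that contains every node. */
    static long estimateBytes(long labels, long nodes, long bytesPerNodeId) {
        return labels * nodes * bytesPerNodeId;
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(100, 4000, 50);
        // Report in decimal megabytes (1 MB = 1,000,000 bytes).
        System.out.println(bytes + " bytes, about " + (bytes / 1_000_000) + " MB");
    }
}
```

Doubling the assumed per-NodeId size to 100 bytes gives 40,000,000 bytes, which is where the 40-50 MB upper bound comes from.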

Now, if we do not store the labels to nodes mapping separately, we will have to iterate over
all 4000 nodes (nodeCollections, which is a ConcurrentHashMap) on each lookup. I do not think
that is worth it performance-wise, considering that the additional memory is just a few MB. I
am not sure how often the client will call {{getLabelsToNodes}}, but as this is a public API
we cannot restrict how clients behave. If the memory footprint were very high, it would have
been a different matter.
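
The trade-off described above can be illustrated with a minimal sketch that keeps both mappings side by side. This is not the ResourceManager code: the class, its fields, and its methods are hypothetical, and plain strings of the form host:port stand in for NodeId. It only contrasts a full scan over the per-node map with a direct lookup in a separately maintained label to nodes map.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; names and types are assumptions, not YARN classes.
class LabelIndexSketch {
    // node -> labels: stand-in for the existing per-node bookkeeping.
    private final Map<String, Set<String>> nodeToLabels = new ConcurrentHashMap<>();
    // label -> nodes: the extra mapping that costs memory but avoids a scan.
    private final Map<String, Set<String>> labelToNodes = new ConcurrentHashMap<>();

    /** Records the label on the node and updates both mappings. */
    void addLabel(String node, String label) {
        nodeToLabels.computeIfAbsent(node, k -> ConcurrentHashMap.newKeySet()).add(label);
        labelToNodes.computeIfAbsent(label, k -> ConcurrentHashMap.newKeySet()).add(node);
    }

    /** Without the extra mapping: visit every node and check its labels. */
    Set<String> nodesByScan(String label) {
        Set<String> result = new HashSet<>();
        for (Map.Entry<String, Set<String>> entry : nodeToLabels.entrySet()) {
            if (entry.getValue().contains(label)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    /** With the extra mapping: one map lookup, then a defensive copy. */
    Set<String> nodesByIndex(String label) {
        return new HashSet<>(labelToNodes.getOrDefault(label, Collections.emptySet()));
    }

    public static void main(String[] args) {
        LabelIndexSketch sketch = new LabelIndexSketch();
        sketch.addLabel("host1:8041", "gpu");
        sketch.addLabel("host2:8041", "gpu");
        sketch.addLabel("host2:8041", "ssd");
        // Both approaches return the same nodes; only the lookup cost differs.
        System.out.println(sketch.nodesByScan("gpu").equals(sketch.nodesByIndex("gpu")));
    }
}
```

The scan touches every node on each call, while the indexed lookup touches only the nodes that carry the label, at the price of the extra sets estimated earlier.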

[~leftnoteasy] and [~tedyu] can comment on this as well.

> Provide API for client to retrieve label to node mapping
> --------------------------------------------------------
>                 Key: YARN-3003
>                 URL: https://issues.apache.org/jira/browse/YARN-3003
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: client, resourcemanager
>            Reporter: Ted Yu
>            Assignee: Varun Saxena
> Currently YarnClient#getNodeToLabels() returns the mapping from NodeId to set of labels
> associated with the node.
> Client (such as Slider) may be interested in label to node mapping - given label, return
> the nodes with this label.

This message was sent by Atlassian JIRA
