hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4719) Add a helper library to maintain node state and allows common queries
Date Wed, 02 Mar 2016 22:49:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176644#comment-15176644 ]

Karthik Kambatla commented on YARN-4719:

[~leftnoteasy] - thanks for chiming in; you make some valid points.

Since we are building a library for node tracking, I would like us to restrict access to the
map/set of tracked nodes so that it changes only through addNode and removeNode. That way,
total_cluster_resources, total_inflated_cluster_resources (for YARN-1011), and
max_cluster_resources are not affected by other scheduler code. Do you think this is a
reasonable goal, at least as long as it doesn't hurt performance?

If yes, we should decide how to handle cases where the scheduler code needs to iterate
through the nodes: (1) we could provide a snapshot copy of the map/set of nodes/nodeIds, or
(2) we could provide a way to iterate while holding the right locks, either by adding more
methods or by adding an abstraction (similar to lambdas) that applies to multiple methods.
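
To make the two options concrete, here is a minimal sketch of what they might look like.
It is an illustration only: TrackedNode, NodeTrackerSketch, snapshot() and forEachNode()
are hypothetical names, not taken from the YARN-4719 patches, and memory in MB stands in
for the real resource totals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical stand-in for a scheduler node; not a YARN class.
final class TrackedNode {
  final String nodeId;
  final long memoryMb;

  TrackedNode(String nodeId, long memoryMb) {
    this.nodeId = nodeId;
    this.memoryMb = memoryMb;
  }
}

// Hypothetical tracker. The node map is private, so the running total can
// only change through addNode and removeNode.
final class NodeTrackerSketch {
  private final Map<String, TrackedNode> nodes = new HashMap<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long totalMemoryMb = 0;

  void addNode(TrackedNode node) {
    lock.writeLock().lock();
    try {
      TrackedNode previous = nodes.put(node.nodeId, node);
      if (previous != null) {
        // Re-adding a known node replaces the old entry in the total.
        totalMemoryMb -= previous.memoryMb;
      }
      totalMemoryMb += node.memoryMb;
    } finally {
      lock.writeLock().unlock();
    }
  }

  TrackedNode removeNode(String nodeId) {
    lock.writeLock().lock();
    try {
      TrackedNode removed = nodes.remove(nodeId);
      if (removed != null) {
        totalMemoryMb -= removed.memoryMb;
      }
      return removed;
    } finally {
      lock.writeLock().unlock();
    }
  }

  long getTotalMemoryMb() {
    lock.readLock().lock();
    try {
      return totalMemoryMb;
    } finally {
      lock.readLock().unlock();
    }
  }

  // Option (1): hand out a copy. Callers iterate without holding any lock,
  // but the copy may be stale by the time they use it.
  List<TrackedNode> snapshot() {
    lock.readLock().lock();
    try {
      return new ArrayList<>(nodes.values());
    } finally {
      lock.readLock().unlock();
    }
  }

  // Option (2): run caller code under the read lock. The action must not
  // call addNode or removeNode, because ReentrantReadWriteLock cannot
  // upgrade a read lock to a write lock and the thread would block forever.
  void forEachNode(Consumer<TrackedNode> action) {
    lock.readLock().lock();
    try {
      nodes.values().forEach(action);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

The trade-off in this sketch is the usual one: snapshot() pays for a copy on every call
but never holds the lock during iteration, while forEachNode() avoids the copy but keeps
writers waiting for as long as the action runs.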


PS: By the way, thanks for pointing out the javadoc for values(). I will clean that up based
on the outcome of the discussion here.

> Add a helper library to maintain node state and allows common queries
> ---------------------------------------------------------------------
>                 Key: YARN-4719
>                 URL: https://issues.apache.org/jira/browse/YARN-4719
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: scheduler
>    Affects Versions: 2.8.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: yarn-4719-1.patch, yarn-4719-2.patch, yarn-4719-3.patch
> The scheduler could use a helper library to maintain node state and allow matching/sorting
> queries. Several reasons for this:
> # Today, a lot of the node state management is done separately in each scheduler. Having
> a single library will take us that much closer to reducing duplication among schedulers.
> # Adding a filtering/matching API would simplify node labels and locality significantly.
> # An API that returns a sorted list for a custom comparator would help YARN-1011, where
> we want to sort by allocation and utilization for continuous/asynchronous and opportunistic
> scheduling respectively.
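
As an illustration of the matching and sorting queries listed in the description, the
following is a minimal sketch assuming a simplified node record. NodeInfoSketch,
NodeQueriesSketch, matching() and sorted() are hypothetical names and do not reflect the
API in the attached patches.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical, simplified view of a node; not a YARN class.
final class NodeInfoSketch {
  final String nodeId;
  final String label;        // node label, empty string if none
  final long allocatedMb;    // memory currently allocated on the node
  final double utilization;  // measured utilization in the range 0.0 to 1.0

  NodeInfoSketch(String nodeId, String label, long allocatedMb, double utilization) {
    this.nodeId = nodeId;
    this.label = label;
    this.allocatedMb = allocatedMb;
    this.utilization = utilization;
  }
}

final class NodeQueriesSketch {
  // Filtering/matching: return the nodes accepted by the caller's predicate.
  static List<NodeInfoSketch> matching(
      Collection<NodeInfoSketch> nodes, Predicate<NodeInfoSketch> filter) {
    return nodes.stream().filter(filter).collect(Collectors.toList());
  }

  // Sorting: return a new list ordered by the caller's comparator,
  // leaving the input collection untouched.
  static List<NodeInfoSketch> sorted(
      Collection<NodeInfoSketch> nodes, Comparator<NodeInfoSketch> comparator) {
    List<NodeInfoSketch> copy = new ArrayList<>(nodes);
    copy.sort(comparator);
    return copy;
  }

  // Example comparators for the two orderings mentioned in the description.
  static final Comparator<NodeInfoSketch> BY_ALLOCATION =
      Comparator.comparingLong((NodeInfoSketch n) -> n.allocatedMb);
  static final Comparator<NodeInfoSketch> BY_UTILIZATION =
      Comparator.comparingDouble((NodeInfoSketch n) -> n.utilization);
}
```

With something of this shape, a label query would be a call such as
NodeQueriesSketch.matching(nodes, n -> n.label.equals("gpu")), and the two scheduling
modes would differ only in the comparator they pass to sorted().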

This message was sent by Atlassian JIRA
