hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3926) Extend the YARN resource model for easier resource-type management and profiles
Date Thu, 23 Jul 2015 20:47:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639466#comment-14639466 ]

Jason Lowe commented on YARN-3926:

Thanks for creating the proposal, Varun!  Some quick comments after a brief review:

xinclude is a simple solution for supporting either a monolithic yarn-site.xml or a separate
file if we stick with the Configuration-based approach.  Code loads yarn-site.xml but users
can always separate out chunks of it and xinclude them.  We do this quite a bit with our configs.
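
To illustrate the xinclude mechanism only, here is a minimal sketch that uses the JDK's own XML parser rather than Hadoop's Configuration class.  The file name resource-types.xml and the example.* property names are hypothetical and are not taken from the proposal.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class XIncludeSketch {
    // Writes a main config that xincludes a second file, parses it with
    // XInclude processing enabled, and returns how many <property>
    // elements are visible in the merged document.
    static int countMergedProperties() {
        try {
            Path dir = Files.createTempDirectory("xinclude-sketch");

            // Hypothetical split-out chunk; the property name is illustrative.
            String included =
                "<configuration>"
              + "<property><name>example.resource-types</name>"
              + "<value>memory,cpu</value></property>"
              + "</configuration>";

            // Main file keeps its own property and pulls in the chunk.
            String main =
                "<configuration xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
              + "<property><name>example.local.setting</name>"
              + "<value>1</value></property>"
              + "<xi:include href=\"resource-types.xml\"/>"
              + "</configuration>";

            Files.write(dir.resolve("resource-types.xml"),
                        included.getBytes(StandardCharsets.UTF_8));
            Path mainFile = dir.resolve("yarn-site.xml");
            Files.write(mainFile, main.getBytes(StandardCharsets.UTF_8));

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);  // XInclude needs namespace awareness
            factory.setXIncludeAware(true);   // resolve xi:include while parsing
            Document doc = factory.newDocumentBuilder().parse(mainFile.toFile());

            // Counts properties from both the main file and the included file.
            return doc.getElementsByTagName("property").getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countMergedProperties());
    }
}
```

The relative href is resolved against the location of the including file, so the split-out chunk can sit next to yarn-site.xml.  In this sketch the merged document exposes both property elements, one from each file.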

As for RM and NM config mismatches, there can always be a problem where the RM is configured
to understand resources A, B, and C while the nodemanager is configured to provide A, B, and
D.  Handshaking during NM registration seems the appropriate way to mitigate this possibility,
although I'm not sure it's necessary to shut down the NM if it is providing a superset of what
the RM schedules.  Reading later in the doc, it appears this is actually intended to be supported
by adding it to NMs then later the RM for rolling upgrades, but earlier it states that any
mismatch, even additional resources, is fatal to NM registration.  That needs to be cleaned
up.
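
A rough sketch of the lenient registration check suggested above, assuming resource types can be compared simply as sets of names.  The class and method names are hypothetical and do not correspond to existing YARN code, and whether a missing type should be fatal is itself a policy choice.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class RegistrationCheckSketch {
    // Resource types the RM schedules that the NM does not offer.
    static Set<String> missingOnNode(Set<String> rmTypes, Set<String> nmTypes) {
        Set<String> missing = new HashSet<>(rmTypes);
        missing.removeAll(nmTypes);
        return missing;
    }

    // Lenient policy (an assumption): accept the NM when it offers every
    // type the RM schedules; extra NM-only types are simply ignored.
    static boolean acceptRegistration(Set<String> rmTypes, Set<String> nmTypes) {
        return missingOnNode(rmTypes, nmTypes).isEmpty();
    }

    public static void main(String[] args) {
        Set<String> rm = new HashSet<>(Arrays.asList("A", "B", "C"));
        Set<String> mismatched = new HashSet<>(Arrays.asList("A", "B", "D"));
        Set<String> superset = new HashSet<>(Arrays.asList("A", "B", "C", "D"));

        System.out.println(acceptRegistration(rm, mismatched)); // NM lacks C
        System.out.println(acceptRegistration(rm, superset));   // NM has extra D
    }
}
```

Under this policy the A, B, D nodemanager from the example is rejected because it lacks C, while a nodemanager offering A, B, C, and D registers normally.  That is the ordering a rolling upgrade needs if new types are added to NMs before the RM.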

A little confused why the sample XML config has mappings of pf1, pf2, etc. to profile names
rather than using the profile names in the config properties directly, as is done with the
concise format examples later.  For example, couldn't it be simplified to:

That being said, I think the sample configs at the end, particularly the JSON form or potentially
a YAML version, would be a welcome sight for those trying to set up and grok the configs.

The sample config in the beginning has a typo, yarn.nodemanager.resource-types.cpu s/b yarn.nodemanager.resource-types.cpu.name.

Overall seems like a reasonable approach to make handling of resource types data driven.
I have some performance concerns about the memory footprint of adding a Map to every resource
and needing to hash/compare strings every time we try to do any computations on it.  The scheduler
loop is already too slow, and this looks like it could add significant overhead to it.  Hopefully
we can mitigate that if it does become a concern, e.g., by translating Resource records coming
across the wire into an efficient internal representation optimized for the resource types
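
One way such a mitigation could look, as a sketch only: resolve resource-type names to array slots once at the wire boundary, so that scheduler-side comparisons become plain array loops with no string hashing.  All names below are hypothetical, and the sketch assumes the set of configured resource types is fixed once the process has started.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DenseResourceSketch {
    // Name-to-slot table, built once from the configured resource types.
    private final Map<String, Integer> indexByName = new HashMap<>();

    DenseResourceSketch(List<String> typeNames) {
        for (int i = 0; i < typeNames.size(); i++) {
            indexByName.put(typeNames.get(i), i);
        }
    }

    // One-time translation of a wire-format map into a dense array.
    // Types this process does not know about are ignored.
    long[] toDense(Map<String, Long> wire) {
        long[] dense = new long[indexByName.size()];
        for (Map.Entry<String, Long> e : wire.entrySet()) {
            Integer idx = indexByName.get(e.getKey());
            if (idx != null) {
                dense[idx] = e.getValue();
            }
        }
        return dense;
    }

    // Hot-path check: no hashing or string comparison, just a loop.
    static boolean fitsIn(long[] request, long[] available) {
        for (int i = 0; i < request.length; i++) {
            if (request[i] > available[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        DenseResourceSketch types =
            new DenseResourceSketch(Arrays.asList("memory", "vcores"));
        Map<String, Long> request = new HashMap<>();
        request.put("memory", 1024L);
        request.put("vcores", 2L);
        Map<String, Long> node = new HashMap<>();
        node.put("memory", 8192L);
        node.put("vcores", 1L);
        System.out.println(fitsIn(types.toDense(request), types.toDense(node)));
    }
}
```

The map lookup cost is paid once per record at translation time instead of on every scheduler computation, and each resource carries a small primitive array rather than a Map.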

> Extend the YARN resource model for easier resource-type management and profiles
> -------------------------------------------------------------------------------
>                 Key: YARN-3926
>                 URL: https://issues.apache.org/jira/browse/YARN-3926
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: Proposal for modifying resource model and profiles.pdf
> Currently, there are efforts to add support for various resource-types such as disk (YARN-2139),
> network (YARN-2140), and HDFS bandwidth (YARN-2681). These efforts all aim to add support for
> a new resource type and are fairly involved efforts. In addition, once support is added, it
> becomes harder for users to specify the resources they need. All existing jobs have to be
> modified, or have to use the minimum allocation.
> This ticket is a proposal to extend the YARN resource model to a more flexible model
> which makes it easier to support additional resource-types. It also considers the related
> aspect of “resource profiles” which allow users to easily specify the various resources
> they need for any given container.

This message was sent by Atlassian JIRA
