hadoop-yarn-issues mailing list archives

From "Varun Vasudev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3926) Extend the YARN resource model for easier resource-type management and profiles
Date Tue, 04 Aug 2015 20:39:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654319#comment-14654319 ]

Varun Vasudev commented on YARN-3926:
-------------------------------------

Thanks for the comments, [~jlowe] and [~asuresh]. My apologies for not responding earlier.

{quote}
As for RM and NM config mismatches, there can always be a problem where the RM is configured
to understand resources A, B, and C while the nodemanager is configured to provide A, B and
D. Handshaking during NM registration seems the appropriate way to mitigate this possibility,
although I'm not sure it's necessary to shut down the NM if it is providing a superset of what
the RM schedules. Reading later in the doc it appears this is actually intended to be supported
by adding it to NMs then later the RM for rolling upgrades, but earlier it states that any
mismatch, even additional resources, is fatal to NM registration. That needs to be cleaned
up.
{quote}

I think most people feel that shutting down the NM is not a good idea. I'm going to go with
just logging warning messages in the RM and NM when their resource types differ. Does that seem OK?
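
To make the warn-only behaviour concrete, here is a minimal sketch of what the check at NM registration could look like. It is not taken from the proposal or from the YARN code base: the class `ResourceTypeHandshake`, its method names and the node id are all hypothetical, and it uses `java.util.logging` only to stay self-contained.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.logging.Logger;

// Hypothetical sketch; none of these names exist in YARN.
final class ResourceTypeHandshake {
    private static final Logger LOG = Logger.getLogger("ResourceTypeHandshake");

    /** Resource types the NM provides that the RM is not configured to schedule. */
    static Set<String> unknownToRm(Set<String> rmTypes, Set<String> nmTypes) {
        Set<String> extra = new TreeSet<>(nmTypes);
        extra.removeAll(rmTypes);
        return extra;
    }

    /** Resource types the RM schedules that the NM does not provide. */
    static Set<String> missingOnNm(Set<String> rmTypes, Set<String> nmTypes) {
        Set<String> missing = new TreeSet<>(rmTypes);
        missing.removeAll(nmTypes);
        return missing;
    }

    /** Logs a warning for each kind of mismatch and always accepts the registration. */
    static boolean register(String nodeId, Set<String> rmTypes, Set<String> nmTypes) {
        Set<String> extra = unknownToRm(rmTypes, nmTypes);
        Set<String> missing = missingOnNm(rmTypes, nmTypes);
        if (!extra.isEmpty()) {
            LOG.warning(nodeId + " provides resource types unknown to the RM: " + extra);
        }
        if (!missing.isEmpty()) {
            LOG.warning(nodeId + " does not provide resource types the RM schedules: " + missing);
        }
        return true; // warn only; a mismatch is never fatal in this sketch
    }

    public static void main(String[] args) {
        // The example from the quote: RM understands A, B, C; NM provides A, B, D.
        Set<String> rm = new TreeSet<>(Arrays.asList("A", "B", "C"));
        Set<String> nm = new TreeSet<>(Arrays.asList("A", "B", "D"));
        register("node-1", rm, nm);
    }
}
```

With the RM configured for A, B and C and the NM providing A, B and D, this logs one warning about D and one about C, and the NM still registers.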

{quote}
A little confused why the sample xml config has mappings of pf1,pf2, etc. to profile names
rather than using the profile names in the config properties directly like is done with the
concise format examples later.
{quote}

Good point. Arun had similar feedback. I'll change this.

{quote}
Overall seems like a reasonable approach to make handling of resource types data driven. I
have some performance concerns on the memory footprint impact of adding a Map to every resource
and needing to hash/compare strings every time we try to do any computations on it. The scheduler
loop is already too slow, and this looks like it could add significant overhead to it. Hopefully
we can mitigate that if it does become a concern, e.g.: translating Resource records coming
across the wire into an efficient internal representation optimized for the resource types
configured.
{quote}

I'll make sure to do some performance tests as part of the development.
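
For reference, the mitigation suggested in the quote could look roughly like the sketch below: resource names are resolved to array slots once, when a record comes off the wire, so that comparisons in the scheduler loop touch only `long[]` arrays. This is only an illustration under my own assumptions; `ResourceIndex`, `toDense` and `fitsIn` are hypothetical names, not existing YARN classes or methods.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the YARN Resource implementation.
final class ResourceIndex {
    // Maps each configured resource-type name to a fixed array slot.
    private final Map<String, Integer> slotByName = new HashMap<>();

    ResourceIndex(List<String> configuredTypes) {
        for (String name : configuredTypes) {
            // Duplicates are ignored, so slots stay contiguous.
            slotByName.putIfAbsent(name, slotByName.size());
        }
    }

    /** Translates a name-keyed record into a dense array; unknown names are dropped. */
    long[] toDense(Map<String, Long> wire) {
        long[] dense = new long[slotByName.size()];
        for (Map.Entry<String, Long> e : wire.entrySet()) {
            Integer slot = slotByName.get(e.getKey());
            if (slot != null) {
                dense[slot] = e.getValue();
            }
        }
        return dense;
    }

    /** True if every component of the request fits; no string hashing or comparison. */
    static boolean fitsIn(long[] request, long[] available) {
        for (int i = 0; i < request.length; i++) {
            if (request[i] > available[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The string hashing then happens once per incoming record rather than on every computation, which is the part the performance tests would need to confirm.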

{quote}
Instead of having to explicitly mark a resource as “countable”, can’t we just assume
that's the default and instead require “uncountable” types to be explicitly specified (once
we start supporting it)?
{quote}

Fair point. I'll use this approach.
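
As a rough illustration of the countable-by-default rule: only the uncountable types are listed, and everything else is treated as countable. The class name `ResourceTypeKinds` and the comma-separated list format are assumptions made for this sketch, not part of the proposal.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; the comma-separated format is an assumption.
final class ResourceTypeKinds {
    private final Set<String> uncountable = new HashSet<>();

    /** @param uncountableCsv comma-separated uncountable type names; may be null or empty. */
    ResourceTypeKinds(String uncountableCsv) {
        if (uncountableCsv != null) {
            for (String name : uncountableCsv.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    uncountable.add(trimmed);
                }
            }
        }
    }

    /** A resource type is countable unless it was explicitly listed as uncountable. */
    boolean isCountable(String resourceType) {
        return !uncountable.contains(resourceType);
    }
}
```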

> Extend the YARN resource model for easier resource-type management and profiles
> -------------------------------------------------------------------------------
>
>                 Key: YARN-3926
>                 URL: https://issues.apache.org/jira/browse/YARN-3926
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: Proposal for modifying resource model and profiles.pdf
>
>
> Currently, there are efforts to add support for various resource-types such as disk (YARN-2139),
> network (YARN-2140), and HDFS bandwidth (YARN-2681). These efforts all aim to add support for
> a new resource type and are fairly involved efforts. In addition, once support is added, it
> becomes harder for users to specify the resources they need. All existing jobs have to be
> modified, or have to use the minimum allocation.
> This ticket is a proposal to extend the YARN resource model to a more flexible model
> which makes it easier to support additional resource-types. It also considers the related
> aspect of “resource profiles” which allow users to easily specify the various resources
> they need for any given container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
