hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
Date Tue, 10 Oct 2017 16:51:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198947#comment-16198947

Wangda Tan commented on YARN-6620:

Thanks for the review, [~sunilg]. 

For #1, the JIRA is YARN-7159. I'm not sure it is a valid use case for a client to define
a different unit / type in resource-types.xml, and in YARN-7307 we're considering removing
the client-side resource-types.xml. Let's discuss further on the related JIRAs.

For #2, it might be better to rename "mandatory" to "first-class", so that once FPGA support
is added to YARN, it is a first-class resource rather than a mandatory resource. I think it
would be valuable to have a list of first-class resources supported by YARN, and to not allow
clients to change their core properties (including unit and type, but not min/max value). With
this we can offer a better out-of-the-box user experience, because most use cases should be
addressed by first-class resources.
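
To make the idea concrete, here is a rough sketch of how such a check could look. It is only
an illustration under my own assumptions: FirstClassResources, Spec and resolve() are
hypothetical names, not existing YARN classes, and the built-in entries and bounds are made up
for the example. Unit and type of a first-class resource are fixed, while min/max can still be
overridden.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are existing YARN classes.
final class FirstClassResources {

    /** Core properties (unit, type) plus tunable bounds (min, max). */
    static final class Spec {
        final String unit;
        final String type;
        final long min;
        final long max;

        Spec(String unit, String type, long min, long max) {
            this.unit = unit;
            this.type = type;
            this.min = min;
            this.max = max;
        }
    }

    // Illustrative built-in list; the entries and bounds are assumptions.
    private static final Map<String, Spec> FIRST_CLASS = new HashMap<>();
    static {
        FIRST_CLASS.put("memory-mb", new Spec("Mi", "COUNTABLE", 0, Long.MAX_VALUE));
        FIRST_CLASS.put("vcores", new Spec("", "COUNTABLE", 0, Long.MAX_VALUE));
        FIRST_CLASS.put("yarn.io/gpu", new Spec("", "COUNTABLE", 0, Long.MAX_VALUE));
    }

    /**
     * Returns the effective spec for a resource. For a first-class resource the
     * requested unit and type must match the built-in ones; only min/max are
     * taken from the request. Other resources are accepted as given.
     */
    static Spec resolve(String name, Spec requested) {
        Spec builtin = FIRST_CLASS.get(name);
        if (builtin == null) {
            return requested;
        }
        if (!builtin.unit.equals(requested.unit) || !builtin.type.equals(requested.type)) {
            throw new IllegalArgumentException(
                "Cannot change unit/type of first-class resource " + name);
        }
        return new Spec(builtin.unit, builtin.type, requested.min, requested.max);
    }

    public static void main(String[] args) {
        // Overriding max is allowed.
        Spec gpu = resolve("yarn.io/gpu", new Spec("", "COUNTABLE", 0, 8));
        System.out.println("gpu max = " + gpu.max);

        // Overriding the unit is rejected.
        try {
            resolve("yarn.io/gpu", new Spec("G", "COUNTABLE", 0, 8));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Whether a mismatch should fail hard or just be ignored with a warning is a design choice; the
sketch rejects it so that a misconfigured client is noticed early.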

To me NUMA is a little different here. Today we don't support NUMA as a separate resource;
instead it is an additional affinity for known resource types (cpu/memory). So until we want
to support centralized NUMA allocation in the RM/scheduler, we don't need to model NUMA as a
separate resource type. And once we need to do that, some refactoring needs to be done to the
existing resource API, since it cannot describe hierarchy and relations between resource units.

> [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
> -------------------------------------------------------------------------------------
>                 Key: YARN-6620
>                 URL: https://issues.apache.org/jira/browse/YARN-6620
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-6620.001.patch, YARN-6620.002.patch, YARN-6620.003.patch, YARN-6620.004.patch,
> YARN-6620.005.patch, YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, YARN-6620.009.patch,
> YARN-6620.010.patch, YARN-6620.011.patch, YARN-6620.012.patch, YARN-6620.013.patch, YARN-6620.014.patch,
> YARN-6620.015.patch, YARN-6620.016.patch, YARN-6620.017.patch
>
> This JIRA plans to add support for:
> 1) GPU configuration for NodeManagers
> 2) Isolation in CGroups (Java side)
> 3) NM restart and recovery of allocated GPU devices
