hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6852) [YARN-6223] Native code changes to support isolate GPU devices by using CGroups
Date Wed, 13 Sep 2017 18:16:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165065#comment-16165065 ]

Wangda Tan commented on YARN-6852:
----------------------------------

[~tangzhankun],

Thanks for adding the reference to the ongoing K8S proposals.

I quickly read both proposals. To me, the hw-accelerator proposal looks like a long-term goal
that could be done 1-2 years from now. IMHO, the use of hardware accelerators on such platforms
(K8S/YARN) is still in an early phase: people are trying to move some workloads from bare metal
or HPC to these platforms. It will become an important requirement once more workloads that need
GPU/FPGA have landed. We can either make some non-intrusive changes, like adding node attributes
for device types / versions, or make more comprehensive changes to support topology, etc. To me
the first option is straightforward. The second option is not only a challenge for device
isolation; it also changes how applications ask for resources and how the scheduler deals with
those asks. The K8S proposal for solving the scheduling problem looks too simple to me; it won't
meet YARN's scheduling performance requirements.

A device manager would be a nice-to-have feature; I will think more about it while
working on YARN-6620. The K8S proposal makes it very flexible to add new resource types, but it
is also very heavyweight. For example, each resource plugin needs to implement its own logic
to store state, etc. And managing plugins might be a challenge for today's YARN.

> [YARN-6223] Native code changes to support isolate GPU devices by using CGroups
> -------------------------------------------------------------------------------
>
>                 Key: YARN-6852
>                 URL: https://issues.apache.org/jira/browse/YARN-6852
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>             Fix For: 3.0.0-beta1
>
>         Attachments: YARN-6852.001.patch, YARN-6852.002.patch, YARN-6852.003.patch, YARN-6852.004.patch,
>                      YARN-6852.005.patch, YARN-6852.006.patch, YARN-6852.007.patch, YARN-6852.008.patch, YARN-6852.009.patch
>
>
> This JIRA plans to add support for:
> 1) Isolation in CGroups (native side).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

