hadoop-yarn-issues mailing list archives

From "Zhankun Tang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5983) [Umbrella] Support for FPGA as a Resource in YARN
Date Fri, 28 Apr 2017 03:55:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988143#comment-15988143 ]

Zhankun Tang commented on YARN-5983:

[~wangda], thanks for the review.
Yes, I agree that YARN-3409 supplements YARN-3926 to improve RM scheduling (scheduling
preference) but is not related to consumable resources.
Sorry for my misleading "exclusive" concept. Regarding exclusive/non-exclusive resources,
maybe *exclusive/non-exclusive consumable resource* is a better term?
IMO, consumable resources can be classified into two categories. One type is the "pooled resource",
such as CPU, memory, bandwidth and blkio, which is abstracted by the OS. The other type
is the "not-easy/possible-to-share resource", such as GPU, FPGA, SSD disk or network port. These resources
are going to be first-class citizens but need more assistance from YARN to be exposed to applications.

For the unclear sentences in the design doc:
No NM side resource management of FPGA resource. For instance, dynamically resource discovery,
monitoring and preparation before container launch
-> The NM should be able to discover the FPGA devices automatically and save their attributes
to some storage. For an allocated container, the NM should download the requested IP and flash
it onto the scheduled device. The NM also needs to health-check the FPGA device and the
process that is using it.
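
To make the NM-side steps above more concrete, here is a minimal sketch of the discover / prepare / health-check flow. It is only an illustration under stated assumptions: the class names, the "minor number" attribute, and the in-memory maps are all hypothetical and are not YARN APIs, and the download and flash steps are reduced to recording the IP ID on the device.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical device record; the attributes are assumptions, not YARN types.
final class FpgaDevice {
    final int minor;        // assumed identifier for the device
    String flashedIpId;     // ID of the IP currently flashed, or null if none
    boolean healthy = true; // result of the most recent health check

    FpgaDevice(int minor) {
        this.minor = minor;
    }
}

// Hypothetical sketch of the NM-side bookkeeping described in the comment.
final class FpgaNodeManagerSketch {
    // Stands in for "some storage" of discovered device attributes.
    private final Map<Integer, FpgaDevice> devices = new TreeMap<>();
    // Which device each running container has been given.
    private final Map<String, Integer> containerToDevice = new HashMap<>();

    // Discovery: a real NM would probe the hardware; here the list is passed in.
    void discover(List<Integer> minors) {
        for (int minor : minors) {
            devices.putIfAbsent(minor, new FpgaDevice(minor));
        }
    }

    // Preparation before container launch: pick a free healthy device and
    // "flash" the requested IP (the download and flash are only simulated).
    int prepare(String containerId, String ipId) {
        for (FpgaDevice device : devices.values()) {
            boolean inUse = containerToDevice.containsValue(device.minor);
            if (device.healthy && !inUse) {
                device.flashedIpId = ipId;
                containerToDevice.put(containerId, device.minor);
                return device.minor;
            }
        }
        throw new IllegalStateException("no free healthy FPGA device");
    }

    // Called when the container finishes, so the device can be reused.
    void release(String containerId) {
        containerToDevice.remove(containerId);
    }

    // Health check result reported for a device.
    void markUnhealthy(int minor) {
        FpgaDevice device = devices.get(minor);
        if (device != null) {
            device.healthy = false;
        }
    }

    String flashedIpId(int minor) {
        return devices.get(minor).flashedIpId;
    }

    List<Integer> healthyDevices() {
        List<Integer> result = new ArrayList<>();
        for (FpgaDevice device : devices.values()) {
            if (device.healthy) {
                result.add(device.minor);
            }
        }
        return result;
    }
}
```

In this sketch a device is handed to at most one container at a time, which matches the "not-easy/possible-to-share" category described earlier; whether the real design shares a device between containers is not stated here.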

AM set the IP UUID/name in container environment and sends requests to NM to launch the allocated
-> As mentioned in our offline meeting, the user has to provide the AM with the ID of the
desired IP to be flashed onto the FPGA device, and the AM then sets it in the container environment.
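
As a rough illustration of passing the IP ID through the container environment, the sketch below builds the environment map on the AM side and reads it back on the NM side. The variable name FPGA_IP_ID and the class are assumptions made for this example; the actual key is not specified in this comment. In YARN the resulting map would presumably be handed to ContainerLaunchContext#setEnvironment, which is left out so that the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the environment key below is an assumed name.
final class FpgaEnvSketch {
    static final String FPGA_IP_ID_KEY = "FPGA_IP_ID";

    // AM side: copy the base environment and add the user-provided IP ID.
    static Map<String, String> withIpId(Map<String, String> baseEnv, String ipId) {
        if (ipId == null || ipId.isEmpty()) {
            throw new IllegalArgumentException("IP ID must be provided by the user");
        }
        Map<String, String> env = new HashMap<>(baseEnv);
        env.put(FPGA_IP_ID_KEY, ipId);
        return env;
    }

    // NM side: look up the requested IP ID; returns null if none was set.
    static String requestedIpId(Map<String, String> env) {
        return env.get(FPGA_IP_ID_KEY);
    }
}
```

Copying the base map instead of mutating it keeps one container's IP request from leaking into the environment of another container launched by the same AM.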

> [Umbrella] Support for FPGA as a Resource in YARN
> -------------------------------------------------
>                 Key: YARN-5983
>                 URL: https://issues.apache.org/jira/browse/YARN-5983
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: yarn
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>         Attachments: YARN-5983-Support-FPGA-resource-on-NM-side_v1.pdf
> As various big data workloads run on YARN, CPU will eventually no longer scale and
heterogeneous systems will become more important. ML/DL has been a rising star in recent years, and
applications focused on these areas have to utilize GPU or FPGA to boost performance.
Hardware vendors such as Intel also invest in such hardware. It is most likely that FPGA will
become as popular in data centers as CPU in the near future.
> So it would be great for YARN, as a resource managing and scheduling system, to evolve to support
this. This JIRA proposes FPGA to be a first-class citizen. The changes roughly include:
> 1. FPGA resource detection and heartbeat
> 2. Scheduler changes
> 3. FPGA related preparation and isolation before container launch
> We know that YARN-3926 is trying to extend the current resource model. But we can still leave
some FPGA related discussion here.

