hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6720) Support updating FPGA related constraint node label after FPGA device re-configuration
Date Wed, 05 Jul 2017 21:18:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075443#comment-16075443 ]

Wangda Tan commented on YARN-6720:


bq.  YARN-3409 wouldn't be a blocker since this JIRA is an improvement of YARN-6507.
I'm not sure how to support device meta info in the global (RM) scheduler without YARN-3409,
and I couldn't find the answer in the attached design doc. Could you explain the solution
you have in mind?

Anyway, I'm in favor of a general approach that other features can also use, instead of
customizing the RM scheduler to support FPGA requirements. GPU support is more sensitive to GPU
type than to firmware, but I can see that Docker support could be improved a lot if we could
schedule containers to a node that already has the Docker image localized.

> Support updating FPGA related constraint node label after FPGA device re-configuration
> --------------------------------------------------------------------------------------
>                 Key: YARN-6720
>                 URL: https://issues.apache.org/jira/browse/YARN-6720
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Zhankun Tang
>         Attachments: Storing-and-Updating-extra-FPGA-resource-attributes-in-hdfs_v1.pdf
> To provide globally optimal scheduling for mutable FPGA resources, an easy and direct
approach seems to be to use constraint node labels (YARN-3409) instead of extending the
global scheduler (YARN-3926) to match both resource count and attributes.
> The rough idea is that the AM sets a constraint node label expression to request containers
on nodes whose FPGA devices have the matching IP, and the NM resource handler then updates the
node constraint label whenever an FPGA device is re-configured.
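
As a rough illustration of the idea in the description above, the following toy model shows label-based matching and a label update after re-configuration. It is only a sketch under stated assumptions: the `Node` class, the `fpga.ip` label key, the `on_fpga_reconfigured` hook, and the IP names are all hypothetical and are not part of YARN or of the attached design doc.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a NodeManager and its constraint labels (hypothetical)."""
    name: str
    labels: dict = field(default_factory=dict)

    def on_fpga_reconfigured(self, new_ip: str) -> None:
        # Hypothetical NM resource-handler hook: refresh the label after
        # the FPGA device has been re-programmed with a different IP.
        self.labels["fpga.ip"] = new_ip


def matching_nodes(nodes, key, value):
    """Return the names of nodes whose label `key` equals `value`."""
    return [n.name for n in nodes if n.labels.get(key) == value]


nodes = [
    Node("node1", {"fpga.ip": "matrix_mul"}),
    Node("node2", {"fpga.ip": "gzip"}),
]

# The AM asks for containers on nodes whose FPGA already holds the "gzip" IP.
before = matching_nodes(nodes, "fpga.ip", "gzip")

# node1's FPGA is re-configured, so its handler updates the label.
nodes[0].on_fpga_reconfigured("gzip")
after = matching_nodes(nodes, "fpga.ip", "gzip")
```

In this sketch the scheduler only compares label values, so a re-configured node becomes eligible as soon as its label changes, without the scheduler needing to know anything FPGA-specific.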

