hadoop-yarn-issues mailing list archives

From "Eric Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8220) Running Tensorflow on YARN with GPU and Docker - Examples
Date Fri, 01 Jun 2018 16:33:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498199#comment-16498199 ]

Eric Yang commented on YARN-8220:

[~sunilg] Thank you for the patch. A couple of suggestions:

1. Avoid using a bash-style launch command.  Although it works, using ENTRYPOINT and CMD in
the Dockerfile greatly improves security and readability.  For example:

WORKDIR /test/models/tutorials/image/cifar10_estimator
# ENTRYPOINT (exec form) fixes the executable; CMD supplies its default arguments.
ENTRYPOINT ["/usr/bin/python", "cifar10_main.py"]
# Only the last CMD in a Dockerfile takes effect, so all defaults go in a single CMD.
CMD ["--data-dir=hdfs:///tmp/cifar-10-data", \
     "--job-dir=hdfs:///tmp/cifar-10-jobdir", \
     "--train-steps=10000", \
     "--eval-batch-size=16", \
     "--train-batch-size=16", \
     "--sync", \
     "--num-gpus=2"]

This simplifies the yarnfile and prevents the script from running in the wrong directory if
the working directory doesn't exist.
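
To illustrate why this works, here is a minimal Python sketch of how Docker composes a
container's argument vector from an exec-form ENTRYPOINT and CMD. It is a simplified model,
not Docker's implementation: the helper names effective_cmd and container_argv are
hypothetical, and shell-form instructions and --entrypoint overrides are ignored. It also
models the rule that when a Dockerfile has several CMD instructions, only the last one is
kept.

```python
from typing import List, Optional, Sequence


def effective_cmd(cmd_instructions: Sequence[List[str]]) -> List[str]:
    """Return the CMD Docker keeps: only the last CMD instruction counts."""
    return list(cmd_instructions[-1]) if cmd_instructions else []


def container_argv(entrypoint: List[str],
                   cmd: List[str],
                   run_args: Optional[List[str]] = None) -> List[str]:
    """Simplified model (assumption: exec form only).

    ENTRYPOINT is always kept. CMD provides default arguments, and any
    arguments passed after the image name on `docker run` replace CMD
    as a whole rather than being merged with it.
    """
    args = run_args if run_args else cmd
    return list(entrypoint) + list(args)


ENTRYPOINT = ["/usr/bin/python", "cifar10_main.py"]
DEFAULTS = ["--data-dir=hdfs:///tmp/cifar-10-data", "--train-steps=10000"]

# No extra arguments: the CMD defaults are appended to ENTRYPOINT.
default_argv = container_argv(ENTRYPOINT, DEFAULTS)

# Extra arguments: they replace the CMD defaults entirely.
override_argv = container_argv(ENTRYPOINT, DEFAULTS, ["--train-steps=100"])
```

Because the executable is fixed by ENTRYPOINT, a launcher can only change the arguments, not
substitute an arbitrary shell command, which is the security benefit mentioned above.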

2. It might be good to showcase some yarnfile features:

  "configuration": {
    "env": {

This helps to showcase how to mount configuration files from host disks, and use ENTRYPOINT

3. Downloading source code from individual GitHub contributors might be risky and prone to
breakage.  If the source is small enough and donated to Apache, it would be better to host them

> Running Tensorflow on YARN with GPU and Docker - Examples
> ---------------------------------------------------------
>                 Key: YARN-8220
>                 URL: https://issues.apache.org/jira/browse/YARN-8220
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn-native-services
>            Reporter: Sunil Govindan
>            Assignee: Sunil Govindan
>            Priority: Critical
>         Attachments: YARN-8220.001.patch
> Tensorflow could be run on YARN and could leverage YARN's distributed features.
> This spec file will help to run Tensorflow on YARN with GPU/Docker
