spark-reviews mailing list archives

From lianhuiwang <>
Subject [GitHub] spark pull request: [SPARK-5173]support python application running...
Date Sun, 01 Feb 2015 06:31:19 GMT
GitHub user lianhuiwang reopened a pull request:

    [SPARK-5173]support python application running on yarn cluster mode

    Currently, spark-submit does not support running a Python application in yarn-cluster
mode. This change modifies spark-submit and YARN's ApplicationMaster (AM) to support it.
    By specifying a .py file or primaryResource file via spark-submit, we can make pyspark
run in yarn-cluster mode.
    example: spark-submit --master yarn-cluster --num-executors 1 --driver-memory 1g --executor-memory
1g --primaryResource yy.conf
    The configuration is the same as for pyspark in yarn-client mode.
    First, the local path of the .py file or primaryResource is added to YARN's dist.files so that
it is distributed to the slave nodes. Then spark-submit passes --py-files and --primaryResource
to yarn.Client and sets "org.apache.spark.deploy.PythonRunner" as the user class, which can run
.py files on the ApplicationMaster.
    yarn.Client in turn passes --py-files and --primaryResource to the ApplicationMaster.
    In the ApplicationMaster, the user class is org.apache.spark.deploy.PythonRunner and the user
args are primaryResource and --py-files, so pyspark runs on the ApplicationMaster.
    @JoshRosen @tgravescs @sryza
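
    The hand-off described above can be illustrated with a small sketch. This is not the code from the patch (which changes spark-submit and the YARN ApplicationMaster in Scala); it only models the idea in Python. The function name `plan_python_submission`, the dictionary keys, and the exact ordering of the user args are assumptions made for illustration; only the class name org.apache.spark.deploy.PythonRunner comes from the description.

```python
# Illustrative model of the flow in the description, NOT the actual patch.
# Class name taken from the PR description.
PYTHON_RUNNER = "org.apache.spark.deploy.PythonRunner"


def plan_python_submission(primary_resource, py_files, app_args):
    """Return a hypothetical plan for a yarn-cluster Python submission.

    primary_resource: path of the main .py file
    py_files:         extra Python files (the --py-files list)
    app_args:         arguments meant for the user's own script
    """
    if not primary_resource.endswith(".py"):
        raise ValueError("this sketch only handles a .py primary resource")

    # Step 1: the main file and its dependencies are shipped to the cluster
    # nodes (the description does this through YARN's dist.files).
    dist_files = [primary_resource] + list(py_files)

    # Step 2: the user class becomes PythonRunner, and the user args carry
    # the primary resource and the py-files list (ordering assumed here),
    # followed by the script's own arguments.
    user_args = [primary_resource, ",".join(py_files)] + list(app_args)

    return {
        "user_class": PYTHON_RUNNER,
        "user_args": user_args,
        "dist_files": dist_files,
    }


if __name__ == "__main__":
    plan = plan_python_submission("app.py", ["dep1.py", "dep2.zip"], ["--n", "3"])
    for key, value in plan.items():
        print(key, "=", value)
```

    The point of the sketch is that the ApplicationMaster never needs to know it is launching Python: it just runs the user class it is given, and that class receives the script and its dependencies as ordinary arguments.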

You can merge this pull request into a Git repository by running:

    $ git pull SPARK-5173

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3976
commit 9c941bc59527e594ee1d155c00cb8e55d7c40fe8
Author: lianhuiwang <>
Date:   2015-01-09T12:58:24Z

    support python application running on yarn cluster mode

commit 172eec10b9daaf9ed838e821474d28871ab63462
Author: Wang Lianhui <>
Date:   2015-01-09T15:01:52Z

    fix a min submit's bug

commit f1f55b6eb4b65499be8e182e857d89a158873234
Author: lianhuiwang <>
Date:   2015-01-29T11:13:35Z

    when yarn-cluster, all python files can be non-local

commit 905a10610532578c774e58d12b927597330fb9ff
Author: lianhuiwang <>
Date:   2015-01-31T03:29:09Z

    update with sryza and andrewor 's comments

commit 097a5ec37456bf9d13a952f4108a750b9f9f84d0
Author: lianhuiwang <>
Date:   2015-01-31T03:59:06Z

    fix line length exceeds 100

commit 5b300648fe53d9de604e8afce7580fddfe6bbaef
Author: lianhuiwang <>
Date:   2015-01-31T12:18:22Z

    add test

commit d60bc6069cf65637622472ef1cd27153333df53c
Author: lianhuiwang <>
Date:   2015-01-31T14:07:03Z

    fix test

commit 2adc8f591ddd0f253496c18d32b1910d29e04c8d
Author: lianhuiwang <>
Date:   2015-01-31T16:35:01Z

    add spark.test.home

commit 47d2fc35e53a8851790607085bc67e94736358d6
Author: lianhuiwang <>
Date:   2015-02-01T02:40:25Z

    fix test

