spark-issues mailing list archives

From "Hyukjin Kwon (JIRA)" <>
Subject [jira] [Commented] (SPARK-21945) pyspark --py-files doesn't work in yarn client mode
Date Thu, 24 May 2018 16:48:00 GMT


Hyukjin Kwon commented on SPARK-21945:

To be more precise, based on my investigation so far, the paths are added as given. That's fine
for a zip archive, but for a .py file the path itself shouldn't be added as is (its parent directory
should be added instead). So yes, for .py files we should copy them too.

It's odd, but I think this all comes from the fact that we happened to support .py files in the
same option, whereas PYTHONPATH doesn't expect a file.
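
The point about PYTHONPATH entries can be checked outside Spark with a small sketch. It only exercises the standard Python import machinery, not Spark's own handling of --py-files, and the names used (helper_mod, zipped_mod, deps.zip, can_import) are hypothetical and chosen for illustration.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def can_import(module_name, path_entry):
    """Return True if module_name imports with path_entry prepended to sys.path."""
    sys.path.insert(0, path_entry)
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False
    finally:
        # Undo the changes so each check starts from the same state.
        sys.path.remove(path_entry)
        sys.modules.pop(module_name, None)


workdir = tempfile.mkdtemp()

# A plain .py file, standing in for a file passed via --py-files (hypothetical name).
py_file = os.path.join(workdir, "helper_mod.py")
with open(py_file, "w") as f:
    f.write("VALUE = 1\n")

# A zip archive holding a differently named module (hypothetical name).
zip_file = os.path.join(workdir, "deps.zip")
with zipfile.ZipFile(zip_file, "w") as zf:
    zf.writestr("zipped_mod.py", "VALUE = 2\n")

file_entry_works = can_import("helper_mod", py_file)   # path entry is the .py file itself
parent_dir_works = can_import("helper_mod", workdir)   # path entry is its parent directory
zip_entry_works = can_import("zipped_mod", zip_file)   # path entry is the zip archive

print(file_entry_works, parent_dir_works, zip_entry_works)  # False True True
```

A zip archive is a valid path entry on its own, while a .py file is only found through the directory that contains it, which is why the two cases need different treatment.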

> pyspark --py-files doesn't work in yarn client mode
> ---------------------------------------------------
>                 Key: SPARK-21945
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.2.0
>            Reporter: Thomas Graves
>            Assignee: Hyukjin Kwon
>            Priority: Major
>             Fix For: 2.3.1, 2.4.0
> I tried running pyspark with --py-files but it doesn't properly add the zip file to the PYTHONPATH.
> I can work around by exporting PYTHONPATH.
> Looking in SparkSubmitCommandBuilder.buildPySparkShellCommand I don't see this supported at all. If that is the case perhaps it should be moved to improvement.
> Note it works via spark-submit in both client and cluster mode to run python script.

This message was sent by Atlassian JIRA
