spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-24384) spark-submit --py-files with .py files doesn't work in client mode before context initialization
Date Thu, 24 May 2018 18:19:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-24384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489524#comment-16489524 ]

Apache Spark commented on SPARK-24384:
--------------------------------------

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/21426

> spark-submit --py-files with .py files doesn't work in client mode before context initialization
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24384
>                 URL: https://issues.apache.org/jira/browse/SPARK-24384
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, Spark Submit
>    Affects Versions: 2.3.0, 2.4.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>
> When the file passed to --py-files is a .py file (a zip file seems fine), the Python path appears to be added dynamically only after the context is initialized.
> With this pyFile:
> {code}
> $ cat /home/spark/tmp.py
> def testtest():
>     return 1
> {code}
> This works:
> {code}
> $ cat app.py
> import pyspark
> pyspark.sql.SparkSession.builder.getOrCreate()
> import tmp
> print("************************%s" % tmp.testtest())
> $ ./bin/spark-submit --master yarn --deploy-mode client --py-files /home/spark/tmp.py app.py
> ...
> ************************1
> {code}
> But this doesn't, because tmp is imported before the context is initialized:
> {code}
> $ cat app.py
> import pyspark
> import tmp
> pyspark.sql.SparkSession.builder.getOrCreate()
> print("************************%s" % tmp.testtest())
> $ ./bin/spark-submit --master yarn --deploy-mode client --py-files /home/spark/tmp.py app.py
> Traceback (most recent call last):
>   File "/home/spark/spark/app.py", line 2, in <module>
>     import tmp
> ImportError: No module named tmp
> {code}
> See https://issues.apache.org/jira/browse/SPARK-21945?focusedCommentId=16488486&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16488486
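
The ordering dependence reported above matches ordinary Python import behavior: a module can be imported only once its directory is on sys.path. The following is a minimal sketch that reproduces that ordering without Spark. The module name pyfiles_demo_mod and the temporary staging directory are hypothetical stand-ins for tmp.py and for wherever the --py-files file ends up, and the explicit sys.path.insert call only stands in for the path update that, per the report, seems to happen once the context is initialized. It does not show Spark's actual code.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module name, chosen to avoid clashing with any real module.
MODULE_NAME = "pyfiles_demo_mod"


def try_import(name):
    """Return the imported module, or None if it cannot be imported yet."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


with tempfile.TemporaryDirectory() as staging_dir:
    # Stand-in for the .py file shipped with --py-files (same body as tmp.py).
    with open(os.path.join(staging_dir, MODULE_NAME + ".py"), "w") as f:
        f.write("def testtest():\n    return 1\n")

    # The directory is not on sys.path yet, so the import fails,
    # like "import tmp" placed before getOrCreate() in the report.
    before = try_import(MODULE_NAME)

    # Assumption: this line stands in for the path update that seems to
    # happen only after the context is initialized.
    sys.path.insert(0, staging_dir)
    importlib.invalidate_caches()

    # Now the same import succeeds.
    after = try_import(MODULE_NAME)
    result = after.testtest()

    # Clean up so the demo leaves the interpreter state unchanged.
    sys.path.remove(staging_dir)
    sys.modules.pop(MODULE_NAME, None)

print(before is None, result)
```

Until the issue is fixed, moving the import below the getOrCreate() call, as in the working app.py above, avoids the error.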



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


