spark-reviews mailing list archives

From viirya <...@git.apache.org>
Subject [GitHub] spark pull request #21267: [SPARK-21945][YARN][PYTHON] Make --py-files work ...
Date Tue, 08 May 2018 09:40:58 GMT
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21267#discussion_r186670486
  
    --- Diff: python/pyspark/context.py ---
    @@ -211,9 +211,23 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
             for path in self._conf.get("spark.submit.pyFiles", "").split(","):
                 if path != "":
                     (dirname, filename) = os.path.split(path)
    -                if filename[-4:].lower() in self.PACKAGE_EXTENSIONS:
    -                    self._python_includes.append(filename)
    -                    sys.path.insert(1, os.path.join(SparkFiles.getRootDirectory(), filename))
    +                try:
    +                    filepath = os.path.join(SparkFiles.getRootDirectory(), filename)
    +                    if not os.path.exists(filepath):
    +                        # In case of YARN with shell mode, 'spark.submit.pyFiles' files are
    +                        # not added via SparkContext.addFile. Here we check if the file exists,
    +                        # try to copy and then add it to the path. See SPARK-21945.
    +                        shutil.copyfile(path, filepath)
    +                    if filename[-4:].lower() in self.PACKAGE_EXTENSIONS:
    --- End diff --
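
    For readers skimming the thread: the quoted hunk is cut off mid-`try`, so here is a minimal,
    self-contained sketch of the copy-if-missing idea the patch describes. This is not the actual
    PySpark code path; `add_py_file` and `root_dir` are made up for the illustration.

        import os
        import shutil
        import sys

        PACKAGE_EXTENSIONS = ('.zip', '.egg', '.jar')

        def add_py_file(path, root_dir):
            # Sketch of the patch's idea: if the file was not materialized into the
            # SparkFiles root directory (e.g. YARN shell mode, see SPARK-21945),
            # copy it there before using it.
            filename = os.path.basename(path)
            filepath = os.path.join(root_dir, filename)
            if not os.path.exists(filepath):
                shutil.copyfile(path, filepath)
            # Archives (.zip/.egg/.jar) go onto sys.path so their contents are importable.
            if filename[-4:].lower() in PACKAGE_EXTENSIONS:
                sys.path.insert(1, filepath)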
    
    Am I missing anything? It looks like `PACKAGE_EXTENSIONS = ('.zip', '.egg', '.jar')`, so `.py` does not seem to be in that?
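
    A quick standalone illustration of that concern, using only the tuple quoted above and the
    check from the hunk (nothing PySpark-specific; the sample file names are made up):

        PACKAGE_EXTENSIONS = ('.zip', '.egg', '.jar')

        for filename in ("deps.zip", "lib.egg", "udf.jar", "helper.py"):
            # Same check as in the hunk: only the last four characters are compared.
            print(filename, filename[-4:].lower() in PACKAGE_EXTENSIONS)

        # helper.py prints False, which is the case the comment above is asking about.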


