spark-reviews mailing list archives

From JoshRosen <>
Subject [GitHub] spark pull request: [SPARK-2713] Executors of same application in ...
Date Wed, 03 Sep 2014 22:03:13 GMT
Github user JoshRosen commented on a diff in the pull request:
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -317,13 +317,58 @@ private[spark] object Utils extends Logging {
    +   * Copy cached file to targetDir, if not exists, download it from url firstly.
    --- End diff --
    Minor nitpick on naming, but I think it's confusing to have a method named `fetchCachedFile`
with an option that has to be explicitly set in order to use the cache.  I'd prefer to name
this `fetchFile`, and rename the other method to something like `doFetchFile` or `_fetchFile`.
    When fixing the merge conflict, do you mind moving the comment from the old `fetchFile`
to here?  I think the most comprehensive documentation should be on the public function, not
the private one.  I'd say something like:
        * Download a file requested by the executor. Supports fetching the file in a variety of ways,
        * including HTTP, HDFS and files on a standard filesystem, based on the URL parameter.
        * If `useCache` is true, first attempts to fetch the file from a local cache that's shared across
        * executors running the same application.
        * Throws SparkException if the target file already exists and has different contents than
        * the requested file.
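
    To make the suggested split concrete, here is a rough, standalone sketch of a public `fetchFile` that takes `useCache` and delegates the actual download to a private `doFetchFile`. It is not the code in this pull request: `FetchSketch`, `FetchException` (a stand-in for `SparkException`), the `cacheDir` parameter, the cache file naming, and `copyChecked` are all assumptions made for illustration. It only handles URLs that `java.net.URL` can open, and it leaves out the locking that a cache shared across executor processes would need.

```scala
import java.io.File
import java.nio.file.{Files, StandardCopyOption}
import java.util.Arrays

object FetchSketch {
  // Hypothetical stand-in for SparkException so the sketch compiles on its own.
  class FetchException(msg: String) extends Exception(msg)

  /**
   * Public entry point: carries the full documentation and the `useCache` switch.
   * `cacheDir` is an assumed parameter; the real method may locate the cache differently.
   */
  def fetchFile(url: String, targetDir: File, cacheDir: File, useCache: Boolean): File = {
    val fileName = url.split("/").last
    val target = new File(targetDir, fileName)
    val source =
      if (useCache) {
        // Assumed naming scheme: one cache entry per URL.
        val cached = new File(cacheDir, s"${url.hashCode}_cache_$fileName")
        if (!cached.exists()) doFetchFile(url, cached)
        cached
      } else {
        // No cache: download to a temporary file next to the target.
        val tmp = File.createTempFile("fetch", null, targetDir)
        doFetchFile(url, tmp)
        tmp
      }
    try copyChecked(source, target)
    finally if (!useCache) source.delete()
    target
  }

  /** Private worker: only downloads, knows nothing about the cache. */
  private def doFetchFile(url: String, dest: File): Unit = {
    val in = java.net.URI.create(url).toURL.openStream()
    try Files.copy(in, dest.toPath, StandardCopyOption.REPLACE_EXISTING)
    finally in.close()
  }

  /** Copies source to target, failing if target exists with different contents. */
  private def copyChecked(source: File, target: File): Unit = {
    if (target.exists()) {
      val same = Arrays.equals(
        Files.readAllBytes(source.toPath), Files.readAllBytes(target.toPath))
      if (!same) {
        throw new FetchException(s"File $target exists and does not match contents of $source")
      }
    } else {
      Files.copy(source.toPath, target.toPath)
    }
  }
}
```

    With this shape, callers only ever see `fetchFile`, and whether the cache is consulted is decided by the `useCache` argument rather than by which method they happened to call.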
