spark-issues mailing list archives

From "Josh Rosen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-2975) SPARK_LOCAL_DIRS may cause problems when running in local mode
Date Sun, 17 Aug 2014 00:31:18 GMT

    [ https://issues.apache.org/jira/browse/SPARK-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099841#comment-14099841
] 

Josh Rosen commented on SPARK-2975:
-----------------------------------

The driver's configuration properties seem to be propagated to the master, so it's probably
not safe to have SPARK_LOCAL_DIRS override spark.local.dir by mutating the SparkConf when
creating SparkEnv: a driver-local override would then affect every machine in the cluster.

Is the set of local directories a property of each worker's environment, like SPARK_HOME, in
which case we probably don't want to propagate it from drivers to workers?  Or is there a
use case for allowing the driver to tell workers which local directories to use (e.g. I might
want to create two SparkContexts and configure them to use different disks)?
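To make the propagation concern concrete, here is a toy Python model (not actual Spark code; the class and function names are hypothetical stand-ins for SparkConf and SparkEnv creation). It sketches why folding SPARK_LOCAL_DIRS into the driver's conf by mutation would leak a driver-local setting to every executor, assuming the driver's conf is what gets serialized and shipped to workers:

```python
# Toy model, NOT Spark code: SparkConfModel and create_env are hypothetical
# stand-ins illustrating the mutation hazard described above.
class SparkConfModel:
    def __init__(self):
        self.settings = {}

    def set(self, key, value):
        self.settings[key] = value
        return self

    def get(self, key, default=None):
        return self.settings.get(key, default)


def create_env(conf, env_vars):
    # Hypothetical override: the env var wins by mutating the shared conf.
    if "SPARK_LOCAL_DIRS" in env_vars:
        conf.set("spark.local.dir", env_vars["SPARK_LOCAL_DIRS"])
    return {"local_dir": conf.get("spark.local.dir")}


driver_conf = SparkConfModel().set("spark.local.dir", "/mnt/cluster-disk")
create_env(driver_conf, {"SPARK_LOCAL_DIRS": "/tmp/driver-only"})

# The driver's conf is (in this model) what gets serialized to executors,
# so the driver-local path now overrides every machine's setting.
worker_view = driver_conf.get("spark.local.dir")
```

In this model the mutation is indistinguishable, on the worker side, from the user having set spark.local.dir cluster-wide, which is the unsafe behavior the comment warns about.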

> SPARK_LOCAL_DIRS may cause problems when running in local mode
> --------------------------------------------------------------
>
>                 Key: SPARK-2975
>                 URL: https://issues.apache.org/jira/browse/SPARK-2975
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0, 1.1.0
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>            Priority: Critical
>
> If we're running Spark in local mode and {{SPARK_LOCAL_DIRS}} is set, the {{Executor}}
> modifies SparkConf so that this value overrides {{spark.local.dir}}.  Normally this is safe
> because the modification takes place before SparkEnv is created.  In local mode, however, the
> Executor reuses an existing SparkEnv rather than creating a new one, so it winds up with a
> DiskBlockManager that created its local directories from the original {{spark.local.dir}}
> setting, while other components attempt to use directories specified by the _new_
> {{spark.local.dir}}, leading to problems.
> I discovered this issue while testing Spark 1.1.0-snapshot1, but I think it will also
> affect Spark 1.0 (I haven't confirmed this, though).
> (I posted some comments at https://github.com/apache/spark/pull/299#discussion-diff-15975800,
> but I'm also opening this JIRA so it isn't forgotten.)
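The local-mode mismatch in the description can be sketched with another toy Python model (again hypothetical names, not Spark code). The point is the ordering: the DiskBlockManager snapshots the directory setting when the SparkEnv is first built, and a later mutation of the conf leaves the two views inconsistent:

```python
# Toy model, NOT Spark code: illustrates the stale-snapshot ordering bug.
conf = {"spark.local.dir": "/tmp/original"}


class DiskBlockManagerModel:
    def __init__(self, conf):
        # Directories are created once, from the conf as it was at startup.
        self.local_dir = conf["spark.local.dir"]


# In local mode the SparkEnv (and its DiskBlockManager) already exists...
env = {"disk_block_manager": DiskBlockManagerModel(conf)}

# ...but the Executor still mutates the conf to apply SPARK_LOCAL_DIRS.
conf["spark.local.dir"] = "/tmp/overridden"

dbm_dir = env["disk_block_manager"].local_dir  # still the original path
other_dir = conf["spark.local.dir"]            # the overridden path
assert dbm_dir != other_dir  # the inconsistency that triggers the bug
```

In cluster mode the mutation happens before the env is constructed, so both reads agree; only the local-mode reuse of an existing SparkEnv exposes the mismatch.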



--
This message was sent by Atlassian JIRA
(v6.2#6252)

