spark-issues mailing list archives

From "Jeff Field (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-15371) YARNShuffleService doesn't get current local-dirs from NodeManager
Date Tue, 17 May 2016 22:00:14 GMT
Jeff Field created SPARK-15371:
----------------------------------

             Summary: YARNShuffleService doesn't get current local-dirs from NodeManager
                 Key: SPARK-15371
                 URL: https://issues.apache.org/jira/browse/SPARK-15371
             Project: Spark
          Issue Type: Bug
          Components: Shuffle, YARN
    Affects Versions: 1.6.1, 1.6.0, 1.5.2, 1.5.1, 1.5.0, 1.6.2, 2.0.0
            Reporter: Jeff Field
            Priority: Minor


In YarnShuffleService.java, the YarnShuffleService reads the list of local directories from
the YARN configuration. If it doesn't find an existing LevelDB file (used for recovery) in any
of them, it creates one in the first directory in the list. Because it uses the configured
list rather than asking YARN for the current list of healthy local-dirs, it will keep trying
to use the first directory even when the NodeManager already knows that location is bad.

Removing the bad directory from the config works around this, but Spark should get the current
list from YARN instead of using the list from the config. There are examples of this in https://github.com/apache/hadoop/blob/branch-2.7.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/TestDiskFailures.java
but I'm not sure of the right way for Spark to implement that.
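
To make the selection logic described above concrete, here is a minimal sketch using only the JDK. It is not the YarnShuffleService code and it does not query the NodeManager: the class name RecoveryDirChooser, the file name "recovery.ldb", and the writability probe are all assumptions made for illustration. The probe only stands in for "ask YARN which local-dirs are currently healthy"; the point is that the fallback should skip unusable directories instead of always taking the first element of the configured list.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class RecoveryDirChooser {
    // Hypothetical recovery file name, used only for this illustration.
    static final String RECOVERY_FILE = "recovery.ldb";

    // Assumption: a directory counts as usable if a probe file can be
    // created and deleted in it. This is a stand-in for YARN's own health check.
    static boolean isUsable(Path dir) {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try {
            Path probe = Files.createTempFile(dir, "probe", ".tmp");
            Files.delete(probe);
            return true;
        } catch (IOException | SecurityException e) {
            return false;
        }
    }

    static Optional<Path> chooseRecoveryFile(List<Path> localDirs) {
        // First pass: reuse an existing recovery file if any directory has one.
        for (Path dir : localDirs) {
            Path candidate = dir.resolve(RECOVERY_FILE);
            if (Files.exists(candidate)) {
                return Optional.of(candidate);
            }
        }
        // Second pass: pick the first usable directory, not simply the first one.
        for (Path dir : localDirs) {
            if (isUsable(dir)) {
                return Optional.of(dir.resolve(RECOVERY_FILE));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        Path good = Files.createTempDirectory("good");
        // Never created on disk; stands in for a failed local-dir listed first.
        Path bad = good.resolve("missing");
        System.out.println(chooseRecoveryFile(Arrays.asList(bad, good)));
    }
}
```

With the bad directory listed first, the sketch falls through to the second entry rather than repeatedly trying the first. A real fix would presumably replace isUsable with whatever YARN exposes for healthy local-dirs.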


