hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5661) ShuffleHandler using yarn.nodemanager.local-dirs instead of mapreduce.cluster.local.dir
Date Mon, 02 Dec 2013 16:23:41 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836652#comment-13836652 ]

Jason Lowe commented on MAPREDUCE-5661:

Note that YarnChild.configureLocalDirs sets mapreduce.cluster.local.dir based on an environment
variable, which is itself derived from yarn.nodemanager.local-dirs.  Most of the other references
are therefore really picking up what was specified in yarn.nodemanager.local-dirs and not
what an admin configured for mapreduce.cluster.local.dir.  The notable exception would be jobs run in local mode.
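
To make that chain concrete, here is a minimal sketch of the propagation. It is a toy model and not the actual Hadoop code: plain maps stand in for the nodemanager configuration, the container environment, and the job configuration, both helper methods are hypothetical, and the environment variable name LOCAL_DIRS is an assumption. It also ignores any per-application subdirectories the nodemanager may add to the paths.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, not Hadoop code: plain maps stand in for configs and the environment. */
class LocalDirPropagationSketch {
    // Property names as they appear in this discussion.
    static final String NM_LOCAL_DIRS = "yarn.nodemanager.local-dirs";
    static final String MR_LOCAL_DIR = "mapreduce.cluster.local.dir";
    // Assumed name of the environment variable handed to the task container.
    static final String LOCAL_DIRS_ENV = "LOCAL_DIRS";

    /** Nodemanager side (hypothetical helper): export the configured dirs to the container. */
    static Map<String, String> buildContainerEnv(Map<String, String> nmConf) {
        Map<String, String> env = new HashMap<>();
        env.put(LOCAL_DIRS_ENV, nmConf.get(NM_LOCAL_DIRS));
        return env;
    }

    /** Task side (hypothetical helper): overwrite the MapReduce property from the environment. */
    static void configureLocalDirs(Map<String, String> jobConf, Map<String, String> env) {
        String dirs = env.get(LOCAL_DIRS_ENV);
        if (dirs != null) {
            // Whatever was in the job configuration before is replaced here.
            jobConf.put(MR_LOCAL_DIR, dirs);
        }
    }
}
```

In this model, a job configuration that starts with an admin-supplied mapreduce.cluster.local.dir ends up holding the nodemanager's directories once configureLocalDirs has run, which is why the task-side LocalDirAllocator instances effectively follow yarn.nodemanager.local-dirs.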

Also note that the shuffle handler is a bit special in that it is the one piece of MapReduce
code that runs as part of the YARN nodemanager process and not as part of a job or a client.
It is more likely that yarn.nodemanager.local-dirs is configured on a particular YARN node than
mapreduce.cluster.local.dir, so I think it's appropriate that the shuffle handler uses
yarn.nodemanager.local-dirs.  I don't think mapreduce.cluster.local.dir is even set on some of our
clusters, as the MapReduce framework configures this property for tasks when running under YARN.  I
wouldn't expect admins to have to configure it at all unless they support jobs in local
mode and for some reason the default isn't sufficient.
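
As a rough illustration of the nodemanager's point of view, the sketch below models a node where only yarn.nodemanager.local-dirs has been configured. It is again a toy model under stated assumptions: a map stands in for the nodemanager's configuration, dirsFor is a hypothetical helper, and built-in defaults for unset properties are ignored (an unset key simply yields no directories).

```java
import java.util.Map;

/** Toy model, not Hadoop code: a map stands in for the nodemanager's configuration. */
class ShuffleDirLookupSketch {
    // Property names as they appear in this discussion.
    static final String NM_LOCAL_DIRS = "yarn.nodemanager.local-dirs";
    static final String MR_LOCAL_DIR = "mapreduce.cluster.local.dir";

    /** Hypothetical helper: split a comma-separated directory list, or return none if unset. */
    static String[] dirsFor(Map<String, String> conf, String key) {
        String raw = conf.get(key);
        if (raw == null || raw.trim().isEmpty()) {
            return new String[0];
        }
        // Tolerate whitespace around the commas.
        return raw.trim().split("\\s*,\\s*");
    }
}
```

With a configuration that contains only yarn.nodemanager.local-dirs, dirsFor returns the node's directories for NM_LOCAL_DIRS and nothing for MR_LOCAL_DIR, so code running inside the nodemanager has something useful to allocate from only under the YARN key.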

> ShuffleHandler using yarn.nodemanager.local-dirs instead of mapreduce.cluster.local.dir
> ---------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-5661
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5661
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Eric Sirianni
>            Priority: Trivial
> While debugging an issue where a MapReduce job is failing due to running out of disk
> space, I noticed that the {{ShuffleHandler}} uses {{yarn.nodemanager.local-dirs}} for its
> {{LocalDirAllocator}} whereas all of the other MapReduce classes use {{mapreduce.cluster.local.dir}}:
> {noformat}
> $ find hadoop-mapreduce-project/hadoop-mapreduce-client/*/src/main/java/ -name "*.java" | xargs grep "new LocalDirAllocator("
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java:    LocalDirAllocator lDirAlloc = new LocalDirAllocator(MRConfig.LOCAL_DIR);
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnOutputFiles.java:    new LocalDirAllocator(MRConfig.LOCAL_DIR);
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalDistributedCacheManager.java:      new LocalDirAllocator(MRConfig.LOCAL_DIR);
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/BackupStore.java:      this.lDirAlloc = new LocalDirAllocator(MRConfig.LOCAL_DIR);
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MROutputFiles.java:    new LocalDirAllocator(MRConfig.LOCAL_DIR);
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java:    new LocalDirAllocator(MRConfig.LOCAL_DIR);
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Task.java:    this.lDirAlloc = new LocalDirAllocator(MRConfig.LOCAL_DIR);
> *****hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java:      new LocalDirAllocator(YarnConfiguration.NM_LOCAL_DIRS);
> {noformat}
> This inconsistency feels like something that is likely to confuse admins.  

This message was sent by Atlassian JIRA
