hadoop-mapreduce-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2181) mapreduce.jobtracker.staging.root.dir default is unreasonable
Date Wed, 10 Nov 2010 01:59:07 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930425#action_12930425 ]

Allen Wittenauer commented on MAPREDUCE-2181:
---------------------------------------------

I realize this is an HDFS dir. 

Let me be more obvious:

What I'm worried about is that many sites with multiple users do:

... dfsadmin -setQuota value /user/* 

... so that all users have the same quota values.  [Variable-sized quotas make Hadoop
nearly impossible to support, since there are no real quota reporting capabilities short
of traversing the file system looking for them.]  In this case, it would basically mean that
the JobTracker would be forced to contend with the same quota size as users. 

Even given your scenario above, this would mean that the JT space quota would need to be usersize*number
of users, which is a bit ridiculous to maintain.
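
To put rough numbers on the usersize*number-of-users point, here is a minimal sketch. The per-user quota and the user counts are made up for illustration and do not come from any real cluster; `required_jt_quota` is a hypothetical helper, not a Hadoop API.

```python
# Hypothetical figures for illustration only; not taken from any real cluster.
GIB = 1024 ** 3


def required_jt_quota(per_user_quota_bytes: int, num_users: int) -> int:
    """Space quota the JT's directory would need if it had to hold up to a
    full per-user allowance for every user (usersize * number of users)."""
    return per_user_quota_bytes * num_users


if __name__ == "__main__":
    per_user = 100 * GIB  # assumed uniform quota applied to each /user/* directory
    for users in (10, 100, 1000):
        need = required_jt_quota(per_user, users)
        print(f"{users:>5} users: JT quota of {need // GIB} GiB "
              f"vs. {per_user // GIB} GiB from a blanket /user/* setting")
```

The required figure grows linearly with the number of users, so it would have to be revisited whenever users are added or the per-user quota changes.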

 [If anyone actually sets /user explicitly... well, I hope they aren't multi-user or have
some sort of Plan.]

In any case, I'm still left with /user not being a good place to put system resources. There
are reasons why everyone in the UNIX world doesn't put home directories under /usr anymore.
Mixing system bits and user bits is just bad practice.



> mapreduce.jobtracker.staging.root.dir default is unreasonable
> -------------------------------------------------------------
>
>                 Key: MAPREDUCE-2181
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2181
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: job submission, jobtracker
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>
> The default for mapreduce.jobtracker.staging.root.dir is set to ${hadoop.tmp.dir}/mapred/staging,
> which doesn't really work on a normal cluster. hadoop.tmp.dir is overloaded in different places
> where sometimes it is a local path and sometimes it is a path on HDFS, which makes things
> even more confusing.
> We should change the default for the staging directory to /user (as is suggested by the
> description of that configuration) and then fix LocalJobRunner to use a different configuration
> -- perhaps mapreduce.localjobrunner.staging.root.dir -- to make it clear that it's a *local*
> path. That one could legitimately default to something inside hadoop.tmp.dir.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

