hadoop-common-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8970) Need a different environment variable or configuration that states where local temporary files are stored than hadoop.tmp.dir
Date Thu, 25 Oct 2012 13:45:12 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484119#comment-13484119 ]

Allen Wittenauer commented on HADOOP-8970:
------------------------------------------

A bit of history...

hadoop.tmp.dir defaults to /tmp to make it easier to run QA tests and to get something up
quickly.  On "real" systems, this should be one of the first things changed.  

On the user side...

I think one of the fundamental problems is that end users see 'hadoop.tmp.dir' and think "Hey,
I have some temporary files and I'm using Hadoop!  This must be the place!"  

I've been thinking more and more about changing hadoop.tmp.dir during task execution to be
the same value as mapred.child.tmp, which is what users are supposed to use.  The other thing
is that hadoop.tmp.dir should just get replaced with hadoop-daemon.tmp.dir so that it's perfectly
clear what the intent of this variable actually is.
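
To illustrate the user side: a minimal sketch of task code that keeps its scratch files out of hadoop.tmp.dir entirely. It assumes the task JVM's java.io.tmpdir system property reflects the mapred.child.tmp setting, so plain JDK temp-file calls land in the right place. The class and method names (TaskScratch, scratchDir, newScratchFile) are made up for this example and are not part of Hadoop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a Hadoop class.
public class TaskScratch {

    // Assumption: inside a task, java.io.tmpdir points at the directory
    // configured through mapred.child.tmp. Outside Hadoop it is simply
    // the JVM's default temp directory.
    static Path scratchDir() {
        return Paths.get(System.getProperty("java.io.tmpdir"));
    }

    // Creates a uniquely named scratch file under scratchDir() and
    // asks the JVM to delete it on normal exit.
    static Path newScratchFile(String prefix) throws IOException {
        Path file = Files.createTempFile(scratchDir(), prefix, ".tmp");
        file.toFile().deleteOnExit();
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path file = newScratchFile("myjob-");
        System.out.println("scratch file: " + file);
    }
}
```

The point of the sketch is that user code never reads hadoop.tmp.dir at all; it only relies on the JVM temp directory, which the framework is free to point wherever it wants.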

                
> Need a different environment variable or configuration that states where local temporary files are stored than hadoop.tmp.dir
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8970
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8970
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Robert Justice
>
> I'm finding that hadoop.tmp.dir is used as a base directory in the configuration of working directories for many other Hadoop subcomponents (mapred, hdfs, hue, etc.) and that it directs where the Hadoop client stores some local temporary files, as well as temporary files on HDFS.
> Users may be dealing with tight space in /tmp.  In order to move where job setup files, hive, hue files, etc. are locally stored, they have to create a new directory on HDFS (e.g. /temp) and local directories on another filesystem, and make sure permissions are set up properly in HDFS and on the local filesystem of every node in the cluster.
> I'm wondering if it would be better to have a hadoop.local.tmp.dir that is configurable at the client level to say where local files are kept, and break that out from hadoop.tmp.dir?  I know this is a major change, but thought I would suggest it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
