hadoop-common-dev mailing list archives

From "eric baldeschwieler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-66) dfs client writes all data for a chunk to /tmp
Date Tue, 07 Mar 2006 04:52:39 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-66?page=comments#action_12369159 ] 

eric baldeschwieler commented on HADOOP-66:
-------------------------------------------

So the problem with /tmp is that it can fill up and cause failures.  This is very config /
install specific.  We almost never use /tmp because it gets blown out by something, always
when you least expect it.  Maybe we should throw by default and provide a config option to
do something else, such as providing a file path for temp files?  This could be in /tmp if
you chose, or MapReduce could default to its temp directory, where it is storing everything
else.

Performance is clearly not an issue if this is truly an exceptional case.
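
A minimal sketch of the behavior suggested above, assuming a hypothetical client-side
setting for the temp directory (the class and method names below are illustrative, not
actual Hadoop code): throw by default when nothing is configured instead of silently
falling back to /tmp.

import java.io.File;
import java.io.IOException;

// Illustrative sketch only; this is not DFSClient code, and the configured
// path would come from whatever config mechanism the client exposes.
public class ClientTempDir {

    /**
     * Resolve the directory used for client-side temp files.
     * Throws by default when nothing is configured, rather than
     * quietly defaulting to /tmp.
     */
    public static File resolve(String configuredPath) throws IOException {
        if (configuredPath == null || configuredPath.isEmpty()) {
            throw new IOException(
                "No client temp directory configured; refusing to default to /tmp");
        }
        File dir = new File(configuredPath);
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Cannot create client temp directory: " + dir);
        }
        return dir;
    }
}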

> dfs client writes all data for a chunk to /tmp
> ----------------------------------------------
>
>          Key: HADOOP-66
>          URL: http://issues.apache.org/jira/browse/HADOOP-66
>      Project: Hadoop
>         Type: Bug
>   Components: dfs
>     Versions: 0.1
>     Reporter: Sameer Paranjpye
>      Fix For: 0.1

>
> The dfs client writes all the data for the current chunk to a file in /tmp; when the
> chunk is complete, it is shipped out to the Datanodes. This can cause /tmp to fill up fast
> when a lot of files are being written. A potentially better scheme is to buffer the written
> data in RAM (application code can set the buffer size) and flush it to the Datanodes when
> the buffer fills up.
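
A rough sketch of the buffering scheme described in the issue, with a placeholder
flushToDatanodes() method standing in for the real chunk transfer (illustrative only,
not DFSClient code):

import java.io.IOException;
import java.io.OutputStream;

// Illustrative sketch only: buffer written bytes in RAM and flush a full
// buffer downstream, instead of staging the whole chunk in a /tmp file.
public class BufferedChunkOutputStream extends OutputStream {
    private final byte[] buffer;
    private int count = 0;

    public BufferedChunkOutputStream(int bufferSize) {
        // Application code chooses the buffer size.
        this.buffer = new byte[bufferSize];
    }

    @Override
    public void write(int b) throws IOException {
        if (count == buffer.length) {
            flushToDatanodes();
        }
        buffer[count++] = (byte) b;
    }

    @Override
    public void flush() throws IOException {
        if (count > 0) {
            flushToDatanodes();
        }
    }

    // Placeholder for shipping buffer[0..count) to the Datanodes holding
    // the current block; the actual network transfer is not shown here.
    private void flushToDatanodes() throws IOException {
        count = 0;
    }
}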

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

