hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-66) dfs client writes all data for a chunk to /tmp
Date Tue, 07 Mar 2006 00:26:29 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-66?page=comments#action_12369110 ] 

Doug Cutting commented on HADOOP-66:
------------------------------------

It looks to me like the temp file is in fact used only when the connection to the datanode
fails.  Normally the block is streamed to the datanode as it is written.  If the connection
to the datanode fails, no application exception is thrown; instead the temp file is used to
recover, by reconnecting to a datanode and trying to write the block again.
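
A rough sketch of the recovery behavior described above, not the actual DFS client code.
The class name RecoveringBlockWriter is hypothetical, and plain OutputStreams stand in for
datanode connections. It assumes every write is appended to a local temp file first, so
that the whole block can be replayed to another sink if the stream fails:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Iterator;

// Hypothetical sketch: stream a block to a sink while keeping a local backup,
// and replay the backup to a fresh sink if the stream fails.
class RecoveringBlockWriter implements AutoCloseable {
    private final Iterator<OutputStream> sinks; // stand-ins for datanode connections
    private final File backup;                  // local temp copy of the block so far
    private final OutputStream backupOut;
    private OutputStream current;

    RecoveringBlockWriter(Iterator<OutputStream> sinks) throws IOException {
        this.sinks = sinks;
        this.backup = File.createTempFile("block", ".tmp");
        this.backupOut = new FileOutputStream(backup);
        this.current = sinks.next();
    }

    void write(byte[] buf, int off, int len) throws IOException {
        // Record the bytes locally before streaming, so a replay includes them.
        backupOut.write(buf, off, len);
        try {
            current.write(buf, off, len);
        } catch (IOException e) {
            // No exception reaches the caller unless recovery also fails.
            recover();
        }
    }

    // Replay the whole block so far to the next sink that accepts it.
    private void recover() throws IOException {
        while (sinks.hasNext()) {
            OutputStream next = sinks.next();
            try (InputStream in = new FileInputStream(backup)) {
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    next.write(chunk, 0, n);
                }
                current = next;
                return;
            } catch (IOException e) {
                // This sink failed too; fall through and try the next one.
            }
        }
        throw new IOException("no sink left to recover the block");
    }

    @Override
    public void close() throws IOException {
        backupOut.close();
        current.close();
        backup.delete();
    }
}
```

In this sketch the caller sees an IOException only when every remaining sink has failed.
Dropping the temp file would mean the first failed write surfaces directly as an exception.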

Data is buffered in RAM first, just in chunks much smaller than the block.  I don't think
we should buffer the entire block in RAM, as this would, e.g., rule out applications which
write lots of files in parallel.
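
To put rough numbers on the parallel-write concern, here is a small illustrative
calculation. The 32 MB block size, 4 KB chunk size and 1000 open files are all assumed
values chosen for the example, not figures taken from this issue, and BufferFootprint is
a hypothetical name:

```java
// Hypothetical sketch: client-side RAM needed for write buffers.
class BufferFootprint {
    // Total bytes held when each of `openFiles` files buffers `bytesPerFile` bytes.
    static long totalBytes(long bytesPerFile, int openFiles) {
        return bytesPerFile * openFiles;
    }

    public static void main(String[] args) {
        long blockSize = 32L * 1024 * 1024; // assumed block size
        long chunkSize = 4L * 1024;         // assumed chunk size
        int openFiles = 1000;               // assumed number of parallel writers

        long wholeBlock = totalBytes(blockSize, openFiles);
        long perChunk = totalBytes(chunkSize, openFiles);
        System.out.println("whole-block buffering: " + wholeBlock / (1024 * 1024) + " MB");
        System.out.println("chunk buffering:       " + perChunk / 1024 + " KB");
    }
}
```

Under these assumptions whole-block buffering needs 32000 MB of RAM, while chunk
buffering needs 4000 KB.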

We could get rid of the temp file and simply throw an application exception when we lose a
connection to a datanode while writing.  What is the objection to the temp file?

> dfs client writes all data for a chunk to /tmp
> ----------------------------------------------
>
>          Key: HADOOP-66
>          URL: http://issues.apache.org/jira/browse/HADOOP-66
>      Project: Hadoop
>         Type: Bug
>   Components: dfs
>     Versions: 0.1
>     Reporter: Sameer Paranjpye
>      Fix For: 0.1
>
> The dfs client writes all the data for the current chunk to a file in /tmp; when the
> chunk is complete it is shipped out to the Datanodes. This can cause /tmp to fill up fast
> when a lot of files are being written. A potentially better scheme is to buffer the written
> data in RAM (application code can set the buffer size) and flush it to the Datanodes when
> the buffer fills up.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

