hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1707) DFS client can allow user to write data to the next block while uploading previous block to HDFS
Date Wed, 10 Oct 2007 23:26:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533903
] 

dhruba borthakur commented on HADOOP-1707:
------------------------------------------

The local disk cache implementation is similar to creating four replicas of the block and
then deleting the excess replica when the block is done. This reduces overall cluster throughput
and I would like to analyze ways of getting rid of it.

I agree that I should change the name-description of this JIRA. Will do.

> DFS client can allow user to write data to the next block while uploading previous block
to HDFS
> ------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1707
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1707
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The DFS client currently uses a staging file on local disk to cache all user-writes to
a file. When the staging file accumulates 1 block worth of data, its contents are flushed
to a HDFS datanode. These operations occur sequentially.
> A simple optimization of allowing the user to write to another staging file while simultaneously
uploading the contents of the first staging file to HDFS will improve file-upload performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message