hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1707) Remove the DFS Client disk-based cache
Date Wed, 31 Oct 2007 20:47:50 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

dhruba borthakur updated HADOOP-1707:

    Attachment: clientDiskBuffer.patch

This is a *very, very preliminary* patch that packetizes writes from clients. It does not
do any error recovery at all.
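As a rough illustration of what "packetizing" client writes means, here is a minimal sketch (class and constant names are made up for this example, not taken from the patch): user writes accumulate in a small buffer and are pushed down the pipeline as fixed-size packets, instead of staging a whole block on local disk first.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch: buffer user writes and forward them downstream
// (e.g. to the first datanode in the pipeline) one packet at a time.
class PacketizingOutputStream extends OutputStream {
    private static final int PACKET_SIZE = 64 * 1024; // assumed packet size
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
    private final OutputStream downstream; // stands in for the datanode socket

    PacketizingOutputStream(OutputStream downstream) {
        this.downstream = downstream;
    }

    @Override
    public void write(int b) throws IOException {
        buf.write(b);
        if (buf.size() >= PACKET_SIZE) {
            flushPacket();
        }
    }

    private void flushPacket() throws IOException {
        downstream.write(buf.toByteArray()); // ship one packet down the pipeline
        buf.reset();
    }

    @Override
    public void close() throws IOException {
        if (buf.size() > 0) {
            flushPacket(); // flush the final partial packet
        }
        downstream.close();
    }
}
```

The point of the sketch is only the buffering discipline; it deliberately omits the error-recovery logic discussed below.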

We discussed a proposal where datanodes do local recovery. If a datanode fails, the datanode
immediately preceding it recreates the pipeline by ignoring the failed node and connecting
directly to the datanode that followed it. This approach has the disadvantage that, in the
case of multiple failures, two upstream datanodes might be in recovery at the same time and
both of them might try to resend the block to a downstream datanode simultaneously. This
might be a difficult case to handle.
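The pipeline-repair step in the local-recovery proposal can be sketched as follows (a hypothetical illustration; the method and datanode names are invented for this example): the surviving predecessor simply drops the failed node and reconnects to its successor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the local-recovery idea: rebuild the write
// pipeline by removing the failed datanode, so the node that preceded
// it connects directly to the node that followed it.
class PipelineRecovery {
    /** Returns a new pipeline with the failed datanode removed. */
    static List<String> rebuildPipeline(List<String> pipeline, String failed) {
        List<String> rebuilt = new ArrayList<>(pipeline);
        rebuilt.remove(failed);
        return rebuilt;
    }
}
```

The multiple-failure problem described above shows up here as two predecessors each computing such a rebuilt pipeline concurrently, with no coordination between them.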

Also, the earlier proposal raised an exception to the client if the primary datanode failed.
This might be a commonly occurring case. If we want to avoid this problem, then the client
has to do recovery (over and above any datanodes doing local recovery). In that case, maybe
it is better to drive the entire recovery from a single point: the client.

The cascading-timeouts issue has to be handled somehow. Your proposal of setting different
timeouts for the datanodes in the pipeline would work, but it might be a little tricky to
implement and debug. Another approach would be for each datanode to expose a new "ping" RPC.
The client, when it has to recover, "pings" each datanode and determines which of them are
not responding. This seems workable, doesn't it?
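The ping-based, client-driven recovery could look roughly like this (the ping RPC is only a proposal, so the interface and names below are assumptions, not an existing API): on a write failure, the client probes every datanode in the pipeline and keeps only the responsive ones.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical "ping" RPC exposed by each datanode (proposed, not real).
interface DatanodePinger {
    boolean ping(String datanode);
}

// Sketch of client-driven recovery: probe the pipeline and rebuild it
// from whichever datanodes still answer.
class ClientRecovery {
    /** Returns the datanodes in the pipeline that respond to a ping. */
    static List<String> findLiveDatanodes(List<String> pipeline,
                                          DatanodePinger pinger) {
        List<String> live = new ArrayList<>();
        for (String dn : pipeline) {
            if (pinger.ping(dn)) {
                live.add(dn);
            }
        }
        return live;
    }
}
```

Because the client alone decides which nodes survive, this avoids the case of two upstream datanodes racing to resend the same block.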

> Remove the DFS Client disk-based cache
> --------------------------------------
>                 Key: HADOOP-1707
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1707
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.16.0
>         Attachments: clientDiskBuffer.patch
> The DFS client currently uses a staging file on local disk to cache all user writes to
> a file. When the staging file accumulates one block's worth of data, its contents are
> flushed to an HDFS datanode. These operations occur sequentially.
> A simple optimization of allowing the user to write to another staging file while
> simultaneously uploading the contents of the first staging file to HDFS will improve
> file-upload performance.
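The double-buffering optimization quoted above can be sketched with an in-memory stand-in (all names are illustrative; real staging would use on-disk files and an HDFS connection): filled buffers are handed to a background uploader thread while the caller keeps writing into the next one.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: overlap user writes with uploads by handing each
// filled staging buffer to a background thread. A StringBuilder stands
// in for the HDFS destination.
class DoubleBufferedUploader {
    private final BlockingQueue<byte[]> fullBuffers = new ArrayBlockingQueue<>(2);
    private final StringBuilder uploaded = new StringBuilder();
    private final Thread uploader;

    DoubleBufferedUploader() {
        uploader = new Thread(() -> {
            try {
                while (true) {
                    byte[] buf = fullBuffers.take();
                    if (buf.length == 0) break; // empty buffer signals shutdown
                    uploaded.append(new String(buf)); // "upload" the block
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        uploader.start();
    }

    /** Hand off a filled buffer; the caller can start filling another one. */
    void submit(byte[] block) throws InterruptedException {
        fullBuffers.put(block);
    }

    /** Waits for all pending uploads and returns what was "uploaded". */
    String finish() throws InterruptedException {
        fullBuffers.put(new byte[0]);
        uploader.join();
        return uploaded.toString();
    }
}
```

Note that the patch attached to this issue goes further than this: packetizing writes removes the disk-based staging file entirely rather than merely overlapping it with uploads.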

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
