hadoop-hdfs-issues mailing list archives

From "Jitendra Nath Pandey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1606) Provide a stronger data guarantee in the write pipeline
Date Fri, 08 Apr 2011 21:50:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017675#comment-13017675 ]

Jitendra Nath Pandey commented on HDFS-1606:
--------------------------------------------

1. The method findNewDatanode should return all new datanodes in case there is more than
one new datanode.
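A minimal sketch of point 1, assuming the pipeline is represented as arrays of datanode names (the class and method names here are illustrative, not DFSClient's actual code): compute the set difference between the updated and original pipelines so every new node is returned, not just the first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FindNewDatanodes {
    /**
     * Return every datanode that appears in the updated pipeline but not
     * in the original one, instead of assuming there is a single new node.
     */
    static List<String> findNewDatanodes(String[] original, String[] updated) {
        Set<String> known = new HashSet<>(Arrays.asList(original));
        List<String> added = new ArrayList<>();
        for (String dn : updated) {
            if (!known.contains(dn)) {
                added.add(dn);
            }
        }
        return added;
    }
}
```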

2. The method addDatanode2ExistingPipeline can be split into the following methods:
    a) a method to check whether a transfer is needed,
    b) a method to get the additional datanodes and determine the source and destination,
    c) a method that does the actual transfer.
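One possible shape for that three-way split, sketched with hypothetical names and a stubbed transfer (none of these signatures are the actual DFSClient API; the rule inside isTransferNeeded is an assumption for illustration):

```java
import java.util.Arrays;
import java.util.List;

public class PipelineRecoverySketch {

    /** (a) Decide whether a block transfer to a new datanode is needed. */
    static boolean isTransferNeeded(long bytesAcked, int replication, int liveNodes) {
        // Hypothetical rule: transfer only if some data was already
        // acknowledged and the pipeline has lost at least one replica.
        return bytesAcked > 0 && liveNodes < replication;
    }

    /** (b) Pick the source and destination datanodes for the transfer. */
    static String[] chooseSourceAndTarget(List<String> existing, List<String> added) {
        // Hypothetical policy: copy from the first healthy node to the first new node.
        return new String[] { existing.get(0), added.get(0) };
    }

    /** (c) Perform the actual transfer (stubbed here as a description string). */
    static String transfer(String src, String dst) {
        return "transfer " + src + " -> " + dst;
    }

    public static void main(String[] args) {
        List<String> existing = Arrays.asList("dn1", "dn2");
        List<String> added = Arrays.asList("dn3");
        if (isTransferNeeded(1024, 3, existing.size())) {
            System.out.println(transfer(existing.get(0), added.get(0)));
        }
    }
}
```

Keeping the three steps separate makes each independently testable, which also helps with point 4 below.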

3. DataStreamer#hflush: should we rename it to setHflush(boolean val) to clarify that it is
just setting a flag?

4. Does it make sense to add a unit test for the default ReplaceDatanodeOnFailure policy?
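Such a test could exercise the replace-or-not decision directly. The condition below paraphrases the default policy proposed in this issue (replace only when the replication r is at least 3, and either half or more of the replicas are gone, or the block was hflushed/appended); treat the exact rule, class, and method names as illustrative assumptions rather than the committed code:

```java
public class DefaultPolicySketch {
    /** Sketch of the DEFAULT replace-datanode-on-failure condition. */
    static boolean shouldReplace(int r, int n, boolean hflushedOrAppended) {
        // r = replication factor, n = number of datanodes still in the pipeline.
        return r >= 3 && (r / 2 >= n || (r > n && hflushedOrAppended));
    }

    public static void main(String[] args) {
        assert !shouldReplace(2, 1, true);   // small replication: never replace
        assert shouldReplace(3, 1, false);   // lost the majority of replicas
        assert shouldReplace(3, 2, true);    // hflushed data must stay durable
        assert !shouldReplace(3, 2, false);  // one failure, no hflush: keep going
    }
}
```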

> Provide a stronger data guarantee in the write pipeline
> -------------------------------------------------------
>
>                 Key: HDFS-1606
>                 URL: https://issues.apache.org/jira/browse/HDFS-1606
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, hdfs client, name-node
>    Affects Versions: 0.23.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.23.0
>
>         Attachments: h1606_20110210.patch, h1606_20110211.patch, h1606_20110217.patch,
> h1606_20110228.patch, h1606_20110404.patch, h1606_20110405.patch, h1606_20110405b.patch,
> h1606_20110406.patch, h1606_20110406b.patch, h1606_20110407.patch, h1606_20110407b.patch,
> h1606_20110407c.patch, h1606_20110408.patch
>
>
> In the current design, if there is a datanode/network failure in the write pipeline,
> DFSClient will try to remove the failed datanode from the pipeline and then continue
> writing with the remaining datanodes. As a result, the number of datanodes in the
> pipeline is decreased. Unfortunately, it is possible that DFSClient may incorrectly
> remove a healthy datanode but leave the failed datanode in the pipeline, because
> failure detection may be inaccurate under erroneous conditions.
> We propose a new mechanism for adding new datanodes to the pipeline in order to
> provide a stronger data guarantee.
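The contrast the description draws can be sketched minimally, assuming a pipeline is just a list of datanode names (an illustrative model, not the real pipeline data structure): the current behavior only removes the failed node and the replica count shrinks, while the proposed mechanism also adds a replacement to keep the pipeline at full width.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineShrinkSketch {
    /** Current behavior: drop the failed node, pipeline shrinks by one. */
    static List<String> removeOnly(List<String> pipeline, String failed) {
        List<String> next = new ArrayList<>(pipeline);
        next.remove(failed);
        return next;
    }

    /** Proposed behavior: drop the failed node and add a replacement. */
    static List<String> removeAndReplace(List<String> pipeline, String failed,
                                         String replacement) {
        List<String> next = removeOnly(pipeline, failed);
        next.add(replacement);  // keeps the number of replicas constant
        return next;
    }
}
```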

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
