hadoop-hdfs-issues mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-1606) Provide a stronger data guarantee in the write pipeline
Date Sat, 09 Apr 2011 00:56:06 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-1606:
-----------------------------------------

    Attachment: h1606_20110408b.patch

Thanks Jitendra for the review.

h1606_20110408b.patch:

> 1. The method findNewDatanode should return all new datanodes ...

Since it is only an internal method, not part of a protocol or public API, we can easily
change it later when we add the multiple-destination feature.


> 2. The method addDatanode2ExistingPipeline can be split ...

I split out only the actual transfer.  The remaining code is only 20 lines, excluding comments.

> 3. DataStreamer#hflush : Should we change it to setHflush(boolean val) to clarify it's
> just setting a flag?

Changed.

> 4. Does it make sense to add a unit test for default ReplaceDatanodeOnFailure policy?

Added {{testDefaultPolicy()}}.


> Provide a stronger data guarantee in the write pipeline
> -------------------------------------------------------
>
>                 Key: HDFS-1606
>                 URL: https://issues.apache.org/jira/browse/HDFS-1606
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, hdfs client, name-node
>    Affects Versions: 0.23.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.23.0
>
>         Attachments: h1606_20110210.patch, h1606_20110211.patch, h1606_20110217.patch,
>                      h1606_20110228.patch, h1606_20110404.patch, h1606_20110405.patch,
>                      h1606_20110405b.patch, h1606_20110406.patch, h1606_20110406b.patch,
>                      h1606_20110407.patch, h1606_20110407b.patch, h1606_20110407c.patch,
>                      h1606_20110408.patch, h1606_20110408b.patch
>
>
> In the current design, if there is a datanode/network failure in the write pipeline,
> DFSClient will try to remove the failed datanode from the pipeline and then continue writing
> with the remaining datanodes.  As a result, the number of datanodes in the pipeline is decreased.
> Unfortunately, it is possible that DFSClient may incorrectly remove a healthy datanode but
> leave the failed datanode in the pipeline because failure detection may be inaccurate under
> erroneous conditions.
> We propose to have a new mechanism for adding new datanodes to the pipeline in order
> to provide a stronger data guarantee.
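
As a rough illustration of the proposal described above, here is a minimal sketch in Java.
It is not the code in the attached patch.  The class PipelineSketch, the methods recover and
shouldAddDatanode, the datanode names, and the simple "fewer nodes than the replication
factor" rule are all assumptions made for illustration.  The transfer of existing block data
to the new datanode is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PipelineSketch {
    /**
     * Hypothetical policy: add a replacement whenever the pipeline has
     * fewer datanodes than the replication factor. The real default
     * policy in the patch may differ.
     */
    static boolean shouldAddDatanode(int replication, int remaining) {
        return remaining < replication;
    }

    /**
     * Removes the failed datanode and, if the policy allows, appends at
     * most one replacement taken from the list of spare datanodes.
     * A real client would also copy the block data written so far to
     * the new datanode before resuming the write; that step is omitted.
     */
    static List<String> recover(List<String> pipeline, String failed,
                                List<String> spares, int replication) {
        List<String> result = new ArrayList<>(pipeline);
        result.remove(failed);
        if (shouldAddDatanode(replication, result.size())) {
            for (String candidate : spares) {
                // Skip the failed node and nodes already in the pipeline.
                if (!candidate.equals(failed) && !result.contains(candidate)) {
                    result.add(candidate);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> pipeline = Arrays.asList("dn1", "dn2", "dn3");
        List<String> spares = Arrays.asList("dn4");
        // Prints [dn1, dn3, dn4]
        System.out.println(recover(pipeline, "dn2", spares, 3));
    }
}
```

Without the replacement step, the same failure would leave the pipeline at two datanodes,
which is the shrinking behavior the issue description is concerned about.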

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
