hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1557) Deletion of excess replicas should prefer to delete corrupted replicas before deleting valid replicas
Date Tue, 03 Jul 2007 17:33:04 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509949 ]

Doug Cutting commented on HADOOP-1557:
--------------------------------------

> when a setReplication() command is sent to the NameNode, no data blocks are being read

Right.  But a setReplication() triggers replications, and, when those replications happen,
the data is read.  If, when writing the replica, the checksum of the received data does not
match the checksum sent with that data, the receiving datanode should report to the namenode
that the data was corrupt and abort the replication.  This would cause the source block to
be removed (provided there are more replicas) and the namenode to initiate new replications
from a different source.
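
In outline, the namenode side of that flow might look something like the sketch below. This is a rough illustration only, not the actual DFS code; ReplicaTracker, reportCorruptSource(), and scheduleReplication() are made-up names:

    import java.util.*;

    // Hypothetical sketch of the flow above: when a replication target
    // reports that the data it received was corrupt, drop the source
    // replica (only if other replicas remain) and re-replicate the block
    // from a different source.
    class ReplicaTracker {
        // block id -> datanodes believed to hold a replica of that block
        private final Map<Long, Set<String>> replicas = new HashMap<>();

        void reportCorruptSource(long blockId, String sourceNode) {
            Set<String> holders = replicas.getOrDefault(blockId, Collections.emptySet());
            // Remove the corrupt source only when at least one other replica remains.
            if (holders.size() > 1 && holders.remove(sourceNode)) {
                scheduleReplication(blockId, holders.iterator().next());
            }
        }

        private void scheduleReplication(long blockId, String fromNode) {
            // A real namenode would queue a block transfer; here we just log it.
            System.out.printf("re-replicating block %d from %s%n", blockId, fromNode);
        }
    }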

After HADOOP-1134, datanodes should always validate checksums as blocks are written.  Whenever
there's a mismatch, the write should be aborted.  If the write is a replication (as opposed
to an initial write) the mismatch should be reported to the namenode.  Does that sound like
the right policy to you?
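
For concreteness, a rough sketch of that write-path policy; BlockReceiver, receive(), and reportCorruptSource() are illustrative names, not Hadoop's actual interfaces:

    import java.io.IOException;
    import java.util.zip.CRC32;

    // Hypothetical sketch: validate the checksum as a block is written,
    // abort the write on a mismatch, and report the corrupt source to the
    // namenode only when the write is a replication rather than an
    // initial write.
    class BlockReceiver {
        void receive(byte[] data, long expectedCrc, boolean isReplication) throws IOException {
            CRC32 crc = new CRC32();
            crc.update(data);
            if (crc.getValue() != expectedCrc) {
                if (isReplication) {
                    reportCorruptSource(); // the namenode can then drop the source replica
                }
                throw new IOException("checksum mismatch, aborting write");
            }
            // ... otherwise write the validated bytes to local disk ...
        }

        private void reportCorruptSource() {
            // RPC to the namenode, omitted in this sketch.
        }
    }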

> Deletion of excess replicas should prefer to delete corrupted replicas before deleting valid replicas
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1557
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1557
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>
> Suppose a block has three replicas and two of them are corrupted. If the replication
> factor of the file is then reduced to 2, the filesystem should preferably delete the two
> corrupted replicas; otherwise the file could end up corrupted.
> One option would be to make the datanode periodically validate all blocks against their
> corresponding CRCs. The other option would be to make the setReplication call validate
> existing replicas before deleting excess replicas.
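
The first option above amounts to a background scanner on each datanode. A minimal sketch, assuming each block file blk_* has its recorded checksum stored next to it in a blk_*.crc file (the names and on-disk layout are illustrative, not DFS's actual format):

    import java.io.IOException;
    import java.nio.file.*;
    import java.util.zip.CRC32;

    // Hypothetical periodic validator: recompute each stored block's CRC
    // and compare it with the value recorded alongside the block.
    class PeriodicBlockValidator {
        void validateAll(Path blockDir) throws IOException {
            try (DirectoryStream<Path> blocks = Files.newDirectoryStream(blockDir, "blk_*")) {
                for (Path block : blocks) {
                    if (block.toString().endsWith(".crc")) continue; // skip checksum files
                    CRC32 crc = new CRC32();
                    crc.update(Files.readAllBytes(block));
                    Path crcFile = block.resolveSibling(block.getFileName() + ".crc");
                    long recorded = Long.parseLong(Files.readString(crcFile).trim());
                    if (crc.getValue() != recorded) {
                        // A real datanode would report this replica to the namenode.
                        System.err.println("corrupt replica: " + block);
                    }
                }
            }
        }
    }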

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

