hadoop-hdfs-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1630) Checksum fsedits
Date Fri, 18 Feb 2011 16:13:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996481#comment-12996481 ]

Steve Loughran commented on HDFS-1630:
--------------------------------------

@Hairong - yes, if every TX is checksummed, that's all that is needed. But I think the remote
node receiving a TX whose checksum fails should be able to report the problem so the namenode
can then replay it. I don't think the risk of corruption is that high, but statistics are the
enemy here: eventually some NIC with built-in TCP support will start playing up and your TXs
will get corrupted before the packet checksum is generated, at which point the packet checksum
is no use to the recipient. If the secondary/backup node can check on receipt rather than at
replay, problems get found faster and the faulty hardware is more easily located.
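
To make the check-on-receipt idea concrete, here is a minimal sketch, not the actual HDFS edit log format. The class name `ChecksummedTx`, its methods, the frame layout (length, payload, CRC32) and the choice of CRC32 are all assumptions made for illustration. The point is that the checksum is computed over the transaction bytes by the writer, so corruption introduced later (for example by a misbehaving NIC) is detectable by the recipient even when the packet checksum looks fine.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical helper for illustration only; not part of HDFS.
final class ChecksummedTx {
    private ChecksummedTx() {}

    // Assumed frame layout: [int length][payload bytes][long crc32 of payload].
    static byte[] frame(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length + 8);
        buf.putInt(payload.length);
        buf.put(payload);
        buf.putLong(crc.getValue());
        return buf.array();
    }

    // Called by the receiving (secondary/backup) node as soon as a TX arrives.
    // Returns the payload if the checksum matches; throws otherwise so the
    // receiver can report the problem and ask for the TX to be replayed.
    static byte[] verifyOnReceipt(byte[] framed) {
        if (framed.length < 12) {
            throw new IllegalStateException("frame too short");
        }
        ByteBuffer buf = ByteBuffer.wrap(framed);
        int length = buf.getInt();
        if (length != framed.length - 12) {
            throw new IllegalStateException("length field does not match frame size");
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        long expected = buf.getLong();
        CRC32 crc = new CRC32();
        crc.update(payload, 0, length);
        if (crc.getValue() != expected) {
            throw new IllegalStateException("transaction checksum mismatch");
        }
        return payload;
    }
}
```

In this sketch a failed check surfaces as an exception at the moment of receipt, which is where the receiving node would report back to the namenode instead of discovering the damage later during replay.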


> Checksum fsedits
> ----------------
>
>                 Key: HDFS-1630
>                 URL: https://issues.apache.org/jira/browse/HDFS-1630
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>
> HDFS-903 calculates an MD5 checksum for a saved image, so that we can verify the integrity
> of the image at load time.
> The other half of the story is how to verify fsedits. We could use the same checksum
> approach, but since an fsedit file grows constantly, a checksum per file does not work.
> I am thinking of adding a checksum per transaction. Is it doable or too expensive?

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
