hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-3875) Issue handling checksum errors in write pipeline
Date Thu, 30 Aug 2012 22:50:07 GMT
Todd Lipcon created HDFS-3875:

             Summary: Issue handling checksum errors in write pipeline
                 Key: HDFS-3875
                 URL: https://issues.apache.org/jira/browse/HDFS-3875
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: data-node, hdfs client
    Affects Versions: 2.2.0-alpha
            Reporter: Todd Lipcon

We saw this issue with one block in a large test cluster. The client was writing the data with
replication level 2, and we observed the following:
- The second node in the pipeline detected a checksum error on the data it received from the
first node. We don't know whether the client sent a bad checksum or the data was corrupted
between node 1 and node 2 in the pipeline.
- Because the second node threw an exception, it was kicked out of the pipeline.
The pipeline started up again with only one replica (the first node in the pipeline).
- The block scanner later determined this remaining replica to be corrupt, and it was
unrecoverable since it was the only replica.
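
The sequence above can be sketched as a toy model. This is not HDFS code: the
names ToyDataNode, write_through_pipeline, and find_corrupt are hypothetical, and
the model assumes that the first node stores the packet without verifying it
while the second node verifies on receipt, which is consistent with the outcome
in this report but is an assumption of the sketch. CRC32 from Python's zlib
stands in for the real checksum.

```python
import zlib
from dataclasses import dataclass, field


class ChecksumError(Exception):
    """Raised by a toy node when data does not match its checksum."""


@dataclass
class Packet:
    data: bytes
    checksum: int  # CRC32 as computed by the sender


@dataclass
class ToyDataNode:
    name: str
    verify_on_receive: bool  # assumption: only some nodes verify on receipt
    stored: list = field(default_factory=list)

    def receive(self, packet: Packet) -> None:
        if self.verify_on_receive and zlib.crc32(packet.data) != packet.checksum:
            raise ChecksumError(f"{self.name}: checksum mismatch")
        self.stored.append(packet)


def write_through_pipeline(packet, pipeline):
    """Send the packet to each node in order; drop any node that raises.

    Returns the nodes that remain in the pipeline.
    """
    survivors = []
    for node in pipeline:
        try:
            node.receive(packet)
            survivors.append(node)
        except ChecksumError:
            # The node that *detected* the error is the one removed.
            pass
    return survivors


def find_corrupt(node):
    """Toy block scanner: indices of stored packets that fail verification."""
    return [i for i, p in enumerate(node.stored)
            if zlib.crc32(p.data) != p.checksum]


# A packet whose data no longer matches its checksum. Where the corruption
# happened (client, or between node 1 and node 2) is unknown in the report.
bad_packet = Packet(data=b"hellp", checksum=zlib.crc32(b"hello"))

dn1 = ToyDataNode("dn1", verify_on_receive=False)
dn2 = ToyDataNode("dn2", verify_on_receive=True)

survivors = write_through_pipeline(bad_packet, [dn1, dn2])
corrupt_on_dn1 = find_corrupt(dn1)
```

In this sketch the pipeline ends up containing only dn1, and the scanner then
flags dn1's single copy as corrupt. The node that reported the problem is the
one that was removed, so the only surviving replica is the bad one.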
