hadoop-hdfs-dev mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-3889) distcp silently ignores missing checksums
Date Wed, 05 Sep 2012 01:11:07 GMT
Colin Patrick McCabe created HDFS-3889:

             Summary: distcp silently ignores missing checksums
                 Key: HDFS-3889
                 URL: https://issues.apache.org/jira/browse/HDFS-3889
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: tools
    Affects Versions: 2.2.0-alpha
            Reporter: Colin Patrick McCabe
            Priority: Minor

If distcp can't read the checksums for the source and destination files, for any reason,
it ignores the checksums and overwrites the destination file.  It does log an error,
but I think the correct behavior would be to throw an error and stop the distcp.

If the user really wants to ignore checksums, he or she can use {{-skipcrccheck}} to do so.

The relevant code is in DistCpUtils#checksumsAreEqual:
    try {
      sourceChecksum = sourceFS.getFileChecksum(source);
      targetChecksum = targetFS.getFileChecksum(target);
    } catch (IOException e) {
      // The exception is only logged; it is not rethrown.
      LOG.error("Unable to retrieve checksum for " + source + " or " + target, e);
    }
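
A minimal sketch of the stricter behavior proposed above, not the actual DistCp code. It assumes a hypothetical `ChecksumReader` interface standing in for `FileSystem#getFileChecksum`, plain byte arrays in place of `FileChecksum`, and a hypothetical `skipCrc` flag standing in for {{-skipcrccheck}}. A failed or missing checksum read becomes an error unless the user opted out:

```java
import java.io.IOException;
import java.util.Arrays;

public class StrictChecksumCompare {

  /** Hypothetical stand-in for FileSystem#getFileChecksum; may return null. */
  interface ChecksumReader {
    byte[] read(String path) throws IOException;
  }

  /**
   * Compares checksums, failing loudly when either one cannot be obtained.
   * skipCrc is a hypothetical parameter modelling the -skipcrccheck option.
   */
  static boolean checksumsAreEqual(ChecksumReader sourceFS, String source,
                                   ChecksumReader targetFS, String target,
                                   boolean skipCrc) throws IOException {
    if (skipCrc) {
      // The user explicitly asked to ignore checksums: treat as matching.
      return true;
    }
    byte[] sourceChecksum;
    byte[] targetChecksum;
    try {
      sourceChecksum = sourceFS.read(source);
      targetChecksum = targetFS.read(target);
    } catch (IOException e) {
      // Rethrow instead of only logging, so the copy stops.
      throw new IOException(
          "Unable to retrieve checksum for " + source + " or " + target, e);
    }
    if (sourceChecksum == null || targetChecksum == null) {
      // A missing checksum is also an error in this sketch.
      throw new IOException(
          "Missing checksum for " + source + " or " + target);
    }
    return Arrays.equals(sourceChecksum, targetChecksum);
  }
}
```

Whether a null checksum should also be fatal is a design choice made only for this sketch; some file systems may legitimately return no checksum, so a real patch might need to treat that case separately.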

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
