hadoop-hdfs-user mailing list archives

From Akira AJISAKA <ajisa...@oss.nttdata.co.jp>
Subject Re: DistCp CRC failure modes
Date Thu, 28 Apr 2016 05:43:31 GMT
Thank you, Elliot!

On 4/28/16 03:40, Elliot West wrote:
> I've raised this as an issue:
>
> https://issues.apache.org/jira/browse/HDFS-10338
>
> On Wednesday, 27 April 2016, Elliot West <teabot@gmail.com> wrote:
>
>     Hello,
>
>     We are using DistCp V2 to replicate data between two HDFS file
>     systems. We were working on the assumption that we could rely on CRC
>     checks to ensure that the data was replicated correctly. However,
>     after examining the DistCp source code it seems that there are edge
>     cases where the CRCs could differ and yet the copy succeeds even
>     when we are not skipping CRC checks.
>
>     I'm wondering whether this is by design and if so, the reasoning
>     behind it? If this is a bug, I'd like to raise an issue to fix it.
>     If it is by design, I'd like to propose the introduction of an
>     option for stricter CRC checks.
>
>     The code in question is contained in the method:
>
>         org.apache.hadoop.tools.util.DistCpUtils#checksumsAreEqual(...)
>
>     which can be seen here:
>
>         https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java#L457
>
>
>     Specifically this code block suggests that if there is a failure
>     when trying to read the source or target checksum then the method
>     will return 'true', implying that the check succeeded. In actual
>     fact we just failed to obtain the checksum and could perform no check.
>
>          try {
>            sourceChecksum = sourceChecksum != null ? sourceChecksum
>                : sourceFS.getFileChecksum(source);
>            targetChecksum = targetFS.getFileChecksum(target);
>          } catch (IOException e) {
>            LOG.error("Unable to retrieve checksum for " + source + " or "
>                + target, e);
>          }
>          return (sourceChecksum == null || targetChecksum == null ||
>                  sourceChecksum.equals(targetChecksum));
>
>     Ideally I'd like to be able to configure a check that requires
>     both the source and target CRCs to be retrieved and compared, and
>     that throws an exception if either CRC retrieval fails for any
>     reason. I do appreciate that some FileSystems cannot return CRCs,
>     but these could still be handled correctly as they would simply
>     return null and not throw an exception (I assume).
>
>     I'd appreciate any thoughts on this matter.
>
>     Elliot.
>
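
For reference, the stricter check described above might look roughly like
the sketch below. It is not the DistCp implementation and not a proposed
patch. The class StrictChecksumSketch, the ChecksumSupplier interface and
the method checksumsAreEqualStrict are all hypothetical names, and a plain
String stands in for Hadoop's checksum type so that the example compiles
without Hadoop on the classpath. It also takes on the assumption from the
original message that a file system without checksum support returns null
rather than throwing.

```java
import java.io.IOException;

// Hypothetical sketch: none of these names exist in Hadoop or DistCp.
final class StrictChecksumSketch {

  /** Stand-in for a call such as sourceFS.getFileChecksum(source). */
  @FunctionalInterface
  interface ChecksumSupplier {
    // Assumed to return null when the file system has no checksum support.
    String get() throws IOException;
  }

  /**
   * Strict comparison: a failed retrieval propagates as an IOException
   * instead of being logged and then treated as a successful match.
   */
  static boolean checksumsAreEqualStrict(ChecksumSupplier source,
                                         ChecksumSupplier target)
      throws IOException {
    String sourceChecksum = source.get(); // throws if retrieval fails
    String targetChecksum = target.get(); // throws if retrieval fails
    if (sourceChecksum == null || targetChecksum == null) {
      // Assumption: null means "checksums unsupported", so there is
      // nothing to compare and the check is treated as passed.
      return true;
    }
    return sourceChecksum.equals(targetChecksum);
  }

  public static void main(String[] args) throws IOException {
    System.out.println(checksumsAreEqualStrict(() -> "abc", () -> "abc"));
    System.out.println(checksumsAreEqualStrict(() -> "abc", () -> "xyz"));
    try {
      checksumsAreEqualStrict(() -> "abc", () -> {
        throw new IOException("checksum unavailable");
      });
    } catch (IOException e) {
      System.out.println("retrieval failed: " + e.getMessage());
    }
  }
}
```

The only behavioural difference from the quoted block is that the
IOException is no longer caught, so a caller can tell "could not obtain a
checksum" apart from "checksums match". Whether a null checksum should
also fail the copy is a separate decision that the sketch leaves as is.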

