hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8430) Erasure coding: compute file checksum for stripe files
Date Thu, 14 Jan 2016 05:48:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097645#comment-15097645 ]

Tsz Wo Nicholas Sze commented on HDFS-8430:

[~drankye], sorry for the late reply.  Your suggestion sounds good in general.  Some minor comments:

> First, add a new API like getFileChecksum(int cell) using the New Algorithm 2. ...

It would be better to add the new API as getFileChecksum(String algorithm), since it is more
general and more consistent with Java APIs such as MessageDigest.  That way, the FileSystem
API would not need to change again if we want to support different algorithms in the future.

We may also need another FileSystem API, supportFileChecksum(String algorithm), so that distcp
or other tools can check whether a particular algorithm is supported; see below.
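
As a rough illustration of the two methods proposed above, here is a minimal Java sketch.
It is not the Hadoop FileSystem API: the ChecksumFs interface, the InMemoryFs class, the
String path parameter and the byte[] return type are all assumptions made for the example.
Only the algorithm-name lookup follows the real MessageDigest.getInstance(String) call.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the proposed FileSystem methods (not the real Hadoop API).
interface ChecksumFs {
    // True if this file system can compute a checksum with the named algorithm.
    boolean supportFileChecksum(String algorithm);

    // Checksum of the file at 'path' using the named algorithm.
    byte[] getFileChecksum(String path, String algorithm);
}

// Toy in-memory file system, used only to make the sketch runnable.
class InMemoryFs implements ChecksumFs {
    private final Map<String, byte[]> files = new HashMap<>();
    private final Set<String> algorithms;

    InMemoryFs(String... algorithms) {
        this.algorithms = new HashSet<>(Arrays.asList(algorithms));
    }

    void put(String path, String content) {
        files.put(path, content.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public boolean supportFileChecksum(String algorithm) {
        return algorithms.contains(algorithm);
    }

    @Override
    public byte[] getFileChecksum(String path, String algorithm) {
        if (!supportFileChecksum(algorithm)) {
            throw new IllegalArgumentException("unsupported algorithm: " + algorithm);
        }
        byte[] data = files.get(path);
        if (data == null) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        try {
            // Algorithms are selected by name, as with MessageDigest itself.
            return MessageDigest.getInstance(algorithm).digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown to the JVM: " + algorithm, e);
        }
    }
}

class ChecksumApiDemo {
    public static void main(String[] args) {
        InMemoryFs fs = new InMemoryFs("MD5", "SHA-256");
        fs.put("/a", "hello");
        for (String algorithm : new String[] {"MD5", "SHA-256", "CRC32C"}) {
            if (fs.supportFileChecksum(algorithm)) {
                int bytes = fs.getFileChecksum("/a", algorithm).length;
                System.out.println(algorithm + ": " + bytes + "-byte checksum");
            } else {
                System.out.println(algorithm + ": not supported");
            }
        }
    }
}
```

With a String parameter, supporting a further algorithm only extends the set of accepted
names; the method signatures stay the same.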

> distcp will be updated to favor the new APIs and use the two APIs appropriately. ...

distcp probably needs to first check whether the same algorithm is supported in both the
source and the destination clusters.  If they don't support the same algorithm, it may fall
back to using the file length.
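
The check-then-fall-back flow could look roughly like the Java sketch below.  This is not
distcp code: ToyCluster, CopyVerifier, commonAlgorithm, looksIdentical and the preferred-
algorithm list are hypothetical names invented for the example, and the clusters are plain
in-memory maps.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical in-memory stand-in for a cluster's file system.
class ToyCluster {
    private final Set<String> algorithms;
    private final Map<String, byte[]> files = new HashMap<>();

    ToyCluster(String... algorithms) {
        this.algorithms = new HashSet<>(Arrays.asList(algorithms));
    }

    ToyCluster put(String path, String content) {
        files.put(path, content.getBytes(StandardCharsets.UTF_8));
        return this;
    }

    boolean supportFileChecksum(String algorithm) {
        return algorithms.contains(algorithm);
    }

    long getLength(String path) {
        return files.get(path).length;
    }

    byte[] getFileChecksum(String path, String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm).digest(files.get(path));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown algorithm: " + algorithm, e);
        }
    }
}

class CopyVerifier {
    // First algorithm in 'preferred' that both clusters support, if any.
    static Optional<String> commonAlgorithm(ToyCluster src, ToyCluster dst,
                                            List<String> preferred) {
        for (String algorithm : preferred) {
            if (src.supportFileChecksum(algorithm) && dst.supportFileChecksum(algorithm)) {
                return Optional.of(algorithm);
            }
        }
        return Optional.empty();
    }

    // Compare checksums when a shared algorithm exists; otherwise compare lengths only.
    static boolean looksIdentical(ToyCluster src, ToyCluster dst, String path,
                                  List<String> preferred) {
        Optional<String> algorithm = commonAlgorithm(src, dst, preferred);
        if (algorithm.isPresent()) {
            return MessageDigest.isEqual(src.getFileChecksum(path, algorithm.get()),
                                         dst.getFileChecksum(path, algorithm.get()));
        }
        return src.getLength(path) == dst.getLength(path);
    }
}

class CopyVerifierDemo {
    public static void main(String[] args) {
        List<String> preferred = Arrays.asList("MD5", "SHA-256");
        ToyCluster src = new ToyCluster("MD5", "SHA-256").put("/f", "abcd");
        ToyCluster dst = new ToyCluster("SHA-256").put("/f", "abce");
        System.out.println("common algorithm: " + CopyVerifier.commonAlgorithm(src, dst, preferred));
        System.out.println("looks identical: " + CopyVerifier.looksIdentical(src, dst, "/f", preferred));
    }
}
```

Note that the length-only fallback is much weaker than a checksum comparison: two files of
equal length but different content would pass it.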

Thanks a lot!

> Erasure coding: compute file checksum for stripe files
> ------------------------------------------------------
>                 Key: HDFS-8430
>                 URL: https://issues.apache.org/jira/browse/HDFS-8430
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Walter Su
>            Assignee: Kai Zheng
>         Attachments: HDFS-8430-poc1.patch
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for striped block groups.

This message was sent by Atlassian JIRA