hadoop-hdfs-issues mailing list archives

From "Adam Antal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13960) hdfs dfs -checksum command should optionally show block size in output
Date Tue, 30 Oct 2018 09:23:01 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668406#comment-16668406 ]

Adam Antal commented on HDFS-13960:

Great, thanks! It's a low-priority issue, so it's not urgent; I was just checking in.

> hdfs dfs -checksum command should optionally show block size in output
> ----------------------------------------------------------------------
>                 Key: HDFS-13960
>                 URL: https://issues.apache.org/jira/browse/HDFS-13960
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Adam Antal
>            Assignee: Lokesh Jain
>            Priority: Minor
> The hdfs checksum command computes the checksum in a distributed manner, which takes the
> block size into account. In other words, the block size determines how the file is
> broken up.
> Therefore it can happen that the checksum command produces different outputs for the
> exact same file when only the block size differs: checksum(fileABlock1) + checksum(fileABlock2)
> != checksum(fileABlock1 + fileABlock2)
> I suggest adding an option to the hdfs dfs -checksum command that displays the
> block size along with the output; this could also be helpful in other cases where
> this piece of information is needed.
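
The block-size dependence described above can be illustrated with a small Python sketch. It is a simplified model, not the actual HDFS implementation: it assumes a layered scheme (a CRC per fixed-size chunk, an MD5 over the CRCs of each block, then an MD5 over the per-block digests). The names BYTES_PER_CRC, block_digest and file_checksum, and the chunk and block sizes, are hypothetical and chosen only for illustration.

```python
import hashlib
import struct
import zlib

BYTES_PER_CRC = 512  # assumed chunk size for this sketch only


def block_digest(block: bytes) -> bytes:
    """Return the MD5 over the CRC32 of each fixed-size chunk in one block."""
    md5 = hashlib.md5()
    for i in range(0, len(block), BYTES_PER_CRC):
        crc = zlib.crc32(block[i:i + BYTES_PER_CRC])
        md5.update(struct.pack(">I", crc))  # 4-byte big-endian CRC
    return md5.digest()


def file_checksum(data: bytes, block_size: int) -> str:
    """Return the MD5 over the per-block digests, as a hex string."""
    outer = hashlib.md5()
    for i in range(0, len(data), block_size):
        outer.update(block_digest(data[i:i + block_size]))
    return outer.hexdigest()


# Same 16 KiB of content, split with two different (hypothetical) block sizes.
data = bytes(range(256)) * 64
small = file_checksum(data, block_size=4096)
large = file_checksum(data, block_size=8192)
print(small)
print(large)
print("equal:", small == large)
```

In this model the two results differ even though the bytes are identical, because the outer digest is computed over a different number of per-block digests. That is why showing the block size next to the checksum would help when comparing files.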

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org