hadoop-hdfs-issues mailing list archives

From "Dennis Huo (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-13056) Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts
Date Sat, 25 May 2019 20:20:00 GMT

     [ https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dennis Huo updated HDFS-13056:
------------------------------
    Release Note: 
<!-- markdown --> 

This feature adds a new `COMPOSITE_CRC` FileChecksum type which uses CRC composition to remain
completely chunk/block agnostic, and allows comparison between striped and replicated files,
between different HDFS instances, and potentially even between HDFS and other external storage
systems or local files.

Without this option, HDFS FileChecksums are computed as "MD5 of MD5 of CRC" across chunks
and blocks, so the checksums depend on the configured chunk and block sizes rather than
only reflecting the logical file contents. The feature works by changing the way individual
chunk and block checksums are combined at invocation time, so it doesn't depend on or affect
any persistent metadata; it can be used in-place in existing HDFS deployments without any
recomputation of block metadata.
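
To illustrate why composition makes the result layout-agnostic, here is a minimal, self-contained
Java sketch. It is not the Hadoop implementation; it ports zlib's well-known `crc32_combine`
algorithm for the legacy "gzip" polynomial, and the class and data names are made up for the demo.
It shows that the CRC of concatenated data can be derived from the CRCs of its parts plus the
length of the second part, without re-reading any bytes:

    import java.util.zip.CRC32;

    // Minimal sketch, NOT the Hadoop code: a Java port of zlib's crc32_combine,
    // using GF(2) matrix exponentiation to "append" len2 zero bytes to crc1 and
    // then XOR in crc2.
    public class CrcCombineDemo {

      // Multiply a GF(2) 32x32 matrix (one row per long) by a 32-bit vector.
      private static long gf2MatrixTimes(long[] mat, long vec) {
        long sum = 0;
        for (int i = 0; vec != 0; vec >>>= 1, i++) {
          if ((vec & 1) != 0) {
            sum ^= mat[i];
          }
        }
        return sum;
      }

      // square = mat * mat in GF(2).
      private static void gf2MatrixSquare(long[] square, long[] mat) {
        for (int n = 0; n < 32; n++) {
          square[n] = gf2MatrixTimes(mat, mat[n]);
        }
      }

      // Returns crc(A || B) given crc1 = crc(A), crc2 = crc(B), len2 = len(B).
      static long crc32Combine(long crc1, long crc2, long len2) {
        if (len2 <= 0) {
          return crc1;
        }
        long[] even = new long[32];
        long[] odd = new long[32];
        odd[0] = 0xedb88320L;            // reflected "gzip" CRC-32 polynomial
        long row = 1;
        for (int n = 1; n < 32; n++) {
          odd[n] = row;
          row <<= 1;
        }
        gf2MatrixSquare(even, odd);      // operator for 2 zero bits
        gf2MatrixSquare(odd, even);      // operator for 4 zero bits
        do {                             // apply len2 zero bytes to crc1
          gf2MatrixSquare(even, odd);
          if ((len2 & 1) != 0) {
            crc1 = gf2MatrixTimes(even, crc1);
          }
          len2 >>>= 1;
          if (len2 == 0) {
            break;
          }
          gf2MatrixSquare(odd, even);
          if ((len2 & 1) != 0) {
            crc1 = gf2MatrixTimes(odd, crc1);
          }
          len2 >>>= 1;
        } while (len2 != 0);
        return crc1 ^ crc2;
      }

      public static void main(String[] args) {
        byte[] a = "contents of block one".getBytes();
        byte[] b = "contents of block two".getBytes();
        CRC32 crcA = new CRC32();  crcA.update(a);
        CRC32 crcB = new CRC32();  crcB.update(b);
        CRC32 whole = new CRC32(); whole.update(a); whole.update(b);

        long composed = crc32Combine(crcA.getValue(), crcB.getValue(), b.length);
        // Composition over per-part CRCs reproduces the CRC of the whole data.
        System.out.println(composed == whole.getValue());  // true
      }
    }

Because each block or chunk only enters through its CRC and its length, composing them in file
order yields the same value no matter how the file happens to be split.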

This option can be enabled or disabled at the granularity of individual client calls by setting
the new configuration option `dfs.checksum.combine.mode` to `COMPOSITE_CRC`:

    hdfs dfs -Ddfs.checksum.combine.mode=COMPOSITE_CRC -checksum hdfs:///tmp/foo.txt

By default, this configuration option is set to `MD5MD5CRC`, which preserves the existing,
fully backwards-compatible behavior of HDFS.
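
The same combine mode can be requested programmatically. Below is a hedged sketch using the
standard Hadoop `FileSystem.getFileChecksum` client API; the cluster URIs, path, and class name
are placeholders for this example:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileChecksum;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Sketch: compare one file across two clusters using composite CRCs.
    public class CompositeCrcCompare {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Ask for invocation-time CRC composition instead of MD5-of-MD5-of-CRC.
        conf.set("dfs.checksum.combine.mode", "COMPOSITE_CRC");

        FileSystem srcFs = FileSystem.get(URI.create("hdfs://cluster-a"), conf);
        FileSystem dstFs = FileSystem.get(URI.create("hdfs://cluster-b"), conf);
        Path path = new Path("/tmp/foo.txt");

        FileChecksum src = srcFs.getFileChecksum(path);
        FileChecksum dst = dstFs.getFileChecksum(path);

        // With COMPOSITE_CRC the two values are comparable even if the clusters
        // use different block or chunk sizes, provided both files were written
        // with the same underlying checksum type (CRC32 or CRC32C).
        System.out.println(src.getAlgorithmName() + " / " + dst.getAlgorithmName());
        System.out.println("match: " + src.equals(dst));
      }
    }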

Note that this "combine mode" is orthogonal to the existing `dfs.checksum.type` configuration
option, which configures the checksum algorithm for the individual chunk CRCs stored in block
metadata, where `CRC32` specifies the legacy "gzip" CRC polynomial and `CRC32C` specifies the
"Castagnoli" polynomial. The composite CRC mode adheres to the checksum type stored in the
block metadata; if an HDFS instance stores a mixture of files with `CRC32` and `CRC32C` types,
then FileChecksums requested with `COMPOSITE_CRC` will likewise produce the corresponding
mixture of `COMPOSITE-CRC32` and `COMPOSITE-CRC32C` types.
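
As a quick illustration of why the two types are not interchangeable when comparing checksums,
the following sketch uses the JDK's built-in CRC implementations (not HDFS code) to checksum the
same bytes with both polynomials:

    import java.util.zip.CRC32;
    import java.util.zip.CRC32C;

    // The two chunk-CRC types use different polynomials, so checksums of the
    // same bytes differ and are not comparable across types.
    public class CrcTypeDemo {
      public static void main(String[] args) {
        byte[] data = "the same file contents".getBytes();

        CRC32 gzip = new CRC32();         // legacy "gzip" polynomial (CRC32)
        CRC32C castagnoli = new CRC32C(); // "Castagnoli" polynomial (CRC32C), Java 9+
        gzip.update(data);
        castagnoli.update(data);

        // Different values for identical input, hence COMPOSITE-CRC32 and
        // COMPOSITE-CRC32C results for the same file will not match.
        System.out.printf("CRC32  = %08x%n", gzip.getValue());
        System.out.printf("CRC32C = %08x%n", castagnoli.getValue());
      }
    }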

> Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13056
>                 URL: https://issues.apache.org/jira/browse/HDFS-13056
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, distcp, erasure-coding, federation, hdfs
>    Affects Versions: 3.0.0
>            Reporter: Dennis Huo
>            Assignee: Dennis Huo
>            Priority: Major
>             Fix For: 3.2.0, 3.1.1
>
>         Attachments: HDFS-13056-branch-2.8.001.patch, HDFS-13056-branch-2.8.002.patch,
HDFS-13056-branch-2.8.003.patch, HDFS-13056-branch-2.8.004.patch, HDFS-13056-branch-2.8.005.patch,
HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, HDFS-13056.002.patch, HDFS-13056.003.patch,
HDFS-13056.003.patch, HDFS-13056.004.patch, HDFS-13056.005.patch, HDFS-13056.006.patch, HDFS-13056.007.patch,
HDFS-13056.008.patch, HDFS-13056.009.patch, HDFS-13056.010.patch, HDFS-13056.011.patch, HDFS-13056.012.patch,
HDFS-13056.013.patch, HDFS-13056.014.patch, Reference_only_zhen_PPOC_hadoop2.6.X.diff, hdfs-file-composite-crc32-v1.pdf,
hdfs-file-composite-crc32-v2.pdf, hdfs-file-composite-crc32-v3.pdf
>
>
> FileChecksum was first introduced in [https://issues-test.apache.org/jira/browse/HADOOP-3981] and
has ever since remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are
already stored as part of datanode metadata, and the MD5 approach is used to compute an aggregate
value in a distributed manner, with individual datanodes computing the MD5-of-CRCs per block
in parallel and the HDFS client computing the second-level MD5.
>  
> An often-cited shortcoming of this approach is that the FileChecksum is sensitive to the
internal block-size and chunk-size configuration, and thus different HDFS files with different
block/chunk settings cannot be compared. More commonly, one might have different HDFS clusters
which use different block sizes, in which case any data migration won't be able to use the
FileChecksum for distcp's rsync functionality or for verifying end-to-end data integrity (on
top of the low-level data integrity checks applied at data transfer time).
>  
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 during the
addition of checksum support for striped erasure-coded files; while there was some discussion
of using CRC composability, it ultimately settled on the hierarchical MD5 approach, which also
adds the problem that checksums of basic replicated files are not comparable to those of striped files.
>  
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses CRC composition
to remain completely chunk/block agnostic, and allows comparison between striped and replicated
files, between different HDFS instances, and possibly even between HDFS and other external
storage systems. This feature can also be adopted in-place because it is compatible with existing
block metadata and doesn't need to change the normal path of chunk verification, so it is
minimally invasive. This also means even large preexisting HDFS deployments could adopt this feature
to retroactively sync data. A detailed design document can be found here: https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf
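
To illustrate the layout sensitivity described in the quoted issue text above, here is a rough,
hedged Java sketch. It is a toy model, not the wire-exact HDFS computation, and the chunk and
block sizes are arbitrary; it applies the MD5-of-MD5-of-CRC aggregation to the same bytes under
two different block layouts and shows the resulting digests diverge:

    import java.security.MessageDigest;
    import java.util.Arrays;
    import java.util.Random;
    import java.util.zip.CRC32;

    // Toy model, not the wire-exact HDFS format: CRC each chunk, MD5 the chunk
    // CRCs per block, then MD5 the per-block digests. Regrouping the same bytes
    // into different block sizes changes the final digest, which is the layout
    // sensitivity that COMPOSITE_CRC avoids.
    public class Md5Md5CrcDemo {

      static byte[] fileChecksum(byte[] data, int chunkSize, int chunksPerBlock)
          throws Exception {
        MessageDigest topMd5 = MessageDigest.getInstance("MD5");
        int blockSize = chunkSize * chunksPerBlock;
        for (int blockStart = 0; blockStart < data.length; blockStart += blockSize) {
          MessageDigest blockMd5 = MessageDigest.getInstance("MD5");
          int blockEnd = Math.min(data.length, blockStart + blockSize);
          for (int off = blockStart; off < blockEnd; off += chunkSize) {
            CRC32 crc = new CRC32();
            crc.update(data, off, Math.min(chunkSize, blockEnd - off));
            long v = crc.getValue();   // feed the 4-byte chunk CRC to the block MD5
            blockMd5.update(new byte[] {
                (byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v});
          }
          topMd5.update(blockMd5.digest());  // second-level MD5 over block digests
        }
        return topMd5.digest();
      }

      public static void main(String[] args) throws Exception {
        byte[] data = new byte[10_000];
        new Random(42).nextBytes(data);

        byte[] fourChunkBlocks = fileChecksum(data, 512, 4);
        byte[] eightChunkBlocks = fileChecksum(data, 512, 8);
        // Same bytes, different block layouts => different hierarchical digests.
        System.out.println(Arrays.equals(fourChunkBlocks, eightChunkBlocks)); // false
      }
    }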



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

