hadoop-hdfs-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dennis Huo (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-13191) Internal buffer-sizing details are inadvertently baked into FileChecksum and BlockGroupChecksum
Date Sat, 24 Feb 2018 02:16:00 GMT
Dennis Huo created HDFS-13191:

             Summary: Internal buffer-sizing details are inadvertently baked into FileChecksum
and BlockGroupChecksum
                 Key: HDFS-13191
                 URL: https://issues.apache.org/jira/browse/HDFS-13191
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs, hdfs-client
            Reporter: Dennis Huo

The org.apache.hadoop.io.DataOutputBuffer is used as an "optimization" in many places to allow
a reusable form of ByteArrayOutputStream, but requires the caller to be careful to use getLength()
instead of getData().length to determine the number of actually valid bytes to consume.

At least three places in the path of constructing FileChecksums have incorrect usage of DataOutputBuffer:

[FileChecksumHelper digesting block MD5s|https://github.com/apache/hadoop/blob/329a4fdd07ab007615f34c8e0e651360f988064d/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/FileChecksumHelper.java#L239]

[BlockChecksumHelper digesting striped block MD5s to construct block-group checksum|https://github.com/apache/hadoop/blob/329a4fdd07ab007615f34c8e0e651360f988064d/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockChecksumHelper.java#L412]


The net effect is that FileChecksum consumes exact BlockChecksums if there are 1 or 2 blocks
(at 16 and 32 bytes respectively), but at 3 blocks will round up to 64 bytes, effectively
returning the same FileChecksum as if there were 4 blocks and the 4th block happened to have
an MD5 exactly equal to 0x00...00. Similarly, BlockGroupChecksum will behave as if there is
a power-of-2 number of bytes from BlockChecksums in the BlockGroup.

This appears to have been a latent bug for at least 9 years for FileChecksum (and since inception
for the implementation of striped files), and works fine as long as HDFS implementations strictly
stick to the same internal buffering semantics.

However, this also makes the implementation extremely brittle unless carefully documented.
For example, if code is ever refactored to pass around a MessageDigest that consumes block
MD5s as they come rather than writing into a DataOutputBuffer before digesting the entire
buffer, then the resulting checksum calculations will change unexpectedly.

At the same time, "fixing" the bug would also be backwards-incompatible, so the bug might
need to stick around. At least for the FileChecksum-level calculation, it seems the bug has
been latent for a very long time. Since striped files are fairly new, the BlockChecksumHelper
could probably be fixed sooner rather than later to avoid perpetuating a bug. The getBytes()
method for FileChecksum is more innocuous, so could likely be fixed or left as-is without
too much impact either way.

The bug can be highlighted by changing the internal buffer-growing semantics of the DataOutputBuffer,
or simply returning a randomly-sized byte buffer in getData() while only ensuring the first
getLength() bytes are actually present, for example:

diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java
index 4c2fa67f8f2..f2df94e898f 100644
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java
@@ -103,7 +103,17 @@ private DataOutputBuffer(Buffer buffer) {
/** Returns the current contents of the buffer.
* Data is only valid to {@link #getLength()}.
- public byte[] getData() { return buffer.getData(); }
+ public byte[] getData() {
+ java.util.Random rand = new java.util.Random();
+ byte[] bufferData = buffer.getData();
+ byte[] ret = new byte[rand.nextInt(bufferData.length) + bufferData.length];
+ System.arraycopy(bufferData, 0, ret, 0, getLength());
+ return ret;
+ }


This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org

View raw message