hadoop-hdfs-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3429) DataNode reads checksums even if client does not need them
Date Sat, 22 Sep 2012 05:27:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461028#comment-13461028 ]

stack commented on HDFS-3429:

Thanks for having a go at this one Todd.

-    if ( bytesPerChecksum <= 0 ) {
+    if ( type != Type.NULL && bytesPerChecksum <= 0 ) {

So Type.NULL combined with bytesPerChecksum <= 0 is the flag that means 'skip checksum'?  If so,
a comment wouldn't be amiss here.

Why not just let it fall through to Type.NULL?  It'll return DataChecksum w/ ChecksumNull.
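To make the question concrete, here is a minimal sketch of how I read the new guard. It is a simplified stand-in, not the real DataChecksum code: the class name, the two-value Type enum, and the String return value are all made up for illustration. The point is only that a non-positive bytesPerChecksum is rejected for real checksum types but falls through to the null checksum for Type.NULL.

```java
// Simplified stand-in for the guard under discussion; NOT the real Hadoop DataChecksum.
// ChecksumGuardSketch, its Type enum, and the String results are hypothetical.
public class ChecksumGuardSketch {
    enum Type { NULL, CRC32 }

    /** Returns a label for the checksum that would be built, or null for invalid arguments. */
    static String newChecksum(Type type, int bytesPerChecksum) {
        // Only real checksum types need a positive chunk size;
        // Type.NULL is allowed through even with bytesPerChecksum <= 0.
        if (type != Type.NULL && bytesPerChecksum <= 0) {
            return null;
        }
        switch (type) {
            case NULL:
                return "ChecksumNull";
            case CRC32:
                return "CRC32/" + bytesPerChecksum;
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(newChecksum(Type.NULL, 0));    // falls through to the null checksum
        System.out.println(newChecksum(Type.CRC32, 0));   // rejected by the guard
        System.out.println(newChecksum(Type.CRC32, 512)); // normal case
    }
}
```

If that reading is right, the comment I'm asking for would just say that Type.NULL with a non-positive bytesPerChecksum is legal and means no checksums are in play.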

Is it ok adding an extra param here?

-      final long length) throws IOException;
+      final long length,
+      final boolean sendChecksum) throws IOException;

It won't break the protocol?  We can go against older versions of 2.0.x-alpha?  I suppose
we're pb'ing -- I can see that later in patch -- so probably fine?

Is this change related?  (Reading more, it looks like you just moved the check higher
up in the method -- ok)

+      length = length < 0 ? replicaVisibleLength : length;


+        if (metaIn == null) {

Javadoc param name does not match name you have in method sig: maxBytesToSend

Looks like this patch defaults to reading the checksum and sending it to the client.  Is that
new?  Sending the client the checksum?  The verify flag is already in the proto, just not hooked up?

I got lost trying to follow the sizings in BlockSender... if it's wrong, failure should be
pretty spectacular.
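For what it's worth, this is the arithmetic I was trying to keep in my head, written out as a rough sketch. It is not BlockSender's actual code: the class and method names are hypothetical, and it assumes one checksum of checksumSize bytes per bytesPerChecksum-sized chunk of data, with the checksum bytes dropped entirely when sendChecksum is false.

```java
// Rough sizing arithmetic only; NOT the real BlockSender logic.
// PacketSizeSketch and its methods are hypothetical names.
public class PacketSizeSketch {
    /** Number of checksum chunks needed to cover dataLen bytes (rounds up). */
    static int chunksFor(int dataLen, int bytesPerChecksum) {
        return (dataLen + bytesPerChecksum - 1) / bytesPerChecksum;
    }

    /** Bytes on the wire for one packet body: data plus, optionally, one checksum per chunk. */
    static int payloadBytes(int dataLen, int bytesPerChecksum,
                            int checksumSize, boolean sendChecksum) {
        if (!sendChecksum) {
            // No checksums requested: only the data is sent, and
            // bytesPerChecksum is never consulted (it may be 0 here).
            return dataLen;
        }
        return dataLen + chunksFor(dataLen, bytesPerChecksum) * checksumSize;
    }

    public static void main(String[] args) {
        // 1000 bytes of data, 512-byte chunks, 4-byte checksums
        System.out.println(payloadBytes(1000, 512, 4, true));  // 1000 + 2 * 4 = 1008
        System.out.println(payloadBytes(1000, 512, 4, false)); // 1000
    }
}
```

If the sender and the client disagree on either of those two numbers, the client would read checksum bytes as data or the other way around, which is why I'd expect a mistake to show up immediately.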

Patch looks good Todd.

> DataNode reads checksums even if client does not need them
> ----------------------------------------------------------
>                 Key: HDFS-3429
>                 URL: https://issues.apache.org/jira/browse/HDFS-3429
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, performance
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3429.txt
> Currently, even if the client does not want to verify checksums, the datanode reads them
> anyway and sends them over the wire. This means that performance improvements like HBase's
> application-level checksums don't have much benefit when reading through the datanode, since
> the DN is still causing seeks into the checksum file.
> (Credit goes to Dhruba for discovering this - filing on his behalf)
