hadoop-hdfs-issues mailing list archives

From "LiuLei (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3429) DataNode reads checksums even if client does not need them
Date Sat, 10 Nov 2012 09:27:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494594#comment-13494594
] 

LiuLei commented on HDFS-3429:
------------------------------

Here is my understanding of the problem. There are two reasons the DN needs to read checksums
from the meta file:
1. The server itself needs to verify checksums, for example in the block scanner.
2. The DFSClient needs to verify checksums. In this case the DN reads the checksums but does not
verify them; instead, the DN sends the checksums to the DFSClient, and the DFSClient verifies them.

So we need two parameters to represent these two purposes:
1. The BlockSender constructor already has a verifyChecksum parameter, which indicates whether
the server verifies checksums.
2. The FileSystem.setVerifyChecksum(boolean verifyChecksum) method indicates whether the DFSClient
verifies checksums, so we need to send that value to the DN and add an isClientVerifyChecksum
parameter to the BlockSender constructor.

If both the verifyChecksum and isClientVerifyChecksum parameters are false, the DN does not need to read
checksums and only needs to send data to the client. In that case we only need to create a DataChecksum.CHECKSUM_NULL
instance, which guarantees that the DN does not read checksums from the meta file (because the checksumSize
of the DataChecksum.CHECKSUM_NULL instance is 0).
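
The rule above can be sketched roughly as follows. This is only an illustration, not the code in the
patch: ChecksumPlan, Kind, choose and metaBytesToRead are hypothetical names standing in for the real
BlockSender and DataChecksum logic, and the 4-byte CRC32 size and 512 bytes per checksum chunk are
assumed values for the example.

```java
public class ChecksumPlan {
    /** Hypothetical stand-in for a checksum type; only the per-chunk size matters here. */
    enum Kind {
        NULL(0),   // mirrors the idea of CHECKSUM_NULL: checksumSize is 0
        CRC32(4);  // assumed 4 bytes of checksum per chunk

        final int checksumSize;

        Kind(int checksumSize) {
            this.checksumSize = checksumSize;
        }
    }

    /** Pick the null checksum only when neither the server nor the client verifies. */
    static Kind choose(boolean verifyChecksum, boolean isClientVerifyChecksum) {
        return (verifyChecksum || isClientVerifyChecksum) ? Kind.CRC32 : Kind.NULL;
    }

    /** Bytes that would have to be read from the meta file for dataLen bytes of block data. */
    static long metaBytesToRead(Kind kind, long dataLen, int bytesPerChecksum) {
        long chunks = (dataLen + bytesPerChecksum - 1) / bytesPerChecksum; // round up
        return chunks * kind.checksumSize;
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        for (boolean server : flags) {
            for (boolean client : flags) {
                Kind kind = choose(server, client);
                System.out.println("verifyChecksum=" + server
                        + " isClientVerifyChecksum=" + client
                        + " -> " + kind
                        + ", meta bytes for 64 KB: " + metaBytesToRead(kind, 65536, 512));
            }
        }
    }
}
```

With a checksumSize of 0 the number of meta file bytes to read is always 0, whatever the data length,
which is why no separate "skip the meta file" code path is needed in this sketch.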


The patch I submitted contains these changes.

> DataNode reads checksums even if client does not need them
> ----------------------------------------------------------
>
>                 Key: HDFS-3429
>                 URL: https://issues.apache.org/jira/browse/HDFS-3429
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, performance
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3429-0.20.2.patch, hdfs-3429.txt, hdfs-3429.txt
>
>
> Currently, even if the client does not want to verify checksums, the datanode reads them
> anyway and sends them over the wire. This means that performance improvements like HBase's
> application-level checksums don't have much benefit when reading through the datanode, since
> the DN is still causing seeks into the checksum file.
> (Credit goes to Dhruba for discovering this - filing on his behalf)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
