hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-3429) DataNode reads checksums even if client does not need them
Date Mon, 12 Nov 2012 03:38:13 GMT

     [ https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Todd Lipcon updated HDFS-3429:

    Attachment: hdfs-3429.txt

Aha, I understand now what your patch is doing - the advantage of my earlier version was that
it didn't enforce any "chunking" on the data path, but it made the code more complicated.

Here's an updated version of my trunk patch which takes an approach more similar to Liu Lei's.
The patch should be smaller, and seems to still work.
> DataNode reads checksums even if client does not need them
> ----------------------------------------------------------
>                 Key: HDFS-3429
>                 URL: https://issues.apache.org/jira/browse/HDFS-3429
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, performance
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3429-0.20.2.patch, hdfs-3429.txt, hdfs-3429.txt, hdfs-3429.txt
> Currently, even if the client does not want to verify checksums, the datanode reads them
anyway and sends them over the wire. This means that performance improvements like HBase's
application-level checksums don't have much benefit when reading through the datanode, since
the DN is still causing seeks into the checksum file.
> (Credit goes to Dhruba for discovering this - filing on his behalf)

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message