hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1034) Enhance datanode to read data and checksum file in parallel
Date Fri, 12 Mar 2010 00:00:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844285#action_12844285 ]

Todd Lipcon commented on HDFS-1034:

In practice I don't imagine the extra disk seek for checksums is a problem for HBase: since
the checksum file is relatively small, my guess is that it stays hot in the Linux buffer cache
and therefore doesn't incur any disk access. It would certainly be interesting to run blktrace
on a heavily loaded HBase datanode to see if this is true, though!

> Enhance datanode to read data and checksum file in parallel
> -----------------------------------------------------------
>                 Key: HDFS-1034
>                 URL: https://issues.apache.org/jira/browse/HDFS-1034
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
> In the current HDFS implementation, a read of a block issued to the datanode results
> in a disk access to the checksum file followed by a disk access to the data file. It would
> be nice to be able to do these two IOs in parallel to reduce read latency.
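
For illustration only, the idea of issuing the data-file read and the checksum-file read concurrently, rather than one after the other, might be sketched as below. This is not the datanode's actual code: the class ParallelBlockRead, the holder type BlockAndChecksum, and the method readInParallel are hypothetical names, and reading each file in full is an assumption made to keep the sketch short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical sketch, not HDFS code: read a block's data file and its
// checksum file on two threads so the two disk accesses can overlap.
public class ParallelBlockRead {

    /** Hypothetical holder for the bytes read from the two files. */
    public static final class BlockAndChecksum {
        public final byte[] data;
        public final byte[] checksum;

        BlockAndChecksum(byte[] data, byte[] checksum) {
            this.data = data;
            this.checksum = checksum;
        }
    }

    /** Submits both reads to the pool, then waits for both to finish. */
    public static BlockAndChecksum readInParallel(
            Path dataFile, Path checksumFile, ExecutorService pool) {
        CompletableFuture<byte[]> data =
                CompletableFuture.supplyAsync(() -> readAll(dataFile), pool);
        CompletableFuture<byte[]> checksum =
                CompletableFuture.supplyAsync(() -> readAll(checksumFile), pool);
        // join() blocks until each read completes; total wait is roughly
        // the slower of the two reads rather than their sum.
        return new BlockAndChecksum(data.join(), checksum.join());
    }

    // Wraps the checked IOException so the method can be used in a lambda.
    private static byte[] readAll(Path file) {
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whether this helps in practice depends on whether the checksum file is already cached, as the comment above points out; if it is, the second read costs no disk access and there is little latency to hide.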

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
