hadoop-hdfs-issues mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream
Date Tue, 08 Dec 2009 21:37:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787756#action_12787756 ]

Raghu Angadi commented on HDFS-755:

Thanks for the benchmarks. What does this translate to in terms of user CPU improvement
(say with a 32-byte buffer in DFSClient)?

I think the internal buffer should be small for this patch. It does not matter whether
a user always wraps the stream with another buffer or not; either way they get performance
in line with their read size. The bufferSize passed to FSInputChecker is essentially a hint.

I need to look more into the limit on CHUNKS_PER_READ. I don't see much reason to limit it
in FSInputChecker (within limits): if a user invokes a read with a large buffer, the underlying
FS (DFSClient in this case) should have access to that buffer...
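
To make the sizing argument concrete, here is a rough sketch of how the number of checksum chunks per read could follow the caller's buffer rather than a fixed internal one. This is not the code in the HDFS-755 patch: the class name, the chunksForRead helper, the maxChunks parameter, and the 512-byte chunk size are all assumptions made for illustration.

```java
public class ChunkedReadSketch {
    // Assumed checksum chunk size in bytes; illustrative only.
    static final int BYTES_PER_CHECKSUM = 512;

    /**
     * Hypothetical helper: how many whole checksum chunks to request for a
     * caller buffer of userBufferLen bytes. Always at least one chunk, and
     * never more than maxChunks (which stands in for a CHUNKS_PER_READ-style cap).
     */
    static int chunksForRead(int userBufferLen, int bytesPerChecksum, int maxChunks) {
        int wholeChunksThatFit = userBufferLen / bytesPerChecksum;
        return Math.max(1, Math.min(wholeChunksThatFit, maxChunks));
    }

    public static void main(String[] args) {
        // A tiny 32-byte read still has to fetch one full chunk to verify it.
        System.out.println(chunksForRead(32, BYTES_PER_CHECKSUM, 64));
        // A 4 KB read can take 8 chunks in a single call.
        System.out.println(chunksForRead(4096, BYTES_PER_CHECKSUM, 64));
        // A 1 MB read is held to 64 chunks by the cap...
        System.out.println(chunksForRead(1 << 20, BYTES_PER_CHECKSUM, 64));
        // ...but could take 2048 chunks if the cap were effectively removed.
        System.out.println(chunksForRead(1 << 20, BYTES_PER_CHECKSUM, Integer.MAX_VALUE));
    }
}
```

With a rule like this, small reads cost one chunk regardless of the internal buffer size, and large reads are limited only by whatever cap is chosen, which is the trade-off behind the CHUNKS_PER_READ question.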

> Read multiple checksum chunks at once in DFSInputStream
> -------------------------------------------------------
>                 Key: HDFS-755
>                 URL: https://issues.apache.org/jira/browse/HDFS-755
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: benchmark-8-256.png, benchmark.png, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt
>
> HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple checksum
> chunks in a single call to readChunk. This is the HDFS-side use of that new feature.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
