hadoop-hdfs-issues mailing list archives

From "Walter Su (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8760) Erasure Coding: reuse BlockReader when reading the same block in pread
Date Mon, 20 Jul 2015 12:15:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633492#comment-14633492 ]

Walter Su commented on HDFS-8760:

LGTM. +1 after addressing one minor issue (related):
updateReadStatistics(..) is called twice, in readCells(..) and ByteBufferStrategy.doRead(..)

I found other issues while reviewing the patch (not related):
1. Some util functions are statically imported from StripedBlockUtil, while others are called by
2. DFSStripedInputStream.read(ByteBuffer) is identical to the one in the superclass.
3. In StripeReader / readStripe(..), the "stripe" means an AlignedStripe, which may span many real stripes.
This needs some javadoc.
4. Suppose {{buf}} is the buffer given by the user. Pread() makes the blockReader put data directly
into {{buf}}. Stateful read() needs the blockReader to put data into curStripeBuf, then copies curStripeBuf
to {{buf}}. curStripeBuf is useful when the user calls read()/read(small buf) frequently, especially
when there are bad DNs. I think if buf.size > curStripeBuf.size, we can write data directly
to buf without going through curStripeBuf.
Maybe the copy is fine. But why is it a DirectByteBuffer? I don't know how it helps decoding,
but it is bad to copy data from heap to native memory and then from native memory back to heap
when there is no need to decode.
This needs further digging.
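
To make point 4 concrete, here is a minimal sketch of the idea, not the actual DFSStripedInputStream code. The class StripeCopySketch, the Source interface (a stand-in for a block reader), and the stagedCopies counter are all hypothetical names. Only the buf.size > curStripeBuf.size condition comes from the comment above.

```java
import java.nio.ByteBuffer;

public class StripeCopySketch {
    /** Hypothetical stand-in for a block reader: writes bytes into dst, returns the count. */
    public interface Source {
        int read(ByteBuffer dst);
    }

    /** Counts how often data went through the staging buffer (for illustration only). */
    public static int stagedCopies = 0;

    /** Test source that fills dst completely with the values 0, 1, 2, ... */
    public static Source sequential() {
        return dst -> {
            int n = dst.remaining();
            for (int i = 0; i < n; i++) {
                dst.put((byte) i);
            }
            return n;
        };
    }

    /**
     * If the user's buffer is larger than the stripe buffer, read straight into it.
     * Otherwise fill the stripe buffer first and copy out what fits.
     */
    public static int read(Source src, ByteBuffer buf, ByteBuffer curStripeBuf) {
        if (buf.remaining() > curStripeBuf.capacity()) {
            return src.read(buf); // direct path: no intermediate copy
        }
        curStripeBuf.clear();
        src.read(curStripeBuf);
        curStripeBuf.flip();
        int n = Math.min(buf.remaining(), curStripeBuf.remaining());
        ByteBuffer view = curStripeBuf.duplicate();
        view.limit(view.position() + n);
        buf.put(view); // staged path: one extra copy (native to heap if direct)
        stagedCopies++;
        return n;
    }
}
```

In this sketch the bytes left in curStripeBuf after a small read are simply dropped. A real stateful read would keep them for the next call, which is the case where the staging buffer pays off.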

> Erasure Coding: reuse BlockReader when reading the same block in pread
> ----------------------------------------------------------------------
>                 Key: HDFS-8760
>                 URL: https://issues.apache.org/jira/browse/HDFS-8760
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-8760.000.patch
> Currently in pread, we create a new block reader for each aligned stripe even though
these stripes belong to the same block. It's better to reuse them to avoid unnecessary block
reader creation overhead. This can also avoid reading from the same bad DataNode.
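
The reuse described above could look roughly like the following sketch. It is an illustration under assumptions, not the HDFS-8760 patch: ReaderCacheSketch, readerFor, and createdCount are hypothetical names, and the reader type is left generic. It does not model skipping a known-bad DataNode.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

public class ReaderCacheSketch<R> {
    // One reader per internal block index, kept for the lifetime of one pread call.
    private final Map<Integer, R> readers = new HashMap<>();
    private final IntFunction<R> factory;
    private int created = 0;

    public ReaderCacheSketch(IntFunction<R> factory) {
        this.factory = factory;
    }

    /** Returns the cached reader for this block, creating it only on first use. */
    public R readerFor(int blockIndex) {
        return readers.computeIfAbsent(blockIndex, i -> {
            created++;
            return factory.apply(i);
        });
    }

    /** Number of readers actually created so far. */
    public int createdCount() {
        return created;
    }
}
```

With this shape, each aligned stripe that touches the same block asks the cache for its reader, so the creation cost is paid once per block instead of once per stripe.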

This message was sent by Atlassian JIRA
