hadoop-hdfs-issues mailing list archives

From "Walter Su (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8033) Erasure coding: stateful (non-positional) read from files in striped layout
Date Thu, 23 Apr 2015 07:21:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508580#comment-14508580
] 

Walter Su commented on HDFS-8033:
---------------------------------

>... ByteBufferStrategy.doRead() ignores the len argument. It always reads byteBuffer.remaining,
until EOF of the current block.
To correct myself:
ByteBufferStrategy.doRead() ignores the len argument. It always reads up to byteBuffer.remaining
bytes, stopping at the end of the current *packet*.

I read {{BlockSender.doSendBlock()}} and found that the packet size depends on "io.file.buffer.size"
and BlockSender.MIN_BUFFER_WITH_TRANSFERTO. If we read the block locally, the data part of each
packet is "io.file.buffer.size" bytes (default 4096).

HdfsConstants.BLOCK_STRIPED_CELL_SIZE = 256 * 1024;
The good thing is that cellSize % packetSize == 0: 256 * 1024 / 4096 == 64, so we call {{ByteBufferStrategy.doRead()}}
64 times and read exactly one cell.
What if cellSize % packetSize != 0? Then the read will be wrong.

Try setting "io.file.buffer.size" to 4099. The test case will fail (as it will for any other value
where cellSize % packetSize != 0).
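
To illustrate the failure mode, here is a minimal sketch in plain Java. It is a simplified model, not the actual HDFS read path: it assumes every read call consumes one whole packet (the len argument being ignored), that the read starts on a packet boundary, and that the destination buffer has room for more than one cell. The class and method names are hypothetical.

```java
// Simplified model of reading one striped cell packet by packet.
// Assumption: each read call runs to the end of the current packet,
// regardless of how many bytes the caller asked for.
final class CellReadSketch {

    /** Total bytes consumed once at least cellSize bytes have been read. */
    static long bytesConsumedForOneCell(int cellSize, int packetSize) {
        long consumed = 0;
        while (consumed < cellSize) {
            consumed += packetSize; // one call reads to the packet boundary
        }
        return consumed;
    }

    /** Bytes read past the end of the cell; 0 means the read stopped exactly on the cell boundary. */
    static long overshoot(int cellSize, int packetSize) {
        return bytesConsumedForOneCell(cellSize, packetSize) - cellSize;
    }

    public static void main(String[] args) {
        int cellSize = 256 * 1024;
        for (int packetSize : new int[] {4096, 4099}) {
            System.out.println("packetSize=" + packetSize
                + " overshoot=" + overshoot(cellSize, packetSize));
        }
    }
}
```

Under these assumptions a packet size of 4096 lands exactly on the cell boundary, while 4099 runs past it, so bytes belonging to the next cell would end up in the current one.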

Your implementation for ByteBuffer works for now, but we have to make sure that:
cellSize % ("io.file.buffer.size") == 0 (for local read)
cellSize % (BlockSender.MIN_BUFFER_WITH_TRANSFERTO) == 0 (for remote read)
*If we choose another value for cellSize, we should be careful. Otherwise read(ByteBuffer)
won't work.*
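
One way to make the two conditions above explicit is a small precondition check, sketched below in plain Java. This is only an illustration: the class, the method, and the 64 * 1024 value used for the remote packet size are assumptions for the example, not code or constants taken from the patch.

```java
// Hypothetical guard that rejects a cell size that is not a multiple
// of the packet data size used on a given read path.
final class CellSizeGuard {

    static void requireCellAligned(int cellSize, int packetDataSize, String source) {
        if (cellSize <= 0 || packetDataSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        if (cellSize % packetDataSize != 0) {
            throw new IllegalArgumentException("cellSize " + cellSize
                + " is not a multiple of " + source + " (" + packetDataSize + ")");
        }
    }

    public static void main(String[] args) {
        int cellSize = 256 * 1024;
        int ioFileBufferSize = 4096;              // default quoted in the comment above
        int minBufferWithTransferTo = 64 * 1024;  // assumed value, for illustration only

        requireCellAligned(cellSize, ioFileBufferSize, "io.file.buffer.size");
        requireCellAligned(cellSize, minBufferWithTransferTo,
            "BlockSender.MIN_BUFFER_WITH_TRANSFERTO");
        System.out.println("cellSize " + cellSize + " is aligned for both read paths");
    }
}
```

A check like this would fail fast at configuration time instead of producing wrong data during a read.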

> Erasure coding: stateful (non-positional) read from files in striped layout
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-8033
>                 URL: https://issues.apache.org/jira/browse/HDFS-8033
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8033.000.patch, HDFS-8033.001.patch, HDFS-8033.002.patch, HDFS-8033.003.patch,
hdfs8033-HDFS-7285.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
