hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8033) Erasure coding: stateful (non-positional) read from files in striped layout
Date Tue, 21 Apr 2015 23:16:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505991#comment-14505991 ]

Zhe Zhang commented on HDFS-8033:

Thanks for the helpful comments, Yi and Jing.

bq. In DFSStripedInputStream we override readBuffer, but we only read in one striped block,
so the returned result should be something like (cell_0, cell_3, ....) and it only contains
part of the expected data
{{DFSStripedInputStream#readBuffer}} does switch the {{blockReader}}: after reading cell_0,
it switches to the next {{blockReader}} and reads cell_1.
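
To illustrate the cell-to-reader switching described above, here is a minimal sketch of how an offset in a striped block group could map to a cell and to the internal block that stores it. This is not the code in the patch. The class and method names are hypothetical, and the values of 3 data blocks and 128K cells are assumptions, chosen so that internal block 0 holds (cell_0, cell_3, ...) as in the quoted comment.

```java
// Hypothetical sketch; names and parameters are assumptions, not the HDFS-8033 patch.
class StripedCellMapping {

    /** Index of the cell that contains the given offset within a block group. */
    static long cellIndex(long offsetInGroup, int cellSize) {
        return offsetInGroup / cellSize;
    }

    /** Index of the internal data block (and thus the block reader) storing that cell. */
    static int internalBlockIndex(long offsetInGroup, int cellSize, int numDataBlocks) {
        return (int) (cellIndex(offsetInGroup, cellSize) % numDataBlocks);
    }

    /** Offset inside the internal block at which the byte is stored. */
    static long offsetInInternalBlock(long offsetInGroup, int cellSize, int numDataBlocks) {
        long stripe = cellIndex(offsetInGroup, cellSize) / numDataBlocks;
        return stripe * cellSize + offsetInGroup % cellSize;
    }

    public static void main(String[] args) {
        int cellSize = 128 * 1024; // assumed cell size
        int numDataBlocks = 3;     // assumed number of data blocks
        for (long cell = 0; cell < 6; cell++) {
            long offset = cell * cellSize;
            System.out.printf("cell_%d -> internal block %d, block offset %d%n",
                cell,
                internalBlockIndex(offset, cellSize, numDataBlocks),
                offsetInInternalBlock(offset, cellSize, numDataBlocks));
        }
    }
}
```

Under these assumptions, a sequential read moves to a different internal block at every cell boundary, which is why a stateful read has to switch readers between cell_0 and cell_1.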

It's very helpful that you brought up the _short read_ issue. In the current {{DFSInputStream}},
a stateful read calls {{blockReader.read()}} once, which returns all remaining data in the {{blockReader}}'s
buffer; the size is most likely 64K bytes ({{BlockSender#MIN_BUFFER_WITH_TRANSFERTO}}). I had
an offline discussion with [~cmccabe] about this behavior. It seems the rationale is to return
as fast as possible with all cached data. Given our default cell size (128K or 256K), if we
inherit the behavior from {{DFSInputStream}} and return 64K at a time, in most cases we won't
cross a cell boundary in a single {{read()}} anyway. So I didn't add the logic of reading across
a cell boundary in the patch. It's not too hard to add though, once we make a decision. But
I think we should keep the behavior of trying to return with buffered data (instead of trying
to read up to the request length).
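
The arithmetic behind the short-read argument can be sketched as follows. This is only an illustrative model, not the patch: {{readLength}} is a hypothetical helper that assumes a single {{read()}} returns at most the buffered bytes and never crosses a cell boundary. The 64K buffer and 128K cell size are the values mentioned above.

```java
// Hypothetical model of a short read; not the actual DFSInputStream logic.
class ShortReadSketch {

    /**
     * Bytes returned by one read() call under the assumed policy: no more than
     * the caller requested, no more than what is buffered, and no more than
     * what remains in the current cell.
     */
    static int readLength(long pos, int requested, int buffered, int cellSize) {
        long remainingInCell = cellSize - (pos % cellSize);
        return (int) Math.min(requested, Math.min(buffered, remainingInCell));
    }

    public static void main(String[] args) {
        int cellSize = 128 * 1024; // default cell size mentioned in the comment
        int buffered = 64 * 1024;  // assumed buffered data per read()
        int requested = 1 << 20;   // caller asks for 1 MB
        long pos = 0;
        // Starting at a cell boundary, each read returns one 64K chunk,
        // so two reads consume exactly one cell.
        for (int i = 0; i < 4; i++) {
            int n = readLength(pos, requested, buffered, cellSize);
            System.out.printf("read #%d at pos %d returns %d bytes%n", i, pos, n);
            pos += n;
        }
    }
}
```

In this model the cell-boundary cap only takes effect when the read position is not aligned to the buffer size, which matches the "in most cases" qualifier above.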

bq. In blockSeekTo, we need to handle refetchToken and refetchEncryptionKey. And for other
IOException, we can throw it.
Good point. Since each EC internal block has only 1 destination DN, we won't need the _while_
loop to count retries. We can retry on different internal blocks.

bq. For the test, do stateful read: read once and fully read (please make the data size large
than groupSize * cellSize), as I said in #1,
I will test reading data that spans multiple {{BLOCK_GROUP_SIZE}} units to verify that {{blockSeekTo}}
switches between block groups correctly.

bq. connectFailedOnce in blockSeekTo is not necessary.
I agree, will remove it.

bq. Why you modify SimulatedFSDataset?
Once HDFS-8191 is in, that won't be needed.

> Erasure coding: stateful (non-positional) read from files in striped layout
> ---------------------------------------------------------------------------
>                 Key: HDFS-8033
>                 URL: https://issues.apache.org/jira/browse/HDFS-8033
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8033.000.patch, HDFS-8033.001.patch

This message was sent by Atlassian JIRA
