hadoop-hdfs-issues mailing list archives

From "Walter Su (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8033) Erasure coding: stateful (non-positional) read from files in striped layout
Date Thu, 23 Apr 2015 12:05:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508937#comment-14508937 ]

Walter Su commented on HDFS-8033:
---------------------------------

bq. the size is most likely 64K bytes (BlockSender#MIN_BUFFER_WITH_TRANSFERTO). I had an offline
discussion with Colin Patrick McCabe about this behavior. Given our default cell size (128K
or 256K), if we inherit the behavior from DFSInputStream and return 64K at a time, in most
cases we won't cross a cell boundary in a single read() anyway.
Assume a 128K cell size. You pass len == 64k and call read() twice. You get 2 * 64k = 128k.
That's OK.
Now assume MIN_BUFFER_WITH_TRANSFERTO == 96k, and you use a ByteBuffer. You pass len == 64k and
call read() twice. You want 128k, but you get 2 * 96k = 192k.

A user is unlikely to configure "io.file.buffer.size" == 4099, but may well configure
"io.file.buffer.size" == 4k * 3 == 12k, or 4k * 5 == 20k. (Notice that 128k % 12k != 0.)
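
The arithmetic above can be checked with a small sketch. It is not HDFS code: the class and method names are hypothetical, and it assumes a simplified model in which every read() starts where the previous one ended (beginning at offset 0) and returns exactly one fixed-size chunk. It reports which read, if any, is the first to span a cell boundary.

```java
// Hypothetical helper, not part of HDFS. Models sequential reads that each
// return exactly `chunk` bytes, starting at file offset 0.
class CellBoundaryCheck {

    // True if the byte range [pos, pos + n) touches more than one cell.
    static boolean crossesCell(long pos, long n, long cellSize) {
        return pos / cellSize != (pos + n - 1) / cellSize;
    }

    // Zero-based index of the first read that spans a cell boundary,
    // or -1 if none of the first maxReads reads does.
    static int firstCrossingRead(long chunk, long cellSize, int maxReads) {
        long pos = 0;
        for (int i = 0; i < maxReads; i++) {
            if (crossesCell(pos, chunk, cellSize)) {
                return i;
            }
            pos += chunk;
        }
        return -1;
    }

    public static void main(String[] args) {
        long k = 1024;
        long cell = 128 * k; // cell size assumed in the comment above
        for (long chunk : new long[] {64 * k, 96 * k, 12 * k, 20 * k}) {
            System.out.println(chunk / k + "k chunk: first crossing read index = "
                    + firstCrossingRead(chunk, cell, 1000));
        }
    }
}
```

Under this model a 64k chunk never crosses a 128k cell boundary, a 96k chunk crosses on the second read, a 12k chunk on the 11th read, and a 20k chunk on the 7th, because 128k is a multiple of 64k but not of 96k, 12k, or 20k.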

> Erasure coding: stateful (non-positional) read from files in striped layout
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-8033
>                 URL: https://issues.apache.org/jira/browse/HDFS-8033
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8033.000.patch, HDFS-8033.001.patch, HDFS-8033.002.patch, HDFS-8033.003.patch, hdfs8033-HDFS-7285.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
