hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8281) Erasure Coding: implement parallel stateful reading for striped layout
Date Thu, 30 Apr 2015 01:12:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520661#comment-14520661 ]

Jing Zhao commented on HDFS-8281:

Thanks for the comment, Zhe! One scenario where this stripe buffer design suffers is a
read pattern that is always "short read + seek beyond stripe", since the buffered data can be wasted.
However, this scenario is also a big challenge when one or more DataNodes holding data blocks
are not readable, in which case each read may have to re-read a lot of data and decode it.
The implementation would also be much more complicated. Considering that our current main use case
for EC is cold data backup, where reads of EC files should be mainly sequential, I think
having this stripe buffer is a good choice. We can also use a smaller cell size (e.g., 64KB)
to decrease the buffer size.
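
To make the trade-off above concrete, here is a rough sketch of how the stripe buffer size scales with the cell size, and how much of a buffered stripe goes unused under the "short read + seek beyond stripe" pattern. It is not code from the patch: the class and method names are hypothetical, and it assumes the buffer holds exactly one stripe of data cells (cell size times the number of data blocks) with 6 data blocks per stripe, as in an RS(6,3) layout.

```java
// Hypothetical illustration only; not taken from HDFS-8281.000.patch.
class StripeBufferSketch {

    /** Bytes needed to buffer one full stripe of data cells (assumption: parity cells are not buffered). */
    static long stripeBufferBytes(long cellSize, int dataBlocks) {
        return cellSize * dataBlocks;
    }

    /**
     * Fraction of a buffered stripe that is never consumed when the reader
     * takes only bytesRead bytes and then seeks beyond the stripe.
     */
    static double wastedFraction(long bytesRead, long cellSize, int dataBlocks) {
        long stripe = stripeBufferBytes(cellSize, dataBlocks);
        long used = Math.min(bytesRead, stripe);
        return (double) (stripe - used) / stripe;
    }

    public static void main(String[] args) {
        int dataBlocks = 6;          // assumption: RS(6,3) schema
        long shortRead = 4L * 1024;  // assumption: a 4KB read before each seek
        long[] cellSizes = {64L * 1024, 256L * 1024, 1024L * 1024};
        for (long cell : cellSizes) {
            long buf = stripeBufferBytes(cell, dataBlocks);
            double wasted = wastedFraction(shortRead, cell, dataBlocks);
            System.out.printf("cell=%dKB buffer=%dKB wasted=%.3f%n",
                    cell / 1024, buf / 1024, wasted);
        }
    }
}
```

Under these assumptions a 64KB cell gives a 384KB buffer, so a smaller cell size lowers both the memory footprint and the absolute amount of data discarded per seek, while a purely sequential reader consumes the whole buffer and wastes nothing.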

> Erasure Coding: implement parallel stateful reading for striped layout
> ----------------------------------------------------------------------
>                 Key: HDFS-8281
>                 URL: https://issues.apache.org/jira/browse/HDFS-8281
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-8281.000.patch
> This jira aims to support parallel reading for stateful read in {{DFSStripedInputStream}}.

