hadoop-hdfs-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8281) Erasure Coding: implement parallel stateful reading for striped layout
Date Tue, 05 May 2015 00:27:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527629#comment-14527629 ]

Kai Zheng commented on HDFS-8281:
---------------------------------

Thanks for the great work and discussion here!
bq. One question is why we choose 256KB as the cell size instead of the original 64KB?
bq. Kai maybe you can remind us the reason?
Originally, 64KB was used as an HDFS-side constant for the striping cell size, while 256KB was
used in ECSchema in the codec framework as the erasure coding chunk size. The two evolved independently.
When we applied ECSchema to replace the hard-coded values, 256KB was used to make
all the places consistent. In my view, 64KB or smaller may be better for striping small
files, while 256KB or larger may be better for erasure coding of big files. We have test records
indicating that with a larger chunk size, such as 32MB, native coders can outperform by a wide
margin. Though it's a little hard to choose a good default value, we support configurable schemas,
and the chunk size is configurable as part of a schema, so we may not need to worry about that too much.
What do you think? Thanks.
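
As a rough illustration of the small-file side of this trade-off (a sketch only, not code from the
patch or from ECSchema): the class and method names below are hypothetical, and the count of 6 data
blocks per group is an assumption made for the example. It maps logical offsets to data blocks in a
round-robin striped layout and counts how many data blocks a file of a given length touches.

```java
// Hypothetical sketch; names and the 6-data-block layout are assumptions, not HDFS APIs.
class StripeCellSketch {

    /** Index (0..dataBlocks-1) of the data block holding the given logical offset. */
    static int blockIndexOf(long offset, int cellSize, int dataBlocks) {
        long cellNumber = offset / cellSize;      // which cell the offset falls in
        return (int) (cellNumber % dataBlocks);   // cells are laid out round-robin
    }

    /** Number of distinct data blocks that hold data for a file of the given length. */
    static int blocksTouched(long fileLength, int cellSize, int dataBlocks) {
        if (fileLength <= 0) {
            return 0;
        }
        long cells = (fileLength + cellSize - 1) / cellSize; // ceiling division
        return (int) Math.min(cells, dataBlocks);
    }

    public static void main(String[] args) {
        int dataBlocks = 6;            // assumed number of data blocks per block group
        long fileLength = 200 * 1024;  // a 200KB "small" file
        for (int cellKb : new int[] {64, 256}) {
            int touched = blocksTouched(fileLength, cellKb * 1024, dataBlocks);
            System.out.println(cellKb + "KB cells: file spans " + touched + " data block(s)");
        }
    }
}
```

Under these assumptions, a 200KB file is spread over 4 data blocks with 64KB cells but sits entirely
in 1 data block with 256KB cells, so the smaller cell leaves more room for parallel I/O on small files.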

> Erasure Coding: implement parallel stateful reading for striped layout
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8281
>                 URL: https://issues.apache.org/jira/browse/HDFS-8281
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>             Fix For: HDFS-7285
>
>         Attachments: HDFS-8281-HDFS-7285.001.patch, HDFS-8281-HDFS-7285.001.patch, HDFS-8281-HDFS-7285.002.patch, HDFS-8281.000.patch
>
>
> This jira aims to support parallel reading for stateful read in {{DFSStripedInputStream}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
