hadoop-hdfs-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
Date Thu, 25 Feb 2016 00:45:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166441#comment-15166441 ]

Kai Zheng commented on HDFS-7661:

I did a quick reading of the v2 design doc. Some comments and questions:
* Overall, I'm not sure why introducing a new meta file {{.bgLen}} for striped parity blocks
is better than augmenting the existing block meta file. A new meta file has to stay with
the block wherever the block is moved, replicated, or reconstructed. Also, why keep it only
for parity blocks? It may not be bad to keep it for all the BG (block group) blocks.
* We may need to document {{offsetInBlock, packetLen, blockGroupLen}} well and explain why we
need them. The names could be refined. Otherwise someone may wonder why such intermediate variables
need to be persisted as part of the metadata.
* bq. Consider the default EC policy whose cell size 65536B (64KB), and the DFSPacket data
size is 64512B(63KB)
The assumption isn't good, because I don't think it's a good idea to have a cell size and a packet data size
like this, where neither is a multiple of the other. It's hard to align the buffer addresses for erasure
encoding and checksum computing ({{both are performance critical}}) without copying buffer data. We should
ensure that either the cell size or the packet size divides evenly into the other or, for simplicity, that they're equal.
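
To make the misalignment concrete, here is a minimal arithmetic sketch using the two sizes quoted above. The class and method names ({{CellPacketAlignment}}, {{gcd}}, {{alignmentPeriod}}, {{nests}}) are hypothetical and are not part of HDFS; the sketch only shows how often a cell boundary and a packet boundary would coincide under the quoted assumption.

```java
public class CellPacketAlignment {
    // Greatest common divisor via Euclid's algorithm.
    public static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Least common multiple of the two sizes: the distance in bytes between
    // consecutive offsets where a cell boundary and a packet boundary coincide.
    public static long alignmentPeriod(long cellSize, long packetDataSize) {
        return cellSize / gcd(cellSize, packetDataSize) * packetDataSize;
    }

    // True when one size is an exact multiple of the other (including equality).
    public static boolean nests(long cellSize, long packetDataSize) {
        return cellSize % packetDataSize == 0 || packetDataSize % cellSize == 0;
    }

    public static void main(String[] args) {
        long cellSize = 65536;        // 64 KB, as quoted from the design doc
        long packetDataSize = 64512;  // 63 KB, as quoted from the design doc

        long period = alignmentPeriod(cellSize, packetDataSize);
        System.out.println("nests: " + nests(cellSize, packetDataSize));
        System.out.println("boundaries coincide every " + period + " bytes = "
            + (period / cellSize) + " cells = "
            + (period / packetDataSize) + " packets");
    }
}
```

With these sizes the two boundaries line up only once every 4128768 bytes (63 cells, or 64 packets), so almost every cell straddles a packet boundary. If the sizes were equal, or one were an exact multiple of the other, every boundary of the larger unit would also be a boundary of the smaller one.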

I may have more comments in the coming days. Thanks for addressing or clarifying these.

> Erasure coding: support hflush and hsync
> ----------------------------------------
>                 Key: HDFS-7661
>                 URL: https://issues.apache.org/jira/browse/HDFS-7661
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: GAO Rui
>         Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, HDFS-7661-unitTest-wip-trunk.patch,
HDFS-7661-wip.01.patch, HDFS-EC-file-flush-sync-design-version1.1.pdf, HDFS-EC-file-flush-sync-design-version2.0.pdf
> We also need to support hflush/hsync and visible length. 

This message was sent by Atlassian JIRA
