hadoop-hdfs-issues mailing list archives

From "GAO Rui (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7661) [umbrella] support hflush and hsync for erasure coded files
Date Wed, 06 Apr 2016 06:28:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227810#comment-15227810 ]

GAO Rui commented on HDFS-7661:
-------------------------------

Hi [~zhz], [~liuml07] and I have discussed the {{two version cells undo log design}}. I
have attached an illustration: [^Undo-Log-Design-20160406.jpg].

In this design, the Undo Log consists of three parts. The first and second parts store the
latest flushed parity cell and the currently flushed parity cell. The length of these two
parts depends on the EC policy: each part must be long enough to store a full cell and its
checksum. The third part of the Undo Log is a list of flush records, the same as described in
the design document, with the only addition being a pointer to the latest successfully flushed
cell. This list can be appended to as many times as needed (generally, this should not make
the Undo Log too big).
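
To make the three-part layout concrete, here is a minimal sketch of how the offsets could be computed. The class name, the method names and the checksum length parameter are assumptions made for illustration only; they are not taken from the patch or the design document, and the record format is deliberately left out.

```java
/** Hypothetical layout helper for the three-part Undo Log; not HDFS code. */
class UndoLogLayout {
    private final long cellSize;      // cell size given by the EC policy
    private final long checksumSize;  // assumed: bytes of checksum stored with one cell

    UndoLogLayout(long cellSize, long checksumSize) {
        this.cellSize = cellSize;
        this.checksumSize = checksumSize;
    }

    /** Length of one parity cell part: a full cell plus its checksum. */
    long slotLength() {
        return cellSize + checksumSize;
    }

    /** Offset of the first (index 0) or second (index 1) parity cell part. */
    long slotOffset(int index) {
        if (index != 0 && index != 1) {
            throw new IllegalArgumentException("only two parity cell parts exist");
        }
        return index * slotLength();
    }

    /** Offset where the third part (the append-only record list) starts. */
    long recordListOffset() {
        return 2 * slotLength();
    }
}
```

Because the first two parts have a fixed length, the record list can grow by appending without moving the parity cell parts.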

With the third part of the Undo Log, a two-phase commit mechanism can be used to keep the
data safe.
For example:
   1. The last successfully flushed cell is stored in parity-cell-1 (the second part of the
Undo Log).
   2. The current flush happens.
   3. The first part of the Undo Log file is updated with the currently flushed parity cell.
   4. A new record is added to the third part (the record list) of the Undo Log file.

If a failure happens during step 3 (updating the first part of the Undo Log file), we still
have the latest successfully flushed parity cell in the second part of the Undo Log, and the
last record of the record list still points to the second part.
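
The steps above can be sketched as a small in-memory model. This is only an illustration under assumptions: the class and method names are hypothetical, a record is reduced to a single slot index, and the rule that each flush writes to the part the last record does not point to is my reading of the example, not something taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory model of the two-part-plus-record-list Undo Log; not HDFS code. */
class UndoLogSketch {
    // Parts one and two: the two parity cell slots (index 0 and 1).
    private final byte[][] slots = new byte[2][];
    // Part three: append-only record list; each record names the slot it commits.
    private final List<Integer> records = new ArrayList<>();

    /** Slot named by the last record, or -1 if nothing has been committed yet. */
    int committedSlot() {
        return records.isEmpty() ? -1 : records.get(records.size() - 1);
    }

    /**
     * Phase 1 overwrites the slot that the last record does NOT point to.
     * Phase 2 appends a record pointing to that slot.
     * failBeforeRecord simulates a failure after phase 1 has touched the slot
     * but before the record is appended.
     */
    void flush(byte[] parityCell, boolean failBeforeRecord) {
        int target = (committedSlot() == 0) ? 1 : 0;
        slots[target] = parityCell.clone();   // phase 1: update the spare slot
        if (failBeforeRecord) {
            return;                           // simulated failure: no record written
        }
        records.add(target);                  // phase 2: commit by appending a record
    }

    /** Returns the last committed parity cell, or null if there is none. */
    byte[] recover() {
        int slot = committedSlot();
        return slot < 0 ? null : slots[slot].clone();
    }
}
```

In this model, a flush that fails before its record is appended leaves `recover()` returning the previously committed cell, because the last record still points to the untouched slot.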

> [umbrella] support hflush and hsync for erasure coded files
> -----------------------------------------------------------
>
>                 Key: HDFS-7661
>                 URL: https://issues.apache.org/jira/browse/HDFS-7661
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: erasure-coding
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: GAO Rui
>         Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, HDFS-7661-unitTest-wip-trunk.patch,
HDFS-7661-wip.01.patch, HDFS-EC-file-flush-sync-design-v20160323.pdf, HDFS-EC-file-flush-sync-design-version1.1.pdf,
Undo-Log-Design-20160406.jpg
>
>
> We also need to support hflush/hsync and visible length. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
