hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Date Fri, 20 Nov 2015 01:58:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015029#comment-15015029 ]

Uma Maheswara Rao G commented on HDFS-9079:

Thanks Zhe.
3 is sufficient with the default EC policy. Considering other possible policies, like 10+4,
we can be a little conservative.
OK, got it.

I think whenever the queued BR is processed in the original logic, the same processing will happen
after the change. Will check the code to make sure.
OK. We can discuss more once you get a chance to look at the code piece I posted in my earlier

> Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
> ------------------------------------------------------------------------------------------------
>                 Key: HDFS-9079
>                 URL: https://issues.apache.org/jira/browse/HDFS-9079
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-9079-HDFS-7285.00.patch, HDFS-9079.01.patch, HDFS-9079.02.patch,
HDFS-9079.03.patch, HDFS-9079.04.patch, HDFS-9079.05.patch, HDFS-9079.06.patch, HDFS-9079.07.patch,
HDFS-9079.08.patch, HDFS-9079.09.patch, HDFS-9079.10.patch, HDFS-9079.11.patch, HDFS-9079.12.patch
> A non-striped DataStreamer goes through the following steps in error handling:
> {code}
> 1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4) Applies
new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6) Updates block on NN
> {code}
> With multiple streamer threads running in parallel, we need to correctly handle a large number
of possible combinations of interleaved thread events. For example, {{streamer_B}} starts
step 2 in between events {{streamer_A.2}} and {{streamer_A.3}}.
> HDFS-9040 moves steps 1, 2, 3, 6 from streamer to {{DFSStripedOutputStream}}. This JIRA
proposes some further optimizations based on HDFS-9040:
> # We can preallocate GS when NN creates a new striped block group ({{FSN#createNewBlock}}).
For each new striped block group we can reserve {{NUM_PARITY_BLOCKS}} GS's. If more than {{NUM_PARITY_BLOCKS}}
errors have happened we shouldn't try to further recover anyway.
> # We can use a dedicated event processor to offload the error handling logic from {{DFSStripedOutputStream}},
which is not a long running daemon.
> # We can limit the lifespan of a streamer to be a single block. A streamer ends either
after finishing the current block or when encountering a DN failure.
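
(Editorial sketch, not part of the patch.) The preallocation idea in the first item above could look roughly like the following. This is a minimal illustration under stated assumptions: `GenStampReservation`, `Reserved`, `createNewBlockGroup` and the starting counter value are hypothetical names and values chosen for the example, not the actual `FSN#createNewBlock` code. It reserves one initial GS plus `numParityBlocks` spare stamps in a single atomic step, and refuses further recovery once the spares are used up.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the NameNode side of generation stamp allocation.
class GenStampReservation {
    // Assumed global counter; the starting value 1000 is arbitrary.
    private final AtomicLong globalGenStamp = new AtomicLong(1000);

    /** A contiguous range of generation stamps reserved for one block group. */
    static final class Reserved {
        private final long first; // GS assigned when the block group is created
        private final int spare;  // extra stamps available for error recovery
        private int used = 0;

        Reserved(long first, int spare) {
            this.first = first;
            this.spare = spare;
        }

        long initial() {
            return first;
        }

        /** Returns the next reserved GS, or -1 once the reserve is exhausted. */
        synchronized long nextForRecovery() {
            if (used >= spare) {
                return -1; // more failures than parity blocks: stop recovering
            }
            used++;
            return first + used;
        }
    }

    /** Reserves 1 + numParityBlocks consecutive stamps in one atomic step. */
    Reserved createNewBlockGroup(int numParityBlocks) {
        long first = globalGenStamp.getAndAdd(1L + numParityBlocks);
        return new Reserved(first, numParityBlocks);
    }

    public static void main(String[] args) {
        GenStampReservation nn = new GenStampReservation();
        Reserved r = nn.createNewBlockGroup(3);
        System.out.println("initial GS: " + r.initial());
        System.out.println("first recovery GS: " + r.nextForRecovery());
    }
}
```

With a reservation like this, the client side would not need a round trip to the NN to obtain a new GS during recovery; a second block group created afterwards starts past the reserved range.
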
> With the proposed change, a {{StripedDataStreamer}}'s flow becomes:
> {code}
> 1) Finds DN error => 2) Notify coordinator (async, not waiting for response) =>
> 1) Finds external error => 2) Applies new GS to DN (createBlockOutputStream) =>
3) Ack from DN => 4) Notify coordinator (async, not waiting for response)
> {code}
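
(Editorial sketch, not part of the patch.) The "notify coordinator (async, not waiting for response)" step, together with serialized handling of updates from several streamers, can be illustrated with a single-threaded event processor. The class and method names below (`StreamerCoordinator`, `notifyEvent`, `shutdownAndDrain`) are hypothetical and only assume the behaviour described in the issue text: streamers enqueue an event and return immediately, and one dedicated thread handles events one at a time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical coordinator: serializes events coming from many streamer threads.
class StreamerCoordinator {
    // One worker thread, so events are handled strictly one at a time.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Record of handled events; synchronized so callers can read it safely.
    private final List<String> handled =
        Collections.synchronizedList(new ArrayList<>());

    /** Called by a streamer; returns immediately without waiting for a response. */
    void notifyEvent(int streamerIndex, String event) {
        worker.execute(() -> handled.add("streamer_" + streamerIndex + ": " + event));
    }

    /** Stops accepting events, waits for queued ones, and returns the handled log. */
    List<String> shutdownAndDrain() {
        worker.shutdown();
        try {
            if (!worker.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("coordinator did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return handled;
    }

    public static void main(String[] args) {
        StreamerCoordinator coordinator = new StreamerCoordinator();
        coordinator.notifyEvent(0, "DN error");
        coordinator.notifyEvent(1, "new GS applied, ack from DN");
        for (String line : coordinator.shutdownAndDrain()) {
            System.out.println(line);
        }
    }
}
```

Because every event passes through one queue, the interleavings between streamers reduce to a single ordered sequence on the coordinator side, which is the property the proposal is after.
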

This message was sent by Atlassian JIRA
