hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9098) Erasure coding: emulate race conditions among striped streamers in write pipeline
Date Wed, 21 Oct 2015 21:32:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967978#comment-14967978 ]

Jing Zhao commented on HDFS-9098:
---------------------------------

Thanks for working on this, Zhe! The proposed approach looks pretty good to me. Some early
comments:
# As you mentioned, we need both sync points and a fault injector. The following code, which
sets the error state, can therefore be replaced by the fault injector.
{code}
if (DFSClientSyncPointInjector.getInstance().
    syncPoint(SyncPointType.DN_FAILURE)) {
  getErrorState().setInternalError();
  getErrorState().markFirstNodeIfNotMarked();
}
{code}
# We can also consider mapping sync events to different fault injectors. In the above example,
the DN failure can be either an instant failure or a timeout failure. Then we could let streamer#1
hit an instant failure while streamer#2 hits a timeout.
# {{writeChunk}} is a complicated process, so we can consider adding multiple finer-grained
sync points for it.
# It would be helpful to define some simple scripts to express the test cases. Eventually the
scripts could also be generated automatically, but this can be done separately.
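
To illustrate the second point above, here is a minimal sketch of mapping a sync event to a different fault per streamer. Apart from {{SyncPointType.DN_FAILURE}}, every name here ({{FaultKind}}, {{FaultPlan}}, {{on}}, {{faultFor}}) is hypothetical and not part of the patch; it only shows the shape such a mapping could take.

```java
import java.util.HashMap;
import java.util.Map;

/** Sync point types; only DN_FAILURE appears in the patch excerpt. */
enum SyncPointType { DN_FAILURE }

/** Hypothetical kinds of fault a sync point could trigger. */
enum FaultKind { NONE, INSTANT_FAILURE, TIMEOUT }

/** Hypothetical plan mapping (streamer index, sync point) to a fault kind. */
final class FaultPlan {
  private final Map<String, FaultKind> plan = new HashMap<>();

  private static String key(int streamerIndex, SyncPointType type) {
    return streamerIndex + ":" + type;
  }

  /** Registers the fault a given streamer should hit at a given sync point. */
  FaultPlan on(int streamerIndex, SyncPointType type, FaultKind kind) {
    plan.put(key(streamerIndex, type), kind);
    return this;
  }

  /** Returns the configured fault, or NONE if nothing was registered. */
  FaultKind faultFor(int streamerIndex, SyncPointType type) {
    return plan.getOrDefault(key(streamerIndex, type), FaultKind.NONE);
  }
}
```

With such a plan, a test could register {{INSTANT_FAILURE}} for streamer#1 and {{TIMEOUT}} for streamer#2 at the same {{DN_FAILURE}} sync point, and the streamer code would ask the plan which fault to apply when it reaches that point.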

> Erasure coding: emulate race conditions among striped streamers in write pipeline
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-9098
>                 URL: https://issues.apache.org/jira/browse/HDFS-9098
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: 3.0.0
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-9098.wip.patch
>
>
> Apparently the interleaving of events among {{StripedDataStreamer}}'s is very tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race conditions under HDFS-9040.
> Let's use FaultInjector to emulate different combinations of interleaved events.
> In particular, we should consider injecting delays in the following places:
> # {{Streamer#endBlock}}
> # {{Streamer#locateFollowingBlock}}
> # {{Streamer#updateBlockForPipeline}}
> # {{Streamer#updatePipeline}}
> # {{OutputStream#writeChunk}}
> # {{OutputStream#close}}
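
The delay injection listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: {{DelayInjector}}, {{delay}}, {{at}} and {{hits}} are hypothetical names, not the FaultInjector from the patch, and the string labels simply reuse the place names from the list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical injector that sleeps at named places in the write pipeline. */
final class DelayInjector {
  private final Map<String, Long> delaysMs = new ConcurrentHashMap<>();
  private final List<String> hits = Collections.synchronizedList(new ArrayList<>());

  /** Configures a delay, in milliseconds, for the named place. */
  DelayInjector delay(String point, long millis) {
    delaysMs.put(point, millis);
    return this;
  }

  /** Called by the code under test on reaching the named place. */
  void at(String point) {
    hits.add(point);
    Long ms = delaysMs.get(point);
    if (ms != null && ms > 0) {
      try {
        Thread.sleep(ms);
      } catch (InterruptedException e) {
        // Preserve the interrupt so the caller can still observe it.
        Thread.currentThread().interrupt();
      }
    }
  }

  /** Returns a snapshot of the places reached so far, in order. */
  List<String> hits() {
    synchronized (hits) {
      return new ArrayList<>(hits);
    }
  }
}
```

A test could configure, say, a delay at {{Streamer#endBlock}} for one streamer only, so that the other streamers run ahead of it and the interleaving of interest is forced rather than left to chance.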



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
