hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8889) Erasure Coding: cover more test situations of datanode failure during client writing
Date Mon, 14 Sep 2015 20:25:47 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744191#comment-14744191 ]

Zhe Zhang commented on HDFS-8889:
---------------------------------

Thanks for the work, Bo. It's a great idea to test the write pipeline error handling more systematically.
I just moved this JIRA to follow-on, together with the other write pipeline JIRAs.

> Erasure Coding: cover more test situations of datanode failure during client writing
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-8889
>                 URL: https://issues.apache.org/jira/browse/HDFS-8889
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: HDFS-8889-HDFS-7285-001.patch
>
>
> Currently 9 streamers work together during a client write. A small number of failed
> datanodes (<= 3) for a block group should not affect the write. There are many
> datanode failure cases, and we should cover as many as possible in unit tests.
> Suppose streamer 4 fails. The following situations for the next block group should be
> considered:
> 1)	all streamers succeed
> 2)	streamer 4 still fails
> 3)	only streamer 1 fails
> 4)	only streamer 8 fails (test parity streamer)
> 5)	streamers 4 and 6 fail
> 6)	streamers 4, 1, and 6 fail
> 7)	streamers 4, 1, 2, and 6 fail
> 8)	streamers 2 and 6 fail
> Suppose streamers 2 and 4 fail. The following situations for the next block group should
> be considered:
> 1)	only streamers 2 and 4 fail
> 2)	streamers 2, 4, and 8 fail
> 3)	only streamer 2 fails
> 4)	streamers 3 and 8 fail
> For a single streamer, we should consider the following points in time at which the
> datanode failure can occur:
> 1)	before writing the first byte
> 2)	before finishing writing the first cell
> 3)	right after finishing writing the first cell
> 4)	before writing the last byte of the block
> Other situations:
> 1)	more than 3 streamers fail at the first block group
> 2)	more than 3 streamers fail at the last block group
> <more …>
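The rule quoted above (9 streamers, at most 3 failed datanodes tolerated per block group) can be sketched as a small scenario table. This is only an illustration, not code from the attached patch: the class name, the helper methods, and the assumption that streamers are indexed 0 to 8 with indices 6 to 8 being parity streamers are all hypothetical (the description only says that streamer 8 is a parity streamer).

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of HDFS or of the HDFS-8889 patch.
public class StreamerFailureScenarios {
    static final int TOTAL_STREAMERS = 9;         // from the issue description
    static final int MAX_TOLERATED_FAILURES = 3;  // from the issue description
    static final int FIRST_PARITY_INDEX = 6;      // assumption: indices 6..8 are parity

    // Builds the set of failed streamer indices for one block group.
    static Set<Integer> failed(int... streamers) {
        Set<Integer> result = new LinkedHashSet<>();
        for (int s : streamers) {
            if (s < 0 || s >= TOTAL_STREAMERS) {
                throw new IllegalArgumentException("no such streamer: " + s);
            }
            result.add(s);
        }
        return result;
    }

    // A block group write is expected to survive up to 3 failed streamers.
    static boolean blockGroupWriteShouldSucceed(Set<Integer> failedStreamers) {
        return failedStreamers.size() <= MAX_TOLERATED_FAILURES;
    }

    // True if the streamer writes a parity block (under the indexing assumption).
    static boolean isParity(int streamer) {
        return streamer >= FIRST_PARITY_INDEX;
    }

    public static void main(String[] args) {
        // Failures in the next block group after streamer 4 failed in the previous one.
        int[][] nextGroupFailures = {
            {}, {4}, {1}, {8}, {4, 6}, {4, 1, 6}, {4, 1, 2, 6}, {2, 6}
        };
        for (int[] scenario : nextGroupFailures) {
            Set<Integer> f = failed(scenario);
            String expectation = blockGroupWriteShouldSucceed(f)
                ? "write should continue"
                : "write is expected to fail";
            System.out.println("failed streamers " + f + ": " + expectation);
        }
    }
}
```

Under this reading, the scenario with four failed streamers in one block group (4, 1, 2, and 6) is the only one in the first list that exceeds the tolerance, so a test would presumably expect the write to fail there rather than succeed.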



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)