Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Tue, 17 Feb 2015 06:31:12 +0000 (UTC)
From: "Yi Liu (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12772834.1423172389000.60933.1424154672392@Atlassian.JIRA>
In-Reply-To: <JIRA.12772834.1423172389000@Atlassian.JIRA>
References: <JIRA.12772834.1423172389000@Atlassian.JIRA>
 <JIRA.12772834.1423172389672@arcas>
Subject: [jira] [Commented] (HDFS-7740) Test truncate with DataNodes
 restarting
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323712#comment-14323712 ] 

Yi Liu commented on HDFS-7740:
------------------------------

Sorry for the late update.
I added tests for the 4 scenarios above. So that these tests can freely control the number of DataNodes without affecting other tests, I use a separate MiniDFSCluster for them.

Some explanation of the 4 tests:
{quote}
Create file with 3 DNs up. Kill DN(0). Truncate file. Restart DN(0), make sure the old replica is disregarded and replaced with the truncated one.
{quote}
For non copy-on-truncate, the new (truncated) block keeps the same block id, but its GS (GenerationStamp) should increase. In the test, I trigger a block report for dn0 after it restarts. Since the GS of dn0's replica of the last block is stale, the last block reported by dn0 should be marked corrupt on the NN, the NN's replica count for the last block should drop by 1, and the truncated block should then be replicated to dn0. The test checks that the old replica (the block file and block metadata file) is removed and replaced with the new (truncated) one.
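
To illustrate the reasoning above, here is a toy model of the stale-GS check. It is only a sketch: StaleReplicaModel and all of its methods are hypothetical names, not Hadoop APIs and not the code in the patch, and it assumes the NN simply compares the reported GS against the one it expects.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: hypothetical class, not part of Hadoop or of the patch.
final class StaleReplicaModel {
    // Generation stamp the (modelled) NameNode expects for each block id.
    private final Map<Long, Long> expectedGs = new HashMap<>();
    // DataNodes the NameNode currently counts as holding a valid replica.
    private final Map<Long, Set<String>> liveReplicas = new HashMap<>();

    static Set<String> setOf(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    void addBlock(long blockId, long gs, Set<String> dataNodes) {
        expectedGs.put(blockId, gs);
        liveReplicas.put(blockId, new HashSet<>(dataNodes));
    }

    // In-place truncate: same block id, higher generation stamp.
    // Only DataNodes that were up during the truncate keep a valid replica.
    void truncateInPlace(long blockId, Set<String> upDataNodes) {
        expectedGs.put(blockId, expectedGs.get(blockId) + 1);
        liveReplicas.get(blockId).retainAll(upDataNodes);
    }

    // One replica from a block report. Returns true if it is treated as corrupt,
    // i.e. its generation stamp is older than the expected one.
    boolean reportReplica(String dataNode, long blockId, long reportedGs) {
        if (reportedGs < expectedGs.get(blockId)) {
            liveReplicas.get(blockId).remove(dataNode);
            return true;
        }
        liveReplicas.get(blockId).add(dataNode);
        return false;
    }

    int liveReplicaCount(long blockId) {
        return liveReplicas.get(blockId).size();
    }

    long expectedGenerationStamp(long blockId) {
        return expectedGs.get(blockId);
    }
}
```

In this model, dn0 reporting the pre-truncate GS after its restart is flagged as corrupt and not counted, and only a later report carrying the new GS (standing in for re-replication) brings the count back to 3.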

{quote}
Kill DN(1). Truncate within the same last block with copy-on-truncate. Restart DN(1), verify replica consistency.
{quote}
For copy-on-truncate, a new block is created with a new block id and a new GS. In the test, I trigger a block report for dn1 after it restarts. The new block has 2 replicas at that point, and it is then replicated to dn1. The test checks that the new block file is replicated to dn1, and that the old replica still exists too because there is a snapshot.
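
The difference between the two truncate modes can be sketched as follows. This is again a toy model under stated assumptions: TruncateModel, Block and freshBlockId are hypothetical names, and the new GS is modelled as "old GS + 1" only because it needs to be larger than the old one, not because HDFS assigns it that way.

```java
// Toy model only: hypothetical classes, not part of Hadoop or of the patch.
final class TruncateModel {
    static final class Block {
        final long id;
        final long generationStamp;
        final long length;

        Block(long id, long generationStamp, long length) {
            this.id = id;
            this.generationStamp = generationStamp;
            this.length = length;
        }
    }

    // Returns the last block as it looks after the truncate.
    // copyOnTruncate == false: same block id, higher generation stamp.
    // copyOnTruncate == true:  new block id (freshBlockId) and new generation stamp;
    //                          the old block object is left untouched, as a snapshot
    //                          would still refer to it.
    static Block truncate(Block last, long newLength, boolean copyOnTruncate, long freshBlockId) {
        if (newLength < 0 || newLength > last.length) {
            throw new IllegalArgumentException("newLength must be within [0, last.length]");
        }
        long newGs = last.generationStamp + 1; // assumption: any larger value would do
        long newId = copyOnTruncate ? freshBlockId : last.id;
        return new Block(newId, newGs, newLength);
    }
}
```

With copy-on-truncate the old block is never modified, which is why the test expects dn1 to end up holding both the old replica and the newly replicated block.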

{quote}
Create a single block file with 3 replicas. Truncate mid of block and then immediately restart 2 of the DNs. Check the files
{quote}
In the test, I restart dn0 and dn1 immediately after the truncate, and check that the old replica is removed and replaced with the truncated one on both dn0 and dn1.

{quote}
Same as before except completely shutting down 3 of the DNs but not restarting them.
{quote}
In the test, I check that the truncated block remains under construction after the 3 DataNodes are shut down.

> Test truncate with DataNodes restarting
> ---------------------------------------
>
>                 Key: HDFS-7740
>                 URL: https://issues.apache.org/jira/browse/HDFS-7740
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: test
>    Affects Versions: 2.7.0
>            Reporter: Konstantin Shvachko
>            Assignee: Yi Liu
>             Fix For: 2.7.0
>
>         Attachments: HDFS-7740.001.patch
>
>
> Add a test case, which ensures replica consistency when DNs are failing and restarting.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)