Date: Tue, 17 Feb 2015 06:31:12 +0000 (UTC)
From: "Yi Liu (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-7740) Test truncate with DataNodes restarting

[ https://issues.apache.org/jira/browse/HDFS-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323712#comment-14323712 ]

Yi Liu commented on HDFS-7740:
------------------------------

Sorry for the late update. I have added tests for the above 4 scenarios. So that these tests can freely control the number of datanodes without affecting other tests, I use a separate MiniDFSCluster for them.

Some explanations of the 4 tests:

{quote}
Create file with 3 DNs up. Kill DN(0). Truncate file. Restart DN(0), make sure the old replica is disregarded and replaced with the truncated one.
{quote}
For non copy-on-truncate, the new (truncated) block keeps the same block id, but its GS (GenerationStamp) should increase. In the test, I trigger a block report for dn0 after it restarts. Since the GS of dn0's replica of the last block is old, the last block reported by dn0 should be marked corrupt on the NN, and the NN's replica count for the last block should decrease by 1; the truncated block will then be replicated to dn0. In the test, I check that the old replica (the block file and block metadata file) is removed and replaced with the new (truncated) one.

{quote}
Kill DN(1). Truncate within the same last block with copy-on-truncate. Restart DN(1), verify replica consistency.
{quote}
For copy-on-truncate, a new block is made with a new block id and a new GS. In the test, I trigger a block report for dn1 after it restarts. The new block has 2 replicas, and it is then replicated to dn1. In the test, I check that the new block file is replicated to dn1, and that the old replica still exists too because there is a snapshot.

{quote}
Create a single block file with 3 replicas. Truncate mid of block and then immediately restart 2 of the DNs. Check the files
{quote}
In the test, I restart dn0 and dn1 immediately after the truncate, and check that the old replica is removed and replaced with the truncated one on dn0 and dn1.

{quote}
Same as before except completely shutting down 3 of the DNs but not restarting them.
{quote}
In the test, I check that the truncated block remains under construction after the 3 datanodes shut down.

> Test truncate with DataNodes restarting
> ---------------------------------------
>
>                 Key: HDFS-7740
>                 URL: https://issues.apache.org/jira/browse/HDFS-7740
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: test
>    Affects Versions: 2.7.0
>            Reporter: Konstantin Shvachko
>            Assignee: Yi Liu
>             Fix For: 2.7.0
>
>         Attachments: HDFS-7740.001.patch
>
>
> Add a test case, which ensures replica consistency when DNs are failing and restarting.
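
The generation-stamp reasoning for the non copy-on-truncate case can be illustrated with a small, self-contained Java model. This is only a sketch under simplifying assumptions, not HDFS code and not the code in the attached patch: `ToyNameNode` and all of its methods are hypothetical names, the GS is assumed to increase by exactly 1 on truncate, and re-replication is modeled only as a later report that carries the current GS.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of generation-stamp (GS) checking. Hypothetical; not HDFS code. */
final class ToyNameNode {
    /** Latest GS the toy NameNode expects for each block id. */
    private final Map<Long, Long> expectedGenStamp = new HashMap<>();
    /** Datanodes currently counted as holding a valid replica of each block id. */
    private final Map<Long, Set<String>> liveReplicas = new HashMap<>();

    /** Registers a block with its initial GS and the datanodes holding it. */
    void addBlock(long blockId, long genStamp, Set<String> datanodes) {
        expectedGenStamp.put(blockId, genStamp);
        liveReplicas.put(blockId, new HashSet<>(datanodes));
    }

    /**
     * In-place truncate: the block id stays the same and the GS increases
     * (by 1 in this model). Only datanodes that are up keep a valid replica.
     */
    void truncateInPlace(long blockId, Set<String> datanodesUp) {
        expectedGenStamp.merge(blockId, 1L, Long::sum);
        liveReplicas.get(blockId).retainAll(datanodesUp);
    }

    /**
     * Handles one reported replica. A replica with an older GS is treated as
     * corrupt and is not counted; otherwise the datanode is counted as a holder.
     * Returns true if the replica was accepted.
     */
    boolean reportReplica(String datanode, long blockId, long reportedGenStamp) {
        if (reportedGenStamp < expectedGenStamp.get(blockId)) {
            liveReplicas.get(blockId).remove(datanode);
            return false;
        }
        liveReplicas.get(blockId).add(datanode);
        return true;
    }

    int liveReplicaCount(long blockId) {
        return liveReplicas.get(blockId).size();
    }

    long expectedGenStamp(long blockId) {
        return expectedGenStamp.get(blockId);
    }
}
```

In this model, a datanode that was down during the truncate reports the old GS after restarting, so its replica is rejected and the valid replica count stays at 2 until a replica with the current GS is reported from that datanode.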