Date: Fri, 1 Feb 2019 10:37:00 +0000 (UTC)
From: "Hadoop QA (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

    [ https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758195#comment-16758195 ]

Hadoop QA commented on HDDS-935:
--------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 2m 57s | trunk passed |
| +1 | checkstyle | 0m 41s | trunk passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| +1 | findbugs | 0m 0s | trunk passed |
| +1 | javadoc | 1m 31s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 2m 47s | the patch passed |
| +1 | checkstyle | 0m 36s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| +1 | findbugs | 0m 0s | the patch passed |
| +1 | javadoc | 1m 31s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 30m 45s | hadoop-ozone in the patch failed. |
| -1 | unit | 3m 23s | hadoop-hdds in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 45m 14s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.client.rpc.TestOzoneRpcClient |
| | hadoop.ozone.container.keyvalue.TestKeyValueHandler |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDDS-935 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12957236/HDDS-935.002.patch |
| Optional Tests | asflicense unit javac javadoc findbugs checkstyle |
| uname | Linux 0f21a8d2b931 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HDDS-Build@2/ozone.sh |
| git revision | trunk / 13aa939 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| unit | https://builds.apache.org/job/PreCommit-HDDS-Build/2164/artifact/out/patch-unit-hadoop-ozone.txt |
| unit | https://builds.apache.org/job/PreCommit-HDDS-Build/2164/artifact/out/patch-unit-hadoop-hdds.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDDS-Build/2164/testReport/ |
| Max. process+thread count | 1105 (vs. ulimit of 10000) |
| modules | C: hadoop-hdds/common hadoop-hdds/container-service hadoop-ozone/integration-test hadoop-ozone/tools U: . |
| Console output | https://builds.apache.org/job/PreCommit-HDDS-Build/2164/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-935
>                 URL: https://issues.apache.org/jira/browse/HDDS-935
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: Ozone Datanode
>    Affects Versions: 0.4.0
>            Reporter: Rakesh R
>            Assignee: Shashikant Banerjee
>            Priority: Major
>         Attachments: HDDS-935.000.patch, HDDS-935.001.patch, HDDS-935.002.patch
>
>
> Currently, a container is created when a writeChunk request reaches HddsDispatcher and the container does not already exist. If a disk holding a container is removed and the datanode restarts, a subsequent writeChunkRequest might create the same container again with an updated BCSID, because the datanode won't detect that the disk was removed. SCM won't detect this either, as the container will have the latest BCSID. This Jira aims to address this issue.
> The proposed fix is to persist all the containerIds existing in the containerSet in the snapshot file when a Ratis snapshot is taken. If the disk is removed and the datanode restarts, the container set will be rebuilt by scanning all the available disks, and the container list stored in the snapshot file will give all the containers created on the datanode. The diff between the two gives the exact list of containers that were created but not detected after the restart. Any writeChunk request should now validate its container ID against the list of missing containers. Also, we need to ensure that container creation does not happen as part of applyTransaction of a writeChunk request in Ratis.

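The diff-and-validate idea in the description above can be illustrated with a minimal sketch. This is not the code from the attached patches: the class name MissingContainerSketch, both method names, and the use of plain Long IDs are assumptions made purely for illustration, and the snapshot file and disk scan are replaced by in-memory sets.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical sketch only. None of these names come from the HDDS-935
 * patch; real container sets and snapshot files are stood in for by sets.
 */
public class MissingContainerSketch {

  /**
   * Returns the container IDs that were recorded in the snapshot but were
   * not found when the disks were scanned after the restart.
   */
  public static Set<Long> findMissing(Set<Long> snapshotIds, Set<Long> scannedIds) {
    Set<Long> missing = new TreeSet<>(snapshotIds);
    missing.removeAll(scannedIds);
    return Collections.unmodifiableSet(missing);
  }

  /**
   * Rejects a write that targets a container known to be missing, so the
   * container is not silently created a second time.
   */
  public static void validateWriteChunk(long containerId, Set<Long> missing) {
    if (missing.contains(containerId)) {
      throw new IllegalStateException(
          "Container " + containerId + " is missing; refusing to recreate it");
    }
  }

  public static void main(String[] args) {
    // IDs persisted with the last snapshot (assumed values).
    Set<Long> snapshotIds = new HashSet<>(Arrays.asList(1L, 2L, 3L));
    // IDs rebuilt from the disks still present after the restart.
    Set<Long> scannedIds = new HashSet<>(Arrays.asList(1L, 3L));

    Set<Long> missing = findMissing(snapshotIds, scannedIds);
    System.out.println("Missing containers: " + missing);

    validateWriteChunk(1L, missing); // container 1 is still on disk
    try {
      validateWriteChunk(2L, missing); // container 2 lived on the removed disk
    } catch (IllegalStateException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

In this sketch the missing set is computed once after the restart and consulted on every write, which mirrors the description's point that a writeChunk request should fail validation instead of recreating a container whose disk has gone away.
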
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org