Date: Wed, 17 Feb 2016 07:24:18 +0000 (UTC)
From: "Lin Yiqun (JIRA)"
To: hdfs-dev@hadoop.apache.org
Reply-To: hdfs-dev@hadoop.apache.org
Subject: [jira] [Created] (HDFS-9819) FsVolume should tolerate few times check-dir failed due to deletion by mistake

Lin Yiqun created HDFS-9819:
-------------------------------

Summary: FsVolume should tolerate few times check-dir failed due to deletion by mistake
Key: HDFS-9819
URL: https://issues.apache.org/jira/browse/HDFS-9819
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Lin Yiqun
Assignee: Lin Yiqun
Fix For: 2.7.1

FsVolume should tolerate a few check-dir failures, because sometimes a directory or file in the DataNode data directories is deleted by mistake.
Then {{DataNode#startCheckDiskErrorThread}} periodically invokes the checkDir method, finds that the directory does not exist, and throws an exception. The checked volume is added to the failed volume list, and the blocks on this volume are replicated again. But this re-replication is not actually needed. We should let a volume tolerate a few check-dir failures before it is marked as failed, controlled by a config similar to {{dfs.datanode.failed.volumes.tolerated}}.
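
To make the proposed tolerance concrete, here is a minimal sketch of a per-volume failure counter. It is not taken from any Hadoop patch: the class {{CheckDirFailureTracker}}, its methods, and the {{toleratedFailures}} limit are hypothetical names, and the directory check is a simplified stand-in rather than Hadoop's actual check. It also assumes that only consecutive failures should count, so one successful check resets the counter.

```java
import java.io.File;

/**
 * Hypothetical helper, not part of Hadoop: counts consecutive failed
 * directory checks for one volume and reports the volume as failed only
 * once the tolerated number of failures has been exceeded.
 */
class CheckDirFailureTracker {
    // Hypothetical limit; the issue suggests making this configurable.
    private final int toleratedFailures;
    private int consecutiveFailures = 0;

    CheckDirFailureTracker(int toleratedFailures) {
        if (toleratedFailures < 0) {
            throw new IllegalArgumentException("toleratedFailures must be >= 0");
        }
        this.toleratedFailures = toleratedFailures;
    }

    /**
     * Records the outcome of one periodic check.
     *
     * @param dirOk whether the directory check passed
     * @return true if the volume should now be treated as failed
     */
    synchronized boolean recordCheck(boolean dirOk) {
        if (dirOk) {
            // Assumption: a passing check clears earlier failures,
            // for example after the directory has been restored.
            consecutiveFailures = 0;
            return false;
        }
        consecutiveFailures++;
        return consecutiveFailures > toleratedFailures;
    }

    /** Simplified stand-in for a directory health check. */
    static boolean dirLooksHealthy(File dir) {
        return dir.isDirectory() && dir.canRead()
            && dir.canWrite() && dir.canExecute();
    }
}
```

With a limit of 2, the first two consecutive failed checks are tolerated and the third one marks the volume as failed. Whether the counter should reset on success, or count all failures over the volume's lifetime, is a design choice the issue leaves open.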