Date: Thu, 21 Apr 2011 06:17:05 +0000 (UTC)
From: "Bharath Mundlapudi (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <66083924.72730.1303366625959.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <411101893.68583.1303257725837.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-1848) Datanodes should shutdown when a critical volume fails

    [ https://issues.apache.org/jira/browse/HDFS-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022628#comment-13022628 ]

Bharath Mundlapudi commented on HDFS-1848:
------------------------------------------

I am wondering whether this is necessary. Typically, the critical volume (e.g. the volume that hosts the OS, logs, pid, tmp dir, etc.) is RAID-1, and if it goes down we can safely assume the Datanode is down as well. I am just curious to understand the use case.

Please refer to the Disk Fail Inplace JIRA:
https://issues.apache.org/jira/browse/HADOOP-7123

In our tests with disk failures, we verified that if the root/critical volume fails, the Datanode can't even start.

> Datanodes should shutdown when a critical volume fails
> ------------------------------------------------------
>
>                 Key: HDFS-1848
>                 URL: https://issues.apache.org/jira/browse/HDFS-1848
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Eli Collins
>             Fix For: 0.23.0
>
>
> A DN should shut down when a critical volume (e.g. the volume that hosts the OS, logs, pid, tmp dir, etc.) fails. The admin should be able to specify which volumes are critical, e.g. they might specify the volume that lives on the boot disk. A failure in one of these volumes would not be subject to the threshold (HDFS-1161) or result in host decommissioning (HDFS-1847), as the decommissioning process would likely fail.