Message-ID: <2068133541.1255125632833.JavaMail.jira@brutus>
Date: Fri, 9 Oct 2009 15:00:32 -0700 (PDT)
From: "Robert Chansler (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] Updated: (HDFS-457) better handling of volume failure in Data Node storage
In-Reply-To: <1817020889.1246388327191.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/HDFS-457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HDFS-457:
---------------------------------

    Release Note: Datanode can continue if a volume for replica storage fails. Previously a datanode resigned if any volume failed.  (was: Do not shutdown datanode if some, but not all, volumes fail.)

Editorial pass over all release notes prior to publication of 0.21.

> better handling of volume failure in Data Node storage
> ------------------------------------------------------
>
>                 Key: HDFS-457
>                 URL: https://issues.apache.org/jira/browse/HDFS-457
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Boris Shkolnik
>            Assignee: Boris Shkolnik
>             Fix For: 0.21.0
>
>         Attachments: HDFS-457-1.patch, HDFS-457-2.patch, HDFS-457-2.patch, HDFS-457-2.patch, HDFS-457-3.patch, HDFS-457.patch, TestFsck.zip
>
>
> Current implementation shuts DataNode down completely when one of the configured volumes of the storage fails.
> This is rather wasteful behavior because it decreases utilization (good storage becomes unavailable) and imposes extra load on the system (replication of the blocks from the good volumes). These problems will become even more prominent when we move to mixed (heterogeneous) clusters with many more volumes per Data Node.
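
The behavior change in the release note can be illustrated with a small sketch. This is not the HDFS-457 patch and does not use Hadoop's DataNode classes; the class name VolumeFailureSketch, its methods, and the /data/N paths are all hypothetical, chosen only to contrast the old policy (stop on any volume failure) with the new one (continue while some, but not all, volumes have failed).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not Hadoop code. */
class VolumeFailureSketch {
    // Volumes that are still usable for replica storage.
    private final Set<String> healthy = new LinkedHashSet<>();
    // Volumes that have been reported as failed.
    private final Set<String> failed = new LinkedHashSet<>();

    VolumeFailureSketch(List<String> configuredVolumes) {
        healthy.addAll(configuredVolumes);
    }

    /** Moves a volume from the healthy set to the failed set, if it was healthy. */
    void markFailed(String volume) {
        if (healthy.remove(volume)) {
            failed.add(volume);
        }
    }

    /** Old policy from the issue description: any failed volume stops the node. */
    boolean keepsRunningOldPolicy() {
        return failed.isEmpty();
    }

    /** New policy from the release note: continue while at least one volume is healthy. */
    boolean keepsRunningNewPolicy() {
        return !healthy.isEmpty();
    }

    Set<String> healthyVolumes() {
        return Collections.unmodifiableSet(healthy);
    }

    public static void main(String[] args) {
        VolumeFailureSketch node =
            new VolumeFailureSketch(Arrays.asList("/data/1", "/data/2", "/data/3"));
        node.markFailed("/data/2");
        System.out.println("old policy keeps running: " + node.keepsRunningOldPolicy());
        System.out.println("new policy keeps running: " + node.keepsRunningNewPolicy());
        System.out.println("still serving from: " + node.healthyVolumes());
    }
}
```

Under the new policy in this sketch, replicas on the remaining healthy volumes stay available, so only the blocks from the failed volume would need re-replication, which is the utilization and load argument made in the issue description.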