From: "dhruba borthakur (JIRA)"
To: hdfs-dev@hadoop.apache.org
Reply-To: hdfs-dev@hadoop.apache.org
Date: Wed, 15 Dec 2010 19:07:01 -0500 (EST)
Message-ID: <28903378.148761292458021272.JavaMail.jira@thor>
Subject: [jira] Created: (HDFS-1539) prevent data loss when a cluster suffers a power loss

prevent data loss when a cluster suffers a power loss
-----------------------------------------------------

                 Key: HDFS-1539
                 URL: https://issues.apache.org/jira/browse/HDFS-1539
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: data-node, hdfs client, name-node
            Reporter: dhruba borthakur

We have seen an instance where an external outage caused many datanodes to reboot at around the same time. This resulted in many corrupted blocks, all of them recently written: the current HDFS datanode implementation does not sync the data of a block file when the block is closed. Proposed changes:

1. Add a cluster-wide config setting that causes the datanode to sync a block file when the block is finalized.
2. Introduce a new parameter to FileSystem.create() to trigger the new behaviour per file, i.e. cause the datanode to sync a block file when it is finalized.
3. Implement FSDataOutputStream.hsync() to cause all data written to the specified file to be written to stable storage.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
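The datanode-side change in item 1 amounts to forcing the block file's contents to stable storage before the block is declared finalized, so a reboot immediately afterwards cannot lose the data from the OS page cache. A minimal sketch of that mechanism using plain JDK NIO (finalizeWithSync is a hypothetical stand-in for the datanode's finalize step, not actual HDFS code):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BlockSyncSketch {

    // Hypothetical stand-in for the datanode's block-finalize step:
    // write the block file's data and force it to stable storage
    // before the block is reported as finalized.
    static void finalizeWithSync(Path blockFile, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(blockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(data));
            // force(true) flushes file data AND metadata to the device,
            // which is what lets the block survive a sudden power loss.
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("blk_", ".data");
        byte[] payload = "block payload".getBytes(StandardCharsets.UTF_8);
        finalizeWithSync(tmp, payload);
        System.out.println("synced " + Files.size(tmp) + " bytes");
        Files.delete(tmp);
    }
}
```

The client-visible half of the proposal (items 2 and 3) would expose the same guarantee through the write path, so that a caller's hsync() returning implies the bytes written so far have reached stable storage on the datanodes.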