From: "dhruba borthakur (JIRA)"
To: hdfs-issues@hadoop.apache.org
Date: Sun, 13 Sep 2009 03:18:57 -0700 (PDT)
Subject: [jira] Updated: (HDFS-503) Implement erasure coding as a layer on HDFS
Message-ID: <820047586.1252837137658.JavaMail.jira@brutus>
In-Reply-To: <242063006.1248420974911.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HDFS-503:
----------------------------------

    Attachment: raid2.txt

Incorporated a few review comments:

1. Make the underlying filesystem configurable (the default is still DistributedFileSystem).
2. The sample raid.xml lists the configuration properties that are exposed to the administrator.

@Nicolas: I created a separate JIRA, HDFS-600, to make the parity generation algorithm pluggable. I would like to address it in a separate patch. This is going to play a critical part if we want to reduce the physical replication factor even more.

@Andrew: I created HDFS-582 to implement a command-line utility called fsckraid. It will periodically verify parity bits.

@Raghu: you mentioned that "this only semi-transparent to the users since they have to use the new filesystem". In most cases, the cluster administrator sets the value of fs.hdfs.impl to DistributedRaidFileSystem, and no users or applications need to change anything to use this raid feature. That is what I meant by saying that this is "transparent" to the user.

I also immensely like your idea of making the RaidNode fetch a list of corrupt blocks from the NN. As far as I know, such an API does not exist in the NN. I will open a new JIRA to add an API that retrieves a list of missing blocks from the NN.

Thanks, everybody, for the review comments.

> Implement erasure coding as a layer on HDFS
> -------------------------------------------
>
>                 Key: HDFS-503
>                 URL: https://issues.apache.org/jira/browse/HDFS-503
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: raid1.txt, raid2.txt
>
>
> The goal of this JIRA is to discuss how the cost of raw storage for an HDFS file system can be reduced. Keeping three copies of the same data is very costly, especially when the size of storage is huge. One idea is to reduce the replication factor and do erasure coding of a set of blocks so that the overall probability of failure of a block remains the same as before.
> Many forms of error-correcting codes are available; see http://en.wikipedia.org/wiki/Erasure_code. Also, recent research from CMU has described DiskReduce: https://opencirrus.org/system/files/Gibson-OpenCirrus-June9-09.ppt.
> My preference is to discuss implementation strategies that are not part of base HDFS but are a layer on top of HDFS.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
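
To illustrate the idea in the issue description (a lower replication factor plus parity computed over a set of blocks), here is a minimal sketch. It assumes plain XOR parity with one parity block per stripe, which may not match what raid2.txt actually does. The function names, the stripe length, and the replication numbers are hypothetical and chosen only for the example.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_block(surviving_blocks, parity):
    """Rebuild the single missing block of a stripe from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


def raw_storage_factor(replication, stripe_length, parity_replication):
    """Raw bytes stored per logical byte, with one parity block per stripe."""
    return replication + parity_replication / stripe_length


# Hypothetical stripe of three tiny "blocks".
stripe = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(stripe)

# Lose the middle block, then rebuild it from the other two plus the parity.
rebuilt = recover_block([stripe[0], stripe[2]], parity)
assert rebuilt == stripe[1]

# Assumed numbers: data blocks kept at replication 2, stripes of 10 blocks,
# parity block kept at replication 2, compared with plain 3x replication.
print(raw_storage_factor(2, 10, 2), "vs", 3.0)
```

XOR parity can rebuild only one lost block per stripe. Tolerating more simultaneous losses needs a stronger code, which is one reason a pluggable parity generation algorithm (HDFS-600) matters for reducing the replication factor further.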