Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <820047586.1252837137658.JavaMail.jira@brutus>
Date: Sun, 13 Sep 2009 03:18:57 -0700 (PDT)
From: "dhruba borthakur (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Subject: [jira] Updated: (HDFS-503) Implement erasure coding as a layer on
 HDFS
In-Reply-To: <242063006.1248420974911.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HDFS-503:
----------------------------------

    Attachment: raid2.txt

Incorporated a few review comments:

1. Make the underlying filesystem configurable (the default is still DistributedFileSystem).
2. The sample raid.xml lists the configuration properties that are exposed to the administrator.

@Nicolas: I created a separate JIRA, HDFS-600, to make the parity generation algorithm pluggable. I would like to address it in a separate patch. A pluggable algorithm will play a critical part if we want to reduce the physical replication factor even more.

@Andrew: I created HDFS-582 to implement a command-line utility called fsckraid. It will periodically verify parity bits.
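
To make the verification idea concrete, here is a minimal Python sketch of what such a parity check could look like, assuming simple XOR parity over a stripe of equal-sized blocks. The function names (xor_parity, find_bad_stripes) and the in-memory layout are hypothetical and only for illustration; this is not the fsckraid implementation, which is tracked in HDFS-582.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equal-length blocks (assumed parity scheme)."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks in a stripe must have the same length")
    # zip(*blocks) walks the blocks column by column; each column is XOR-ed.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def find_bad_stripes(stripes):
    """Return indices of stripes whose stored parity does not match the data.

    `stripes` is a list of (data_blocks, stored_parity) pairs. This layout is
    hypothetical and stands in for blocks read from the filesystem.
    """
    return [
        i for i, (data, parity) in enumerate(stripes)
        if xor_parity(data) != parity
    ]


stripes = [
    # Stripe 0: parity computed from the data, so it verifies cleanly.
    ([b"abcd", b"efgh", b"ijkl"], xor_parity([b"abcd", b"efgh", b"ijkl"])),
    # Stripe 1: deliberately wrong parity to simulate corruption.
    ([b"mnop", b"qrst", b"uvwx"], b"\x00\x00\x00\x00"),
]
print(find_bad_stripes(stripes))
```

A mismatch only says that some block in the stripe (data or parity) is bad; with single XOR parity alone it does not identify which one.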

@Raghu, you mentioned that "this only semi-transparent to the users since they have to use the new filesystem". In most cases, the cluster administrator sets the value of fs.hdfs.impl to DistributedRaidFileSystem, and no users or applications need to change anything to use this raid feature. That is what I meant by saying that this is "transparent" to the user. I also very much like your idea of making the RaidNode fetch a list of corrupt blocks from the NN. As far as I know, such an API does not exist in the NN. I will open a new JIRA for an API that retrieves a list of missing blocks from the NN.

Thanks everybody for their review comments.

> Implement erasure coding as a layer on HDFS
> -------------------------------------------
>
>                 Key: HDFS-503
>                 URL: https://issues.apache.org/jira/browse/HDFS-503
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: raid1.txt, raid2.txt
>
>
> The goal of this JIRA is to discuss how the cost of raw storage for an HDFS file system can be reduced. Keeping three copies of the same data is very costly, especially when the size of storage is huge. One idea is to reduce the replication factor and do erasure coding of a set of blocks so that the overall probability of failure of a block remains the same as before.
> Many forms of error-correcting codes are available; see http://en.wikipedia.org/wiki/Erasure_code. Also, recent research from CMU has described DiskReduce: https://opencirrus.org/system/files/Gibson-OpenCirrus-June9-09.ppt.
> I would like to discuss implementation strategies that are not part of base HDFS but are a layer on top of HDFS.
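
As a rough illustration of the trade-off in the issue description above, the following Python sketch assumes single XOR parity over a stripe of equal-sized blocks. The function names are hypothetical, and the stripe length and replication factors in the example are illustrative assumptions, not values taken from the patch.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (assumed parity scheme)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing data block of a stripe.

    XOR-ing the parity with every surviving data block cancels them out
    and leaves the missing block. This works for one lost block per stripe.
    """
    return xor_blocks(list(surviving_blocks) + [parity])


def raw_bytes_per_user_byte(data_replicas, parity_replicas, stripe_length):
    """Raw storage consumed per byte of user data.

    Each data block is stored `data_replicas` times, and each stripe of
    `stripe_length` data blocks adds one parity block that is stored
    `parity_replicas` times.
    """
    return data_replicas + parity_replicas / stripe_length


data = [b"aaaa", b"bbbb", b"cccc"]
parity = xor_blocks(data)

# Pretend data[1] was lost and rebuild it from the rest of the stripe.
rebuilt = recover_missing([data[0], data[2]], parity)
print(rebuilt == data[1])

# Illustrative numbers only: 2 data replicas, 2 parity replicas, 10-block stripes.
print(raw_bytes_per_user_byte(2, 2, 10))
```

Under these assumed numbers the raw cost is 2.2 bytes per user byte, compared with 3 for plain triple replication; whether the failure probability stays comparable depends on the code and stripe length chosen.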
