hadoop-hdfs-issues mailing list archives

From "Ramkumar Vadali (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-503) Implement erasure coding as a layer on HDFS
Date Mon, 04 Oct 2010 05:49:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917481#action_12917481 ]

Ramkumar Vadali commented on HDFS-503:
--------------------------------------

@shravankumar, for a basic introduction to HDFS RAID, you can read Dhruba's blog post: http://hadoopblog.blogspot.com/2009/08/hdfs-and-erasure-codes-hdfs-raid.html

If you need this for demo purposes, could you use the current Hadoop trunk? I am not sure
of the exact date of the next release.
To use RAID, you need to create a configuration file and start the RAID daemon. For examples,
look at the unit tests, such as TestRaidNode.


For further communication, you can contact me directly.

> Implement erasure coding as a layer on HDFS
> -------------------------------------------
>
>                 Key: HDFS-503
>                 URL: https://issues.apache.org/jira/browse/HDFS-503
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: contrib/raid
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.21.0
>
>         Attachments: raid1.txt, raid2.txt
>
>
> The goal of this JIRA is to discuss how the cost of raw storage for an HDFS file system
> can be reduced. Keeping three copies of the same data is very costly, especially when the
> size of storage is huge. One idea is to reduce the replication factor and do erasure coding
> of a set of blocks so that the overall probability of failure of a block remains the same as
> before.
> Many forms of error-correcting codes are available; see http://en.wikipedia.org/wiki/Erasure_code.
> Also, recent research from CMU has described DiskReduce: https://opencirrus.org/system/files/Gibson-OpenCirrus-June9-09.ppt
> My opinion is to discuss implementation strategies that are not part of base HDFS, but
> are a layer on top of HDFS.
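
To make the idea in the issue description concrete, here is a minimal sketch of erasure coding over a set of blocks, using a single XOR parity block. This is an illustration only, not the contrib/raid implementation: the function names, the in-memory byte strings standing in for HDFS blocks, and the stripe length and replication numbers are all assumptions chosen for the example.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing block of a stripe from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


def raw_storage_factor(stripe_length, data_replication, parity_replication):
    """Raw bytes stored per byte of user data, with one parity block per stripe."""
    return data_replication + parity_replication / stripe_length


# Hypothetical stripe of three tiny "blocks" (real HDFS blocks are far larger).
blocks = [b"block-00", b"block-01", b"block-02"]
parity = xor_parity(blocks)

# Simulate losing the middle block and rebuilding it from the rest.
recovered = recover_missing([blocks[0], blocks[2]], parity)
print(recovered == blocks[1])

# Illustrative numbers only: stripe of 10 blocks, data and parity both kept at 2 replicas.
print(raw_storage_factor(10, 2, 2))
```

With a single parity block, only one lost block per stripe can be rebuilt. Codes that tolerate more simultaneous losses need more parity blocks per stripe, which raises the storage factor accordingly.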

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

