hadoop-hdfs-issues mailing list archives

From "jack liuquan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm
Date Fri, 03 Apr 2015 02:56:53 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393991#comment-14393991 ]

jack liuquan commented on HDFS-7715:
------------------------------------

Hi Kai,
As we all know, the advantage of Hitchhiker is that it reduces the amount of data required to
reconstruct a single missing data unit.
To implement the Hitchhiker encoder/decoder for a block group in the coder path, I think the only
difference is the ECChunk[] inputChunks passed to performCoding() in the decoder.
Can you tell me how the inputBlocks point to the real HDFS blocks and how the data is fetched into
the inputChunks?
Because Hitchhiker divides one block into two sub-stripes for encoding/decoding, I think we could
add a new HitchhikerBlock class that extends ECBlock to indicate which sub-stripe of the block to
fetch for Hitchhiker decoding.

e.g.:
public class HitchhikerBlock extends ECBlock {

  // Which part of the underlying HDFS block to fetch for Hitchhiker decoding:
  //   substripe = 0: the whole HDFS block that this ECBlock points to,
  //   substripe = 1: the first sub-stripe of that HDFS block,
  //   substripe = 2: the second sub-stripe of that HDFS block.
  private int substripe;
}
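
To make the substripe values concrete, here is a minimal standalone sketch of how such a value might be turned into a byte range to read. It does not use any Hadoop classes. The class name SubstripeRange, the method rangeFor, and the constants are hypothetical, and it assumes the two sub-stripes are simply the contiguous first and second halves of the block, which is only one possible layout.

```java
// Hypothetical helper, not part of Hadoop: maps the proposed substripe value
// to a byte range inside one HDFS block.
public class SubstripeRange {
    public static final int WHOLE_BLOCK = 0;
    public static final int FIRST_SUBSTRIPE = 1;
    public static final int SECOND_SUBSTRIPE = 2;

    /**
     * Returns {offset, length} in bytes for the requested part of a block.
     * Assumption: the sub-stripes are the contiguous halves of the block;
     * if the length is odd, the first sub-stripe takes the extra byte.
     */
    public static long[] rangeFor(int substripe, long blockLength) {
        if (blockLength < 0) {
            throw new IllegalArgumentException("negative block length: " + blockLength);
        }
        long firstHalf = (blockLength + 1) / 2;
        switch (substripe) {
            case WHOLE_BLOCK:
                return new long[] {0L, blockLength};
            case FIRST_SUBSTRIPE:
                return new long[] {0L, firstHalf};
            case SECOND_SUBSTRIPE:
                return new long[] {firstHalf, blockLength - firstHalf};
            default:
                throw new IllegalArgumentException("unknown substripe: " + substripe);
        }
    }

    public static void main(String[] args) {
        long blockLength = 128L * 1024 * 1024; // example block size only
        for (int s = WHOLE_BLOCK; s <= SECOND_SUBSTRIPE; s++) {
            long[] r = rangeFor(s, blockLength);
            System.out.println("substripe=" + s + " offset=" + r[0] + " length=" + r[1]);
        }
    }
}
```

With a mapping like this, a reader would only need the offset and length to fetch half of a block instead of the whole block during reconstruction. How the real sub-stripe layout is defined would still have to follow the codec.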
  
Any suggestions would be much appreciated. :)

> Implement the Hitchhiker erasure coding algorithm
> -------------------------------------------------
>
>                 Key: HDFS-7715
>                 URL: https://issues.apache.org/jira/browse/HDFS-7715
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: jack liuquan
>         Attachments: 7715-hitchhikerXOR-v2.patch, HDFS-7715-hhxor-decoder.patch, HDFS-7715-hhxor-encoder.patch
>
>
> [Hitchhiker | http://www.eecs.berkeley.edu/~nihar/publications/Hitchhiker_SIGCOMM14.pdf]
> is a new erasure coding algorithm developed as a research project at UC Berkeley. It has been
> shown to reduce network traffic and disk I/O by 25%-45% during data reconstruction. This JIRA
> aims to introduce Hitchhiker to the HDFS-EC framework, as one of the pluggable codec algorithms.
> The existing implementation is based on HDFS-RAID. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
