hadoop-hdfs-issues mailing list archives

From "jack liuquan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm
Date Tue, 07 Apr 2015 03:05:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482497#comment-14482497 ]

jack liuquan commented on HDFS-7715:
------------------------------------

High level:
1. Please write comprehensive class header comments about the new code and coder, also acknowledging
the original author's effort.
bq. OK, sure.
2. For now, we need to figure out how to map these raw HH coders to corresponding high level
{{ErasureCoder}}s, if we decide to implement them as raw coders directly;
bq. When are you available for a phone call? I would like to discuss this with you by phone.
Thanks. :)
3. Do we have tests for the new coders?
bq. Yes, I have tested the new coders, and the results are correct. But because of the 30K limit,
I didn't upload the test code. Can I upload the test code separately?

1. Any better name for variable *pb_vec*?
bq. It was named by Rashmi. Maybe Rashmi can give a suggestion. I think *pb_vec* is an index for
storing the piggybacks of the first sub-stripe, so maybe pb_index is OK.
2. Move the code that computes the generating polynomial to HHUtil?
bq. Sounds good, I will do it in the new patch.
3. The following variables are not good. Please use numDataUnits, numParityUnits instead for
consistency in all places.
bq. If we use numDataUnits and numParityUnits instead for consistency, we need to change
numDataUnits and numParityUnits in {{AbstractRawErasureCoder}} from {{private}} to {{protected}}.
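
One possible alternative to widening the field visibility, shown here only as a rough sketch: keep the fields private in the base class and let subclasses read them through protected getters. The classes RawCoderSketch and HHXorCoderSketch, and the getter names, are hypothetical stand-ins for illustration and are not the actual Hadoop sources.

```java
// Hypothetical stand-ins for illustration only; not the actual Hadoop classes.
abstract class RawCoderSketch {
  // Fields stay private; subclasses read them through the getters below.
  private final int numDataUnits;
  private final int numParityUnits;

  protected RawCoderSketch(int numDataUnits, int numParityUnits) {
    this.numDataUnits = numDataUnits;
    this.numParityUnits = numParityUnits;
  }

  protected int getNumDataUnits() {
    return numDataUnits;
  }

  protected int getNumParityUnits() {
    return numParityUnits;
  }
}

// Assumed subclass standing in for a Hitchhiker-XOR raw coder.
class HHXorCoderSketch extends RawCoderSketch {
  HHXorCoderSketch(int numDataUnits, int numParityUnits) {
    super(numDataUnits, numParityUnits);
  }

  // Example of subclass code using the consistent names via the getters.
  int numAllUnits() {
    return getNumDataUnits() + getNumParityUnits();
  }
}
```

With this shape the subclass uses the same names everywhere without the base class exposing mutable state; changing the fields to {{protected}} would also work and is the smaller patch.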

4. In HHUtil.getPiggyBacksFromInput, the parameter encoder isn't used.
bq. The parameter encoder is used in line 62:
{code}
+    encoder.encode(tempInput, tempOutput);
{code}


> Implement the Hitchhiker erasure coding algorithm
> -------------------------------------------------
>
>                 Key: HDFS-7715
>                 URL: https://issues.apache.org/jira/browse/HDFS-7715
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: jack liuquan
>         Attachments: 7715-hitchhikerXOR-v2.patch, HDFS-7715-hhxor-decoder.patch, HDFS-7715-hhxor-encoder.patch
>
>
> [Hitchhiker | http://www.eecs.berkeley.edu/~nihar/publications/Hitchhiker_SIGCOMM14.pdf]
is a new erasure coding algorithm developed as a research project at UC Berkeley. It has been
shown to reduce network traffic and disk I/O by 25%-45% during data reconstruction. This JIRA
aims to introduce Hitchhiker to the HDFS-EC framework, as one of the pluggable codec algorithms.
> The existing implementation is based on HDFS-RAID. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
