hadoop-hdfs-issues mailing list archives

From "jack liuquan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm
Date Mon, 16 Mar 2015 02:22:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362660#comment-14362660 ]

jack liuquan commented on HDFS-7715:
------------------------------------

Hi Rashmi,
Thanks for your response.
I have modified remainderWithPBCalcXOR() in GaloisField.java
so that Hitchhiker-XOR+ no longer relies on the all-XOR-parity property of the underlying RS code.
Could you review the code below and check whether it is OK?
Thank you very much!

===========
  public void remainderWithPBCalcXOR(byte[][] dividend, int[] divisor, byte[][] piggys, 
  		int[] piggyBackSetSizes, int pb_vec, int stripeSize, int paritySize) { 
  		...
  		...
  		
  		// Added for Hitchhiker-XOR+: the all-XOR-parity property is no longer needed.
  		// Compute the XOR of all data units of the first substripe.
  		int piggys_pbvec = 0;
  		for (int m = paritySize; m < dividend.length; ++m) {
  			piggys_pbvec = (piggys_pbvec ^ dividend[m][k]);
  		}
  		
  		for (int i = dividend.length - divisor.length; i >= 0; i--) {
  			for (int j = 0; j < divisor.length; j++) {  				
  				//calculate the parities
  				int ratio =
  						divTable[dividend[i + divisor.length - 1][k] & 0x00FF][divisor[divisor.length - 1]];
  				dividend[j + i][k] = (byte)((dividend[j + i][k] & 0x00FF) ^ mulTable[ratio][divisor[j]]);
  			}
  		}
  		
  		dividend[pb_vec][k] = (byte)piggys_pbvec;
  	}

This modification may cost a little performance, because it adds an extra XOR pass over the data units.
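
The direct XOR computation added in the snippet above can be looked at in isolation. The following is only a minimal sketch under assumptions: the class name XorParitySketch and the method xorOfDataUnits are hypothetical and are not part of GaloisField.java, and the layout (parity units at indices below paritySize, data units after them) is assumed from the loop bounds in the snippet. It XORs the data units byte by byte instead of relying on one RS parity already being their XOR.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of the HDFS-RAID code.
class XorParitySketch {

    // Assumed layout, mirroring the loop bounds in the snippet:
    // units[0 .. paritySize-1] are parity units (ignored here),
    // units[paritySize .. units.length-1] are data units of equal length.
    static byte[] xorOfDataUnits(byte[][] units, int paritySize) {
        int unitLength = units[paritySize].length;
        byte[] xor = new byte[unitLength];
        for (int m = paritySize; m < units.length; m++) {
            for (int k = 0; k < unitLength; k++) {
                // Compound assignment narrows the int result back to byte.
                xor[k] ^= units[m][k];
            }
        }
        return xor;
    }

    public static void main(String[] args) {
        byte[][] units = {
            {0, 0},               // parity slot 0 (ignored)
            {0, 0},               // parity slot 1 (ignored)
            {0x0F, 0x01},         // data unit
            {(byte) 0xF0, 0x02},  // data unit
            {0x33, 0x04},         // data unit
        };
        // 0x0F ^ 0xF0 ^ 0x33 = 0xCC (-52 as a signed byte); 0x01 ^ 0x02 ^ 0x04 = 0x07
        System.out.println(Arrays.toString(xorOfDataUnits(units, 2))); // prints [-52, 7]
    }
}
```

The cost is one extra pass over the data units per column, which matches the small performance penalty mentioned above.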

> Implement the Hitchhiker erasure coding algorithm
> -------------------------------------------------
>
>                 Key: HDFS-7715
>                 URL: https://issues.apache.org/jira/browse/HDFS-7715
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: jack liuquan
>
> [Hitchhiker | http://www.eecs.berkeley.edu/~nihar/publications/Hitchhiker_SIGCOMM14.pdf]
> is a new erasure coding algorithm developed as a research project at UC Berkeley. It has been
> shown to reduce network traffic and disk I/O by 25%-45% during data reconstruction. This JIRA
> aims to introduce Hitchhiker to the HDFS-EC framework, as one of the pluggable codec algorithms.
> The existing implementation is based on HDFS-RAID. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
