hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "HDFS-RAID" by ScottChen
Date Thu, 28 Oct 2010 01:27:39 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HDFS-RAID" page has been changed by ScottChen.
The comment on this change is: Add contents about ErasureCode.
http://wiki.apache.org/hadoop/HDFS-RAID?action=diff&rev1=3&rev2=4

--------------------------------------------------

   * the RaidNode, a daemon that creates and maintains parity files for all data files stored
in the DRFS,
   * the BlockFixer, which periodically recomputes blocks that have been lost or corrupted,
   * the RaidShell utility, which allows the administrator to manually trigger the recomputation
of missing or corrupt blocks and to check for files that have become irrecoverably corrupted.
+  * the ErasureCode, which provides encoding and decoding of the bytes in blocks
  
  === DRFS client ===
  
@@ -70, +71 @@

  recomputation of bad data blocks and also allows the administrator to display a list of
irrecoverable files (i.e., files for which too
  many data or parity blocks have been lost).
  
+ === ErasureCode ===
+ 
+ (currently under development)
+ 
+ ErasureCode is the underlying component used by BlockFixer and RaidNode to generate parity
blocks and to fix parity/source blocks. 
+ ErasureCode performs encoding and decoding. When encoding, ErasureCode takes 
+ several source bytes and generates some parity bytes. When decoding, ErasureCode regenerates
+ the missing bytes (which can be parity or source bytes) from the remaining source bytes
and parity bytes.
+ 
+ The number of missing bytes that can be recovered is equal to the number of parity bytes created.
For example, if we encode 10 source bytes into 3 parity
+ bytes, we can recover any 3 missing bytes from the other 10 remaining bytes.
+ 
+ There are two kinds of erasure codes implemented in Raid: XOR code and Reed-Solomon code.
The difference between them is that XOR allows creating only one parity
+ byte, whereas Reed-Solomon code allows creating any given number of parity bytes.
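
The XOR case described above can be illustrated with a minimal sketch. This is not the Raid implementation: the function names `xor_encode` and `xor_decode` are hypothetical, and the sketch works on a single stripe of byte values, with `None` marking the missing byte. It only shows why one parity byte is enough to recover exactly one missing byte.

```python
from functools import reduce
from operator import xor


def xor_encode(source: bytes) -> int:
    """Return the single XOR parity byte for the given source bytes (hypothetical helper)."""
    return reduce(xor, source, 0)


def xor_decode(stripe: list) -> list:
    """Recover the one missing entry (None) in a stripe of source bytes plus parity byte.

    The missing byte, whether source or parity, is the XOR of all remaining bytes.
    """
    missing = [i for i, b in enumerate(stripe) if b is None]
    if len(missing) != 1:
        raise ValueError("XOR code can recover exactly one missing byte")
    recovered = reduce(xor, (b for b in stripe if b is not None), 0)
    repaired = list(stripe)
    repaired[missing[0]] = recovered
    return repaired


# Encode four source bytes, lose one of them, then recover it.
source = bytes([0x12, 0x34, 0x56, 0x78])
stripe = list(source) + [xor_encode(source)]
damaged = list(stripe)
damaged[2] = None
assert xor_decode(damaged) == stripe
```

Losing two bytes of the same stripe makes `xor_decode` raise an error, which matches the limit stated above: the number of recoverable bytes equals the number of parity bytes. Recovering more would need a code such as Reed-Solomon, which is not sketched here.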
  
  == Using HDFS RAID ==
  
