incubator-hama-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hama Wiki] Trivial Update of "MatMult" by udanax
Date Fri, 13 Mar 2009 01:49:27 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/hama/MatMult

------------------------------------------------------------------------------
  
  == Multiplication example of file (dense) matrices on HDFS ==
  
- Let's assume we have an 4 by 4 matrices A and B as a sequence file. 
+ Let's assume we have two 4,000 by 4,000 matrices, A and B, each stored as a sequence file.
  
  {{{
    K : !IntWritable
@@ -29, +29 @@

  
  It can be represented as below:
  
-   0 : (0, 0.34), (1, 0.52), (2, 0.12), (3, 0.56)
+   0 : (0, 0.34), (1, 0.52), (2, 0.12), (3, 0.56) ...
-   1 : (0, 0.74), (1, 0.25), (2, 0.44), (3, 0.12)
+   1 : (0, 0.74), (1, 0.25), (2, 0.44), (3, 0.12) ...
    ..
-   3 : (0, 0.24), (1, 0.48), (2, 0.32), (3, 0.46)
+   3999 : (0, 0.24), (1, 0.48), (2, 0.32), (3, 0.46) ...
  }}}
  
- We collect the blocks (sub-matrix) to 'collectionTable' firstly using !MyMapper. It used
to minimize data movement and network cost.
  
+ To multiply two dense matrices A and B, we first collect the blocks into 'collectionTable'
using map/reduce. Rows are named c(i, j) and numbered sequentially as ((N^2 * i) + ((j * N) +
k)) to avoid duplicate records. Each row holds two sub-matrices, a(i, k) and b(k, j), which
minimizes data movement and network cost. Finally, we multiply and sum sequentially.
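
The row numbering above can be illustrated with a minimal Python sketch. It assumes that N is the number of blocks along each dimension of the block grid and that i, j and k all range over 0..N-1; the page does not define N, so treat both as assumptions. The function name `collection_row_key` is hypothetical and is not part of Hama.

```python
def collection_row_key(i: int, j: int, k: int, n: int) -> int:
    """Sequential row number for the pair a(i, k), b(k, j) that contributes to c(i, j).

    Assumption: n is the number of blocks per dimension and 0 <= i, j, k < n.
    """
    if not all(0 <= x < n for x in (i, j, k)):
        raise ValueError("block indices must lie in [0, n)")
    return (n * n * i) + (j * n) + k


# Hypothetical example: both matrices are split into a 2 x 2 grid of blocks.
n = 2
keys = [
    collection_row_key(i, j, k, n)
    for i in range(n)
    for j in range(n)
    for k in range(n)
]
```

Under these assumptions every (i, j, k) triple maps to a distinct number, which is what keeps two block pairs from landing in the same row.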
-  * Map task takes <Row, <Column, Entry>>
-   * Emit <BlockID, !SubVector> along through iterations
  
+ {{{
+ Blocking jobs:
-  * Reduce task gets (BlockID, !SubVector*) 
-   * Merge vectors into a Block
-   * Emit (BlockID, Block)
  
- [http://lh5.ggpht.com/_DBxyBGtfa3g/SXAixuOid_I/AAAAAAAAAr0/w-_KhIMSOC0/s800/mat-mult.PNG]
+ Collect the blocks to 'collectionTable' from A and B.
  
- Finally, We multiply and sum sequentially using !BlockMultiplyMap/Reduce.
+ - A map task receives a row index n as its key and that row's vector as its value
+  - emit (blockID, sub-vector)
+ - A reduce task merges the sub-vectors into blocks, grouped by blockID
+ 
+ Multiplication job:
+ 
+ - A map task receives a blockID n as its key and two sub-matrices, one from A and one from B, as its value
+ - A reduce task computes the sum of the blocks
+ }}}
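
The two jobs outlined above can be simulated in memory with a short Python sketch. This is not Hama code: it uses plain lists instead of HDFS sequence files and 'collectionTable', it assumes square matrices whose size is a multiple of the block size, and every function name (`blocking_map`, `blocking_reduce`, `to_blocks`, `block_multiply`) is hypothetical.

```python
from collections import defaultdict


def blocking_map(row_index, row_vector, block_size):
    """Blocking job, map side: split one row and emit (blockID, sub-vector)."""
    block_row = row_index // block_size
    for start in range(0, len(row_vector), block_size):
        block_id = (block_row, start // block_size)
        # Keep the row offset so the reducer can restore row order.
        yield block_id, (row_index % block_size, row_vector[start:start + block_size])


def blocking_reduce(sub_vectors):
    """Blocking job, reduce side: merge the sub-vectors of one blockID into a block."""
    ordered = sorted(sub_vectors, key=lambda item: item[0])
    return [vector for _, vector in ordered]


def to_blocks(matrix, block_size):
    """Run the simulated blocking job over a whole matrix."""
    grouped = defaultdict(list)
    for row_index, row_vector in enumerate(matrix):
        for block_id, value in blocking_map(row_index, row_vector, block_size):
            grouped[block_id].append(value)
    return {block_id: blocking_reduce(values) for block_id, values in grouped.items()}


def multiply_block(a, b):
    """Dense product of two small blocks."""
    return [
        [sum(a[r][t] * b[t][c] for t in range(len(b))) for c in range(len(b[0]))]
        for r in range(len(a))
    ]


def add_block(x, y):
    """Element-wise sum of two blocks of equal shape."""
    return [[p + q for p, q in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


def block_multiply(a_matrix, b_matrix, block_size):
    """Multiplication job: multiply each pair a(i, k), b(k, j), then sum per c(i, j)."""
    a_blocks = to_blocks(a_matrix, block_size)
    b_blocks = to_blocks(b_matrix, block_size)
    n = len(a_matrix) // block_size  # assumption: size is a multiple of block_size
    result = {}
    for i in range(n):
        for j in range(n):
            # Map side: one partial product per collected pair.
            partials = [multiply_block(a_blocks[(i, k)], b_blocks[(k, j)]) for k in range(n)]
            # Reduce side: sum the partial products for c(i, j).
            total = partials[0]
            for partial in partials[1:]:
                total = add_block(total, partial)
            result[(i, j)] = total
    return result


# Small example: a 4 x 4 matrix times the identity, using 2 x 2 blocks.
a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
identity = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
c_blocks = block_multiply(a, identity, 2)
```

Grouping by blockID in `to_blocks` stands in for the shuffle phase that map/reduce would perform between the map and reduce tasks.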
  
  === See a full example code ===
  
