# incubator-hama-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hama Wiki] Trivial Update of "MatMult" by udanax
Date Fri, 13 Mar 2009 01:49:27 GMT
```Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/hama/MatMult

------------------------------------------------------------------------------

== Multiplication example of file (dense) matrices on HDFS ==

- Let's assume we have an 4 by 4 matrices A and B as a sequence file.
+ Let's assume we have two 4,000 by 4,000 matrices, A and B, stored as sequence files.

{{{
K : !IntWritable
@@ -29, +29 @@

It can be represented as below:

-   0 : (0, 0.34), (1, 0.52), (2, 0.12), (3, 0.56)
+   0 : (0, 0.34), (1, 0.52), (2, 0.12), (3, 0.56) ...
-   1 : (0, 0.74), (1, 0.25), (2, 0.44), (3, 0.12)
+   1 : (0, 0.74), (1, 0.25), (2, 0.44), (3, 0.12) ...
..
-   3 : (0, 0.24), (1, 0.48), (2, 0.32), (3, 0.46)
+   3999 : (0, 0.24), (1, 0.48), (2, 0.32), (3, 0.46) ...
}}}

- We collect the blocks (sub-matrix) to 'collectionTable' firstly using !MyMapper. It used
to minimize data movement and network cost.

+ To multiply two dense matrices A and B, we first collect the blocks into 'collectionTable'
using map/reduce. Rows are named c(i, j) with the sequential number ((N^2 * i) + (j * N) + k)
to avoid duplicated records. Each row holds two sub-matrices, a(i, k) and b(k, j), which
minimizes data movement and network cost. Finally, we multiply and sum sequentially.
-  * Map task takes <Row, <Column, Entry>>
-   * Emit <BlockID, !SubVector> along through iterations

+ {{{
+ Blocking jobs:
-  * Reduce task gets (BlockID, !SubVector*)
-   * Merge vectors into a Block
-   * Emit (BlockID, Block)

- [http://lh5.ggpht.com/_DBxyBGtfa3g/SXAixuOid_I/AAAAAAAAAr0/w-_KhIMSOC0/s800/mat-mult.PNG]
+ Collect the blocks to 'collectionTable' from A and B.

- Finally, We multiply and sum sequentially using !BlockMultiplyMap/Reduce.
+ - A map task receives a row n as a key, and vector of each row as its value
+  - emit (blockID, sub-vector)
+ - Reduce task merges block structures based on the information of blockID
+
+ Multiplication job:
+
+ - A map task receives a blockID n as a key, and two sub-matrices of A and B as its value
+ - Reduce task computes sum of blocks
+ }}}

=== See a full example code ===

```
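The two jobs described in the diff above can be sketched in plain Python. This is not Hama code and uses no Hama or Hadoop API. It is an in-memory illustration under stated assumptions: N is taken to be the number of blocks per dimension, matrices are square with a size divisible by the block size, and every function name (blocking_job, collection_key, build_collection_table, multiplication_job, blocked_multiply) is hypothetical.

```python
from collections import defaultdict


def blocking_job(matrix, block_size):
    """Blocking job: split each row into sub-vectors and merge them into blocks."""
    emitted = defaultdict(list)
    # Map: a task receives row index i and the row vector, and emits
    # (blockID, sub-vector) once per block column.
    for i, row in enumerate(matrix):
        for start in range(0, len(row), block_size):
            block_id = (i // block_size, start // block_size)
            emitted[block_id].append((i % block_size, row[start:start + block_size]))
    # Reduce: merge the sub-vectors of each blockID into one block,
    # ordered by their row position inside the block.
    return {
        block_id: [vec for _, vec in sorted(parts, key=lambda p: p[0])]
        for block_id, parts in emitted.items()
    }


def collection_key(i, j, k, n):
    """Sequential row number (N^2 * i) + (j * N) + k; N = blocks per dimension (assumed)."""
    return (n * n * i) + (j * n) + k


def build_collection_table(a_blocks, b_blocks, n):
    """Pair a(i, k) with b(k, j) under one unique key per (i, j, k)."""
    return {
        collection_key(i, j, k, n): (a_blocks[(i, k)], b_blocks[(k, j)])
        for i in range(n) for j in range(n) for k in range(n)
    }


def multiply_blocks(x, y):
    """Plain dense product of two small blocks."""
    return [[sum(x[r][t] * y[t][c] for t in range(len(y)))
             for c in range(len(y[0]))] for r in range(len(x))]


def multiplication_job(table, n):
    """Multiplication job: multiply each pair, then sum the products per c(i, j)."""
    sums = {}
    for key, (a_ik, b_kj) in table.items():
        # Recover (i, j) from the sequential key; k is not needed for the sum.
        i, j = key // (n * n), (key // n) % n
        product = multiply_blocks(a_ik, b_kj)
        if (i, j) in sums:
            sums[(i, j)] = [[p + q for p, q in zip(r1, r2)]
                            for r1, r2 in zip(sums[(i, j)], product)]
        else:
            sums[(i, j)] = product
    return sums


def blocked_multiply(a, b, block_size):
    """Run both jobs and reassemble the result blocks into a full matrix."""
    size = len(a)
    n = size // block_size
    table = build_collection_table(
        blocking_job(a, block_size), blocking_job(b, block_size), n)
    result = [[0] * size for _ in range(size)]
    for (i, j), block in multiplication_job(table, n).items():
        for r, row in enumerate(block):
            for c, value in enumerate(row):
                result[i * block_size + r][j * block_size + c] = value
    return result
```

Calling blocked_multiply(a, b, 2) on two 4 by 4 lists of lists produces 8 collection rows (N = 2, so keys 0 through 7), one per (i, j, k) triple. Because each key encodes all three indices, no two block pairs collide, which is the "avoid duplicated records" property the wiki text mentions.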