hama-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hama Wiki] Update of "SpMV" by Mikalai Parafeniuk
Date Thu, 19 Jul 2012 09:29:59 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The "SpMV" page has been changed by Mikalai Parafeniuk:
http://wiki.apache.org/hama/SpMV?action=diff&rev1=6&rev2=7

   3. Fanin.
  In the Fanout phase every processor gets the components of v it needs. In the local computation phase each processor computes its local contribution to the result vector. In the Fanin phase all local contributions are sent to the owner of the corresponding u component. Most of the effort should go into choosing a matrix and vector distribution that reduces the communication volume of the Fanout and Fanin phases. As the base distribution I propose a Cartesian (column mappings do not depend on row mappings, and vice versa) block-cyclic distribution with cyclic distribution of the matrix diagonal. I also assume that distr(u) != distr(v), which gives us more freedom in optimising the vector distribution. This type of distribution has the following advantages: it is simple; in Fanin, communication is needed only within a processor column; in Fanout, only within a processor row; and we can easily predict the performance of the algorithm. After the matrix distribution we distribute the vector greedily: each processor grabs components until it reaches its optimum state. After that stage some vector components (nearly 10%) may remain unassigned; we distribute them greedily as well. To support efficient local computation, the data structure used should provide the indices of the rows and columns that contain non-zero items, and local computation must be performed with local indices.
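  To make this concrete, here is a minimal sketch of owner lookup under such a Cartesian block-cyclic scheme (illustrative only, not existing code; the class name, grid shape, and block size are assumptions). The owning processor row depends only on the matrix row, and the owning processor column only on the matrix column:
{{{
/* Sketch of owner lookup for a Cartesian block-cyclic distribution
 * on a procRows x procCols processor grid. Grid shape and block size
 * in main() are illustrative assumptions. */
public class CartesianDistribution {
    private final int procRows, procCols, blockSize;

    public CartesianDistribution(int procRows, int procCols, int blockSize) {
        this.procRows = procRows;
        this.procCols = procCols;
        this.blockSize = blockSize;
    }

    /** Processor row owning matrix row i: depends on i only. */
    public int ownerRow(int i) { return (i / blockSize) % procRows; }

    /** Processor column owning matrix column j: depends on j only. */
    public int ownerCol(int j) { return (j / blockSize) % procCols; }

    /** Linear rank of the processor owning cell (i, j). */
    public int owner(int i, int j) {
        return ownerRow(i) * procCols + ownerCol(j);
    }

    public static void main(String[] args) {
        CartesianDistribution d = new CartesianDistribution(2, 3, 4);
        System.out.println(d.owner(10, 17)); // prints 1: grid cell (0, 1)
    }
}
}}}
  Because ownerRow(i) ignores j and ownerCol(j) ignores i, Fanin for a given u component stays inside one processor column and Fanout for a given v component stays inside one processor row, which is exactly the property claimed above.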
  
+ === Dealing with large matrices ===
+ The current code contains classes that keep the matrix in memory, so the algorithm will fail on large matrices. I therefore propose some steps to adapt the SpMV algorithm to large matrices. First of all, a simple matrix format backed by the file system will be created; let's call this class FileMatrix. This format will give the following possibilities (see the sketch after this list):
+  1. we can set a matrix cell, and it will be appended to the file without any checks.
+  2. we can iterate through the entire file to get all matrix cells.
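+ A minimal single-machine sketch of such a class could look as follows (plain java.io is used for brevity; the real class would go through the HDFS API, and all names here are assumptions):
{{{
import java.io.*;

/* Sketch of the proposed FileMatrix: cells are appended to a file with
 * no checks, and can only be read back by a full sequential scan, which
 * matches the write-once / sequential-read strengths of HDFS. */
public class FileMatrix {
    private final File file;

    public FileMatrix(File file) { this.file = file; }

    /** Possibility 1: append a cell to the file, without any check. */
    public void setCell(int row, int col, double value) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new FileOutputStream(file, true))) { // append mode
            out.writeInt(row);
            out.writeInt(col);
            out.writeDouble(value);
        }
    }

    /** Possibility 2: iterate through the entire file, visiting all cells. */
    public void forEachCell(CellVisitor visitor) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            while (in.available() > 0) {
                visitor.visit(in.readInt(), in.readInt(), in.readDouble());
            }
        }
    }

    /** On-disk size, used later to decide whether a chunk fits in memory. */
    public long sizeInBytes() { return file.length(); }

    public interface CellVisitor {
        void visit(int row, int col, double value);
    }
}
}}}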
+ These constraints were chosen because it is hard to imagine how we could efficiently implement some matrix operations, for example getting a cell with a specified index. At the same time, the constraints match those of HDFS (large block size, data written once and read many times, fast sequential reading of an entire file). Creating such a class won't take much time, and it will make it possible to store and read large matrices.
+ 
+ The memory-consumption bottleneck in the current algorithm is the matrix distribution phase, where the array of local matrices is stored in memory. We can correct this as follows: the input matrix is stored in a file; we then iterate through it and map its components to local matrices, also represented as FileMatrix instances. From now on we won't store the array of local matrices in memory; at this step we assume that a matrix taken from a local file fits into the memory of its local processor.
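+ The distribution step then becomes one sequential pass over the input file, appending each cell to the FileMatrix of its owning processor. A sketch, reusing the FileMatrix and CartesianDistribution classes sketched above (class and file names are assumptions):
{{{
import java.io.*;

/* Sketch of the distribution phase: stream the input FileMatrix once and
 * append every cell to the local FileMatrix of its owning processor, so
 * no array of in-memory local matrices is ever built. */
public class MatrixDistributor {
    public static FileMatrix[] distribute(FileMatrix input,
                                          CartesianDistribution dist,
                                          int numProcessors,
                                          File workDir) throws IOException {
        FileMatrix[] local = new FileMatrix[numProcessors];
        for (int p = 0; p < numProcessors; p++) {
            local[p] = new FileMatrix(new File(workDir, "local-" + p + ".mat"));
        }
        // One sequential pass; each cell lands in exactly one local file.
        input.forEachCell((row, col, value) -> {
            try {
                local[dist.owner(row, col)].setCell(row, col, value);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        return local;
    }
}
}}}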
+ Even this can be improved. When a local matrix can't fit into local processor memory, we will invoke the local SpMV algorithm on parts of the matrix. I propose to create a class which implements the Mapper interface from the linearalgebra package and splits the local matrix recursively into chunks, represented as FileMatrix instances, until each chunk fits into local memory. I will call this process two-phase mapping. After these steps we will avoid storing the entire matrix in memory: we only assume that the matrix fits into the combined space of the hard drives and that the components of the local vector fit into memory. Two-phase mapping can also be used in RandomMatrixGenerator for large matrices.
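+ A sketch of the splitting half of two-phase mapping follows. The split rule (alternating row/column parity) and the size threshold are illustrative assumptions; a real implementation would plug into the Mapper interface and would need a split rule that guarantees every chunk actually shrinks:
{{{
import java.io.*;
import java.util.*;

/* Sketch: recursively split a FileMatrix into chunks until each chunk
 * is small enough to fit into local memory; local SpMV can then be run
 * chunk by chunk, accumulating into the local result vector. */
public class TwoPhaseMapper {
    public static List<FileMatrix> split(FileMatrix m, long maxBytes,
                                         File workDir, int depth)
            throws IOException {
        List<FileMatrix> chunks = new ArrayList<>();
        if (m.sizeInBytes() <= maxBytes) {
            chunks.add(m); // fits into memory: this chunk is done
            return chunks;
        }
        // Phase 1: one pass that maps each cell to one of two halves,
        // alternating between row parity and column parity by depth.
        FileMatrix lo = new FileMatrix(File.createTempFile("chunk", ".mat", workDir));
        FileMatrix hi = new FileMatrix(File.createTempFile("chunk", ".mat", workDir));
        m.forEachCell((row, col, value) -> {
            int key = (depth % 2 == 0) ? row : col;
            try {
                (key % 2 == 0 ? lo : hi).setCell(row, col, value);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        // Phase 2: recurse into each half until everything fits.
        chunks.addAll(split(lo, maxBytes, workDir, depth + 1));
        chunks.addAll(split(hi, maxBytes, workDir, depth + 1));
        return chunks;
    }
}
}}}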
+ 
  === Possible improvements ===
   1. Implementation of the Mondrian distribution. In most cases it gives better results than the block-cyclic Cartesian scheme.
   2. Implement an algorithm for detecting matrix sparsity patterns. This would give us the possibility to determine, for example, whether a matrix is a random sparse matrix or a Laplacian matrix. For random sparse matrices we can use distribution patterns that are independent of the sparsity pattern; for Laplacian matrices a diamond distribution can give better results.
