incubator-hama-dev mailing list archives

From "Edward J. Yoon" <>
Subject Re: Difference between sparse* and dense*
Date Wed, 18 Mar 2009 13:21:31 GMT
Oh, good point.

HBase seems a good fit for huge sparse matrices.

- Non-zero value
- Index for row and column

However, it's too much for a dense matrix. IMO, we can't store a huge
dense matrix in HBase. When I stored a 5000 * 5000 double matrix with
row/column/time indexes in HBase, 15~16 GB was used on each node
(replica = 3). So I made two implementations.  We should survey about data
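
For comparison, here is a rough sketch of the raw payload size for the 5000 * 5000 double matrix mentioned above. It counts only the 8-byte values and the replica factor of 3; the class and method names are made up for illustration and are not part of Hama or HBase. It does not try to model per-cell index overhead.

```java
public class DenseStorageEstimate {
    // Raw payload in bytes for a rows x cols matrix of 8-byte doubles.
    // This ignores any row/column/time index stored alongside each value.
    static long rawBytes(long rows, long cols) {
        return rows * cols * Double.BYTES;
    }

    public static void main(String[] args) {
        long raw = rawBytes(5000, 5000);   // values only
        long replicated = raw * 3;         // assuming replica = 3
        System.out.println("raw bytes: " + raw);
        System.out.println("with 3 replicas: " + replicated);
    }
}
```

The values alone come to 200,000,000 bytes (600,000,000 bytes with three replicas), which is well below the 15~16 GB per node reported above.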

Also, there is a difference in algorithms/benefits between Dense
and Sparse.

- The blocking algorithm only works for Dense Matrix, which stores all values.
- Sparse Matrix stores only non-zero values (storage efficient), but if
sparsity is low, manipulations will have some overhead from irregular
access over the network.
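
To illustrate the storage difference in the two points above, here is a minimal in-memory sketch. It is not the Hama DenseMatrix/SparseMatrix code; the class name, the "row,col" string key, and the HashMap are assumptions chosen only to show the idea of keeping non-zero values together with their row and column index.

```java
import java.util.HashMap;
import java.util.Map;

public class SparseVsDense {
    // Dense layout: every entry is stored, zero or not.
    static long denseCells(double[][] m) {
        long n = 0;
        for (double[] row : m) {
            n += row.length;
        }
        return n;
    }

    // Sparse layout: only non-zero entries, keyed by "row,col".
    static Map<String, Double> toSparse(double[][] m) {
        Map<String, Double> cells = new HashMap<>();
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                if (m[i][j] != 0.0) {
                    cells.put(i + "," + j, m[i][j]);
                }
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        double[][] m = {
            {0, 0, 3},
            {0, 0, 0},
            {1, 0, 0}
        };
        System.out.println("dense cells: " + denseCells(m));
        System.out.println("sparse cells: " + toSparse(m).size());
    }
}
```

For this 3 * 3 example the dense layout keeps 9 cells and the sparse one keeps 2, but each sparse cell also carries its index, and lookups go through the map instead of a regular array offset.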

I've started work on the documentation -- -- Please also review this.

On Wed, Mar 18, 2009 at 8:24 PM, Samuel Guo <> wrote:
> Hi all,
> It seems that DenseVector and SparseVector both use *MapWritable* as the
> container of vector data. And the method implementations of DenseVector &
> SparseVector are similar, so why do we need two copies of the code?
> There are the same issues in DenseMatrix and SparseMatrix.
> Regards,
> Samuel

Best Regards, Edward J. Yoon
