hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Difference between sparse* and dense*
Date Thu, 19 Mar 2009 08:57:39 GMT
If you have any ideas, please let us know.

On Wed, Mar 18, 2009 at 10:21 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> Oh, good point.
>
> HBase seems a good fit for huge sparse matrices. It stores only:
>
> - Non-zero value
> - Index for row and column
>
> However, it is overkill for dense matrices. IMO, we can't store a huge
> dense matrix in HBase. When I stored a 5000 * 5000 double matrix with
> row/column/time indexes in HBase, 15~16 GB was used on each node
> (replica = 3). So I made two implementations. We should survey data
> structures.
>
> There is also a difference in algorithms and benefits between Dense
> and Sparse.
>
> - The blocking algorithm only works for Dense Matrix, which stores all
> values.
> - Sparse Matrix stores only non-zero values (storage efficient), but if
> sparsity is low, manipulations will have some overhead from irregular
> access over the network.
>
> I've started work on the documentation --
> http://wiki.apache.org/hama/Architecture -- Please also review this.
>
> On Wed, Mar 18, 2009 at 8:24 PM, Samuel Guo <guosijie@gmail.com> wrote:
>> Hi all,
>>
>> It seems that DenseVector and SparseVector both use *MapWritable* as the
>> container for vector data, and the method implementations of DenseVector &
>> SparseVector are similar. So why do we need two copies of the code?
>>
>> The same issue exists in DenseMatrix and SparseMatrix.
>>
>> Regards,
>> Samuel
>>
>
>
>
> --
> Best Regards, Edward J. Yoon
> edwardyoon@apache.org
> http://blog.udanax.org
>
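
To put the figures quoted above in perspective, here is a rough sketch of the arithmetic, along with the "store only non-zero values" idea. The class and method names are hypothetical and are not part of Hama or HBase. It assumes 8-byte doubles and reads "15 GB" as 15 * 10^9 bytes. The thread does not say how many nodes were in the cluster, so the per-cell figure covers one node's share only and is a lower bound on the total.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not Hama or HBase code. */
class MatrixStorageSketch {

    /** Raw size of a dense rows x cols matrix of 8-byte doubles. */
    static long rawDenseBytes(long rows, long cols) {
        return rows * cols * Double.BYTES;
    }

    /** Average bytes used per matrix cell for an observed storage size. */
    static long observedBytesPerCell(long observedBytes, long rows, long cols) {
        return observedBytes / (rows * cols);
    }

    /** Sparse storage: keep only the non-zero entries, keyed by (row, column). */
    static Map<Long, Double> toSparse(double[][] dense) {
        Map<Long, Double> sparse = new HashMap<>();
        for (int i = 0; i < dense.length; i++) {
            for (int j = 0; j < dense[i].length; j++) {
                if (dense[i][j] != 0.0) {
                    // Pack the row and column index into a single long key.
                    sparse.put(((long) i << 32) | j, dense[i][j]);
                }
            }
        }
        return sparse;
    }

    public static void main(String[] args) {
        // 5000 * 5000 cells * 8 bytes each.
        System.out.println("raw dense bytes: " + rawDenseBytes(5000, 5000));

        // Assumption: 15 GB on one node, taken as 15 * 10^9 bytes.
        System.out.println("observed bytes per cell on one node: "
                + observedBytesPerCell(15_000_000_000L, 5000, 5000));

        // A 4 x 4 matrix with two non-zero values needs two sparse entries.
        double[][] m = new double[4][4];
        m[0][1] = 2.5;
        m[3][3] = -1.0;
        System.out.println("dense cells: 16, sparse entries: " + toSparse(m).size());
    }
}
```

Under these assumptions the raw matrix is about 200 MB, while the observed usage works out to roughly 600 bytes per cell on a single node. HBase stores the row key, column family, qualifier and timestamp alongside every cell value, which is one plausible source of that overhead when every cell of a dense matrix is written individually.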



-- 
Best Regards, Edward J. Yoon
edwardyoon@apache.org
http://blog.udanax.org
