mahout-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vincent Xue <xue....@gmail.com>
Subject Re: An inmemory sparse matrix multiplier
Date Sun, 26 Jun 2011 22:13:01 GMT
NonZero Elements: >1,000,000
Input Matrix Size: 30,000 x 480,000 and 480,000 x 30,000

Reading a matrix is dependent on how many non zero elements there are.
To read each element of the matrix, including non zero, would probably
take just as long as it currently does using the
SparseMatrix.getQuick().

I will test this implementation using larger, denser matrices. So far,
I have been using it only with my design matrix but I will generate
some denser random matrices and test the performance soon.

Vincent

On Sun, Jun 26, 2011 at 10:29 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> Regarding speed:
>
> How many non-zero elements?
>
> What is the size of your input matrices?
>
> How long does it take to read the matrices without doing any multiplication?
>
> Your test matrices seem small for big sparse matrices.
>
> This sort of thing could be very useful.
>
> On Sun, Jun 26, 2011 at 1:47 PM, Vincent Xue <xue.vin@gmail.com> wrote:
>
>> Hi.
>>
>> I was wondering how useful an in memory sparse matrix multiplier would
>> be for mahout. In my current project I needed to multiply many large
>> sparse matrices but submitting hundreds of jobs to the cluster creates
>> too much overhead.
>>
>> I wrote up an implementation of sparse matrix multiplication using
>> threads which can multiply a 30,000 x 48,0000 matrix by its transpose
>> in about 5 minutes using 16 cores. Granted this matrix is composed
>> mostly of 1s, and -1s, (with about 16 elements per row), is this
>> considered fast? I have seen that my implementation is much faster
>> than iterating though a matrix naively and would like some input to
>> whether or not my 5 minute benchmark is by skewed.
>>
>> Many thanks for the input,
>> Vincent
>>
>

Mime
View raw message