incubator-hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Large matrices multiplication problem.
Date Tue, 06 Jan 2009 05:11:17 GMT
Oh, sorry. The collection table size is 8 GB, not 4 GB.
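
A quick way to check the 8 GB figure for the collection table described in the quoted message below. This is only a sketch: the 8-byte entry size, the "one a block plus one b block per row" layout, and the (i, j, k) row ordering are assumptions read off the example rows, not something the thread states. With 4-byte entries the same arithmetic gives 16 MB per block and 4 GB in total.

```python
# Back-of-the-envelope size of the collection table.
# Assumptions (not stated in the thread): 8-byte double entries,
# and each row stores one block of a plus one block of b.
BYTES_PER_ENTRY = 8      # assumed: double precision
MATRIX_DIM = 10_000      # a and b are 10,000 * 10,000
BLOCKS_PER_DIM = 5       # 5 * 5 blocks

block_dim = MATRIX_DIM // BLOCKS_PER_DIM               # 2000
block_bytes = block_dim * block_dim * BYTES_PER_ENTRY  # one 2000 * 2000 block
rows = BLOCKS_PER_DIM ** 3                             # one row per (i, j, k)
table_bytes = rows * 2 * block_bytes                   # two blocks per row


def task(index, n=BLOCKS_PER_DIM):
    """Decode a row index into (i, j, k) for c(i, k) += a(i, j) * b(j, k).

    Assumes rows are ordered as index = i * n * n + j * n + k.
    """
    i, rest = divmod(index, n * n)
    j, k = divmod(rest, n)
    return i, j, k


print(block_bytes / 1e6, "MB per block")
print(table_bytes / 1e9, "GB in total")
print(task(123))  # (i, j, k) for row 123
```

Row 123 decodes to i = 4, j = 4, k = 3, which matches the quoted line c(4, 3) += a(4, 4) * b(4, 3).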

On Tue, Jan 6, 2009 at 2:05 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> Let's assume a * b, where a and b are 10,000 * 10,000 dense matrices,
>
> 5 * 5 blocks,
> 1 block is 2000 * 2000 and 16 MB,
>
> 0 : c(0, 0) += a(0, 0) * b(0, 0)
> 1 : c(0, 1) += a(0, 0) * b(0, 1)
> ...
> 123 : c(4, 3) += a(4, 4) * b(4, 3)
> 124 : c(4, 4) += a(4, 4) * b(4, 4)
>
> 5^3 * 32 MB = 4 GB.
>
> The collection table size is 4 GB. Anyway, let's try it.
>
> On Tue, Jan 6, 2009 at 12:37 PM, Samuel Guo <guosijie@gmail.com> wrote:
>> +1
>> hmm, it is tricky.
>>
>> On Tue, Jan 6, 2009 at 11:04 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>>
>>> If we collect the blocks into one table during blocking_mapred(), we
>>> get data locality and the multiplication will be faster.
>>>
>>> row Key   column:A   column:B
>>> c(0, 0) += a(0, 0) * b(0, 0)
>>> c(0, 0) += a(0, 1) * b(1, 0)
>>> c(0, 0) += a(0, 2) * b(2, 0)
>>> c(0, 0) += a(0, 3) * b(3, 0)
>>> c(0, 1) += a(0, 0) * b(0, 1)
>>> c(0, 1) += a(0, 1) * b(1, 1)
>>> ...
>>>
>>> What do you think?
>>>
>>> On Mon, Jan 5, 2009 at 10:30 AM, Edward J. Yoon <edwardyoon@apache.org>
>>> wrote:
>>> > Hama trunk fails on large matrix multiplication with
>>> > mapred.task.timeout and scanner.timeout exceptions. I tried a 1,000,000 *
>>> > 1,000,000 matrix multiplication on 100 nodes. (The rest works fine.)
>>> >
>>> > To reduce duplicate reads of the same block, I came up with the approach
>>> > described below. But the work done by each map task seems too large.
>>> >
>>> > ----
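>>> > // c[i][k] += a[i][j] * b[j][k];
>>> >
>>> > map() {
>>> >  // value holds one block a[i][j]
>>> >  SubMatrix a = value.get();
>>> >
>>> >  // scan the blocks b[j][k] and emit one partial product per block
>>> >  for (RowResult row : scan) {
>>> >     collect : c[i][k] = a * b[j][k];
>>> >  }
>>> > }
>>> >
>>> > reduce() {
>>> >  // sum the partial products collected for each c[i][k]
>>> >  c[i][k] += c[i][k];
>>> > }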
>>> > ----
>>> >
>>> > Should we increase mapred.task.timeout and scanner.timeout?
>>> > Or does anyone have a better idea?
>>> >
>>> > --
>>> > Best Regards, Edward J. Yoon @ NHN, corp.
>>> > edwardyoon@apache.org
>>> > http://blog.udanax.org
>>> >
>>
>
>
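
To make the quoted scheme concrete, here is a minimal single-process sketch of it in plain Python: collect the block pairs into one table, multiply each pair in a "map" step, and sum the partial products per c(i, k) in a "reduce" step. It is not Hama code. The function names (split, collect, map_phase, reduce_phase, blocked_multiply) are hypothetical, and matrices are small lists of lists purely for illustration.

```python
def split(m, n):
    """Split a square matrix (list of lists) into an n x n grid of blocks."""
    s = len(m) // n
    return {(r, c): [row[c * s:(c + 1) * s] for row in m[r * s:(r + 1) * s]]
            for r in range(n) for c in range(n)}


def matmul(x, y):
    """Plain dense product of two small matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]


def add(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(x, y)]


def collect(a, b, n):
    """Build the collection table: one row per (i, j, k), keyed by (i, k),
    holding block a(i, j) and block b(j, k) side by side."""
    a_blocks, b_blocks = split(a, n), split(b, n)
    return [((i, k), a_blocks[i, j], b_blocks[j, k])
            for i in range(n) for j in range(n) for k in range(n)]


def map_phase(table):
    """Emit one partial product per table row."""
    for key, a_blk, b_blk in table:
        yield key, matmul(a_blk, b_blk)


def reduce_phase(pairs):
    """Sum the partial products that share the same c(i, k) key."""
    c = {}
    for key, partial in pairs:
        c[key] = add(c[key], partial) if key in c else partial
    return c


def blocked_multiply(a, b, n):
    return reduce_phase(map_phase(collect(a, b, n)))


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
c_blocks = blocked_multiply(a, b, 2)

# The blocked result agrees with the direct product, block by block.
assert c_blocks == split(matmul(a, b), 2)
```

Because each table row already carries both operands, a map task needs no extra scan of b, which is the locality benefit discussed above. The cost is that every block is stored n times, which is where the table size estimate comes from.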



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org
