After the HAMA142 commit, I finally completed the multiplication of
two 10,000 * 10,000 dense matrices. I am pleased with this result.
However, a lot of bytes are still sent and received over the network
between the master and the slaves, and the read operations inside the
loop add overhead during multiplication.
BTW, a blocked dense matrix has only a few rows, so the table is not
spread horizontally across the machines:
09/01/07 17:36:14 INFO mapred.TableInputFormatBase: split:
0>d8g053.nhncorp.com:,000000000000,0,10
09/01/07 17:36:14 INFO mapred.TableInputFormatBase: split:
1>d8g053.nhncorp.com:000000000000,0,10,
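
To put the network traffic in perspective, here is a minimal sketch of the 5 * 5 block schedule and the collected table size discussed in the quoted messages below. It assumes 8-byte double elements (the thread does not state the element type) and that each collected row holds one a block and one b block. BlockPlan and its methods are hypothetical names, not part of Hama.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not Hama code): enumerates block tasks and sizes the collected table. */
final class BlockPlan {

    /** Lists every (i, j, k) task c(i, k) += a(i, j) * b(j, k) for an n * n block grid. */
    static List<int[]> tasks(int n) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    out.add(new int[] {i, j, k});
                }
            }
        }
        return out;
    }

    /** Formats a task in the notation used in the thread. */
    static String describe(int[] t) {
        return String.format("c(%d, %d) += a(%d, %d) * b(%d, %d)",
                t[0], t[2], t[0], t[1], t[1], t[2]);
    }

    /** Bytes in one square block; bytesPerElement is an assumption (8 for double). */
    static long blockBytes(int blockDim, int bytesPerElement) {
        return (long) blockDim * blockDim * bytesPerElement;
    }

    /** Bytes in the collected table: n^3 rows, each holding one a block and one b block. */
    static long collectedBytes(int n, int blockDim, int bytesPerElement) {
        return (long) n * n * n * 2L * blockBytes(blockDim, bytesPerElement);
    }

    public static void main(String[] args) {
        List<int[]> tasks = tasks(5);
        System.out.println(tasks.size() + " tasks");
        System.out.println("1 : " + describe(tasks.get(1)));
        System.out.println("123 : " + describe(tasks.get(123)));
        System.out.println(collectedBytes(5, 2000, 8) + " bytes collected");
    }
}
```

Under these assumptions one 2000 * 2000 block is 32,000,000 bytes, and the 125 collected rows add up to 8,000,000,000 bytes, which is in line with the 8 GB correction below.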
/Edward
On Tue, Jan 6, 2009 at 2:11 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> Oh, sorry. It's 8 GB.
>
> On Tue, Jan 6, 2009 at 2:05 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>> Let's assume a * b, where a and b are 10,000 * 10,000 dense matrices,
>>
>> 5 * 5 blocks,
>> 1 block is 2000 * 2000 and 16 MB,
>>
>> 0 : c(0, 0) += a(0, 0) * b(0, 0)
>> 1 : c(0, 1) += a(0, 0) * b(0, 1)
>> ...
>> 123 : c(4, 3) += a(4, 4) * b(4, 3)
>> 124 : c(4, 4) += a(4, 4) * b(4, 4)
>>
>> 5^3 * 32 MB = 4 GB.
>>
>> The collection table size is 4 GB. Anyway, let's try it.
>>
>> On Tue, Jan 6, 2009 at 12:37 PM, Samuel Guo <guosijie@gmail.com> wrote:
>>> +1
>>> hmm, it is tricky.
>>>
>>> On Tue, Jan 6, 2009 at 11:04 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>>>
>>>> If we collect the blocks into one table during blocking_mapred(), we
>>>> get locality and the multiplication will be faster.
>>>>
>>>> row Key       column:A    column:B
>>>> c(0, 0)  +=   a(0, 0)  *  b(0, 0)
>>>> c(0, 0)  +=   a(0, 1)  *  b(1, 0)
>>>> c(0, 0)  +=   a(0, 2)  *  b(2, 0)
>>>> c(0, 0)  +=   a(0, 3)  *  b(3, 0)
>>>> c(0, 1)  +=   a(0, 0)  *  b(0, 1)
>>>> c(0, 1)  +=   a(0, 1)  *  b(1, 1)
>>>> ...
>>>>
>>>> What do you think?
>>>>
>>>> On Mon, Jan 5, 2009 at 10:30 AM, Edward J. Yoon <edwardyoon@apache.org>
>>>> wrote:
>>>> > Hama trunk fails on large matrix multiplications with
>>>> > mapred.task.timeout and scanner.timeout exceptions. I tried a
>>>> > 1,000,000 * 1,000,000 matrix multiplication on 100 nodes. (The rest
>>>> > works fine.)
>>>> >
>>>> > To reduce repeated reads of the same block, I thought of the approach
>>>> > described below. However, the work done by each map seems too large.
>>>> >
>>>> > 
>>>> > // c[i][k] += a[i][j] * b[j][k];
>>>> >
>>>> > map() {
>>>> >   SubMatrix a = value.get();
>>>> >
>>>> >   for (RowResult row : scan) {
>>>> >     collect : c[i][k] = a * b[j][k];
>>>> >   }
>>>> > }
>>>> >
>>>> > reduce() {
>>>> >   c[i][k] += c[i][k];
>>>> > }
>>>> > 
>>>> >
>>>> > Should we increase mapred.task.timeout and scanner.timeout?
>>>> > Or does anyone have a better idea?
>>>> >
>>>> > 
>>>> > Best Regards, Edward J. Yoon @ NHN, corp.
>>>> > edwardyoon@apache.org
>>>> > http://blog.udanax.org
>>>> >

Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org
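
The map()/reduce() pseudocode quoted earlier in this thread can be tried out in memory with plain Java. This is only a sketch under stated assumptions: square matrices whose dimension is divisible by the block size, no Hadoop or HBase API involved, and BlockedMultiply with all of its methods being hypothetical names. The inner loop plays the role of map (one a block is read once and reused for every b block in the matching block row), and Map.merge plays the role of reduce.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory sketch of c[i][k] += a[i][j] * b[j][k] over blocks. */
final class BlockedMultiply {

    /** Copies the size x size block at block coordinates (bi, bj) out of m. */
    static double[][] block(double[][] m, int bi, int bj, int size) {
        double[][] out = new double[size][size];
        for (int r = 0; r < size; r++) {
            System.arraycopy(m[bi * size + r], bj * size, out[r], 0, size);
        }
        return out;
    }

    /** Dense product of two square matrices of equal size. */
    static double[][] multiply(double[][] x, double[][] y) {
        int n = x.length;
        double[][] out = new double[n][n];
        for (int r = 0; r < n; r++) {
            for (int t = 0; t < n; t++) {
                for (int c = 0; c < n; c++) {
                    out[r][c] += x[r][t] * y[t][c];
                }
            }
        }
        return out;
    }

    /** Element-wise sum of two blocks: the "reduce" step. */
    static double[][] add(double[][] x, double[][] y) {
        int n = x.length;
        double[][] out = new double[n][n];
        for (int r = 0; r < n; r++) {
            for (int c = 0; c < n; c++) {
                out[r][c] = x[r][c] + y[r][c];
            }
        }
        return out;
    }

    /** Multiplies a * b block by block, summing partial products per (i, k) key. */
    static double[][] blockedProduct(double[][] a, double[][] b, int size) {
        int blocks = a.length / size; // assumes a.length % size == 0
        Map<String, double[][]> c = new HashMap<>();
        for (int i = 0; i < blocks; i++) {
            for (int j = 0; j < blocks; j++) {
                double[][] aBlock = block(a, i, j, size); // read once, reused for all k
                for (int k = 0; k < blocks; k++) {
                    double[][] partial = multiply(aBlock, block(b, j, k, size));
                    c.merge(i + "," + k, partial, BlockedMultiply::add);
                }
            }
        }
        double[][] out = new double[a.length][a.length];
        for (int i = 0; i < blocks; i++) {
            for (int k = 0; k < blocks; k++) {
                double[][] blk = c.get(i + "," + k);
                for (int r = 0; r < size; r++) {
                    System.arraycopy(blk[r], 0, out[i * size + r], k * size, size);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}, {13, 14, 15, 16}};
        double[][] b = {{2, 0, 1, 0}, {0, 1, 0, 3}, {4, 0, 1, 0}, {0, 5, 0, 1}};
        boolean same = Arrays.deepEquals(blockedProduct(a, b, 2), multiply(a, b));
        System.out.println("blocked result matches plain product: " + same);
    }
}
```

In a real job each partial product would be emitted by a map task and summed in a reducer, so the per-map cost grows with the number of b blocks scanned for each a block.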
