incubator-hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: blocking_mapred() speed
Date Thu, 11 Dec 2008 08:44:54 GMT
> I think 1) may be better than 2).
> An InputFormat can get the locality of a range of the table and let MR know
> how to move the MR computations close to it.
> In 2), if we do it like RandomMatrixMap, we may lose some locality
> information about the table, so the network transfer overhead may
> increase.

Yes, I agree with you.
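
To make the "no block divided across mappers" constraint in 1) concrete, here is a minimal sketch of how split boundaries could be aligned to block row boundaries. It is plain Java with no Hadoop or HBase dependency, and the names (BlockAlignedSplits, RowRange, blocksPerSplit) are hypothetical, not part of Hama's API. It also assumes every block has the same number of rows, which may not hold for the real matrix table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes row ranges whose boundaries always fall on
// block row boundaries, so a block is never shared between two mappers.
public class BlockAlignedSplits {

    /** Half-open row range [startRow, endRow) handed to a single mapper. */
    public static final class RowRange {
        public final long startRow;
        public final long endRow;

        RowRange(long startRow, long endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + endRow + ")";
        }
    }

    /**
     * @param totalRows      number of rows in the matrix table
     * @param blockRows      rows per block (assumed uniform)
     * @param blocksPerSplit how many block rows one mapper should process
     */
    public static List<RowRange> split(long totalRows, long blockRows, long blocksPerSplit) {
        if (totalRows < 0 || blockRows <= 0 || blocksPerSplit <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Each split covers a whole number of blocks.
        long rowsPerSplit = Math.multiplyExact(blockRows, blocksPerSplit);
        List<RowRange> splits = new ArrayList<>();
        for (long start = 0; start < totalRows; start += rowsPerSplit) {
            // Only the last split may be shorter, when the matrix ends early.
            long end = Math.min(start + rowsPerSplit, totalRows);
            splits.add(new RowRange(start, end));
        }
        return splits;
    }

    public static void main(String[] args) {
        // 10 rows, 2 rows per block, 2 block rows per mapper.
        for (RowRange r : split(10, 2, 2)) {
            System.out.println(r);
        }
    }
}
```

A real InputFormat would additionally have to map each row range onto the region that hosts it so that the locality hint is preserved; that part is not shown here.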

On Thu, Dec 11, 2008 at 5:30 PM, Samuel Guo <guosijie@gmail.com> wrote:
> On Thu, Dec 11, 2008 at 2:36 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>
>> If we remove the reduce phase, I guess we can reduce the disk I/O operations.
>
>
> Yes.
>
>
>>
>>
>> In the map, read { Constants.BLOCK_STARTROW, Constants.BLOCK_ENDROW,
>> Constants.BLOCK_STARTCOLUMN, Constants.BLOCK_ENDCOLUMN } instead of {
>> Constants.COLUMN }, and write the blocks directly.
>
>
> Two methods to be considered:
> 1) We need an InputFormat that partitions the matrix table according to the
> row boundaries of the blocks.
>    This must be done carefully so that a single block is not divided
> across two or more mappers.
>
> 2) Like what RandomMatrixMap does, we just tell the mappers the row/column
> boundaries of the blocks of a matrix table.
>    Scanning that portion of the table is then done in the mapper.
>
> I think 1) may be better than 2).
> An InputFormat can get the locality of a range of the table and let MR know
> how to move the MR computations close to it.
> In 2), if we do it like RandomMatrixMap, we may lose some locality
> information about the table, so the network transfer overhead may
> increase.
>
> These are just my guesses and thoughts.
>
>
>>
>>
>> What do you think?
>>
>> --
>> Best Regards, Edward J. Yoon @ NHN, corp.
>> edwardyoon@apache.org
>> http://blog.udanax.org
>>
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org
