hadoop-common-user mailing list archives

From aanghelescu <axanghele...@gmail.com>
Subject providing the same input to more than one Map task
Date Fri, 22 Apr 2011 21:33:41 GMT

Hi all,

I am trying to perform matrix-vector multiplication using Hadoop. 

I have matrix M in one file and vector v in another. The two files are
of different sizes. Is it possible to make each Map task receive the whole
vector v along with a chunk of matrix M? I know what my map and reduce
functions should look like, but I don't know how to format the input.

Basically I want my map function to output key-value pairs (i, M[i,j]*v[j]),
where i is the row number, j is the column number, and v[j] is the jth element
of v. The reduce function will then sum all the values with the same key i,
and that sum will be the ith element of my result vector.
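
The scheme described above can be sketched locally in plain Python. This is not Hadoop code, only a small simulation under stated assumptions: the matrix file holds one entry per line as "i j value", the vector file holds one entry per line as "j value", and the names load_vector, map_entry, reduce_row and run are hypothetical helpers made up for illustration. Each simulated map task loads the whole vector and then processes only its own chunk of matrix lines.

```python
from collections import defaultdict

def load_vector(lines):
    """Parse lines of the assumed form 'j value' into a dict {j: v[j]}."""
    v = {}
    for line in lines:
        j, value = line.split()
        v[int(j)] = float(value)
    return v

def map_entry(line, v):
    """Map one matrix record 'i j value' to the pair (i, M[i,j] * v[j])."""
    i, j, value = line.split()
    return int(i), float(value) * v[int(j)]

def reduce_row(i, products):
    """Sum all partial products that share the row key i."""
    return i, sum(products)

def run(matrix_chunks, vector_lines):
    grouped = defaultdict(list)           # stands in for the shuffle phase
    for chunk in matrix_chunks:           # one iteration per simulated map task
        v = load_vector(vector_lines)     # every map task reads the whole vector
        for line in chunk:
            i, product = map_entry(line, v)
            grouped[i].append(product)
    return dict(reduce_row(i, ps) for i, ps in grouped.items())

# M = [[1, 2], [3, 4]] split into two chunks, v = [5, 6]
chunks = [["0 0 1", "0 1 2"], ["1 0 3", "1 1 4"]]
vector = ["0 5", "1 6"]
result = run(chunks, vector)
```

Because the key is the row number, a row whose entries are split across several chunks is still summed correctly in the reduce step. In a real Hadoop job, one possible way to make a small side file such as v available to every map task is the DistributedCache, with the mapper loading v once during its setup; whether that suits this case depends on v fitting in memory.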

Or can you suggest another way to do it?

View this message in context: http://old.nabble.com/providing-the-same-input-to-more-than-one-Map-task-tp31459012p31459012.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
