hadoop-common-user mailing list archives

From Jane Wayne <jane.wayne2...@gmail.com>
Subject strategies to share information between mapreduce tasks
Date Wed, 26 Sep 2012 15:36:09 GMT

I know that some algorithms cannot be parallelized and adapted to the
MapReduce paradigm. However, I have noticed that in most cases where I
struggle to express an algorithm in MapReduce, the main obstacle is
that mappers (or reducers) have no way to communicate with one
another.

One naive approach I've seen mentioned here and elsewhere is to use a
database to store data shared by all the mappers. However, I have
seen many arguments against this approach, and I largely agree with
them.

In general, my question is this: has anyone tried to implement an
algorithm in MapReduce where the mappers needed to communicate with
each other? How did you work around this limitation of MapReduce?


