hadoop-common-user mailing list archives

From Jane Wayne <jane.wayne2...@gmail.com>
Subject Re: strategies to share information between mapreduce tasks
Date Wed, 26 Sep 2012 16:19:28 GMT
my problem is more general (than graph problems) and doesn't need to
have logic built around synchronization or failure. for example, when
a mapper is finished successfully, it just writes/persists to a
storage location (could be disk, could be database, could be memory,
etc...). when the next input is processed (could be on the same mapper
or different mapper), i just need to do a lookup from the storage
location (that is accessible by all task nodes). if the mapper fails,
this doesn't hurt my processing, although i would prefer to have no
failures (and it's good if hadoop can spawn another task to mitigate).
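
to make the pattern above concrete, here is a minimal sketch in plain python. it is not hadoop code: `SharedStore` and `run_mapper` are hypothetical names, and the in-memory dict only stands in for whatever storage all task nodes could reach (disk, database, memory). the running-total logic is also just an assumed example of "look up what an earlier mapper wrote". the mappers are called one after another, so the sketch does not address concurrent writes to the same key.

```python
class SharedStore:
    """Hypothetical stand-in for a storage location visible to all task nodes."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


def run_mapper(store, record):
    """Process one input record, consulting what earlier mappers persisted.

    The result is written to the store only after the work succeeds, so a
    mapper that raises an exception leaves the store untouched.
    """
    key, value = record
    previous = store.get(key)  # lookup; None if no mapper has written this key
    result = value if previous is None else previous + value
    store.put(key, result)     # persist on success only
    return key, result


if __name__ == "__main__":
    store = SharedStore()
    for record in [("a", 1), ("b", 2), ("a", 3)]:
        print(run_mapper(store, record))
```

because the write happens last, a failed mapper simply contributes nothing, which matches the "failure doesn't hurt my processing" requirement; a retried task would redo the lookup and write.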

On Wed, Sep 26, 2012 at 11:43 AM, Bertrand Dechoux <dechouxb@gmail.com> wrote:
> The difficulty with data transfer between tasks is handling synchronisation
> and failure.
> You may want to look at graph processing done on top of Hadoop (like
> Giraph).
> That's one way to do it but whether it is relevant or not to you will
> depend on your context.
> Regards
> Bertrand
> On Wed, Sep 26, 2012 at 5:36 PM, Jane Wayne <jane.wayne2978@gmail.com> wrote:
>> hi,
>> i know that some algorithms cannot be parallelized and adapted to the
>> mapreduce paradigm. however, i have noticed that in most cases where i
>> find myself struggling to express an algorithm in mapreduce, the
>> problem is mainly due to no ability to cross-communicate between
>> mappers or reducers.
>> one naive approach i've seen mentioned here and elsewhere, is to use a
>> database to store data for use by all the mappers. however, i have
>> seen many arguments (that i agree with largely) against this approach.
>> in general, my question is this: has anyone tried to implement an
>> algorithm using mapreduce where mappers required cross-communications?
>> how did you solve this limitation of mapreduce?
>> thanks,
>> jane.
> --
> Bertrand Dechoux
