hadoop-mapreduce-user mailing list archives

From Martin Becker <_martinbec...@web.de>
Subject Re: Passing messages
Date Sat, 18 Dec 2010 23:19:17 GMT
Hello Jason,

Real-time values are not required; some lag is tolerable. The
value/threshold communication is only needed to keep other reducers
from doing unnecessary work. Some upper bound would be nice to know,
though. A single reducer is not an option for my algorithm, as that
would defeat the purpose of using MapReduce. I would still like to know
whether there is something more general than Counters. Using other data
types would be convenient. Even better would be, as mentioned, some
kind of simple message passing system. It seems that neither is
supported by Hadoop as of now?
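
For the double-valued thresholds discussed in this thread, one possible conversion to the long values that Counters hold is fixed-point scaling. The following is only a minimal sketch under stated assumptions: the class name ThresholdCodec, the SCALE constant, and both method names are hypothetical and not part of Hadoop. It also assumes that only one task ever writes the value, because Counters add up increments from all tasks rather than storing a latest value.

```java
// Hypothetical helper, not part of the Hadoop API.
final class ThresholdCodec {
    // Assumed precision: six decimal digits.
    static final long SCALE = 1_000_000L;

    // Encode a double threshold as a long suitable for a counter increment.
    static long toFixed(double threshold) {
        return Math.round(threshold * SCALE);
    }

    // Decode a long read back from a counter into the original scale.
    static double fromFixed(long counterValue) {
        return (double) counterValue / SCALE;
    }

    public static void main(String[] args) {
        long encoded = toFixed(0.75);
        System.out.println(encoded);            // value that would be passed to a counter
        System.out.println(fromFixed(encoded)); // value recovered on the reading side
    }
}
```

The encoded long could then be passed to a counter increment in the writing task. As noted in the reply quoted below, a value read elsewhere may lag, so this would serve as a hint at best.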

Thank you,

On Sat, Dec 18, 2010 at 10:07 PM, Jason <urgisb@gmail.com> wrote:
>> Reducers would retrieve that increased value when accessing the same
>> Counter?
> I do not think counters reflect real-time values. Even if they get updated, the values
> will lag.
> If you require an up-to-date value, I am afraid you will have to run a single reducer.
> Sent from my iPhone 4
> On Dec 18, 2010, at 10:33 AM, Martin Becker <_martinbecker@web.de> wrote:
>> Thank you Ted,
>> I am using the 0.21.0 API so I would be drawing Counters from the
>> Context. So if a Counter is increased on a certain Reducer other
>> Reducers would retrieve that increased value when accessing the same
>> Counter? If so, then that is an interesting piece of information.
>> Unfortunately my thresholds are doubles. I guess I could find some
>> kind of conversion there. But is there any more general way to pass
>> information between Reducers?
>> Thanks,
>> Martin
>> On Sat, Dec 18, 2010 at 5:44 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>> In your reducer, you can utilize Reporter (getCounter and incrCounter
>>> methods) to pass this information between reducers.
>>> On Sat, Dec 18, 2010 at 8:04 AM, Martin Becker <_martinbecker@web.de> wrote:
>>>> Hello everybody,
>>>> I am wondering if there is a feature allowing (in my case) reduce
>>>> tasks to communicate. For example by some volatile variables at some
>>>> centralized point. Or maybe just notify other running or to-be-running
>>>> reduce tasks of a completed reduce task featuring some arguments.
>>>> In my case, I have reduce tasks doing computations that will
>>>> output/produce a certain quality threshold. Other reduce tasks could
>>>> estimate whether they will ever get above those thresholds. If not,
>>>> they could just cease running.
>>>> Thanks in advance,
>>>> Martin
>>>> PS: If such functions are not yet part of the API, I would like to
>>>> know if there are good reasons for it and if not, propose to introduce
>>>> such functionality.
