hadoop-common-user mailing list archives

From jason hadoop <jason.had...@gmail.com>
Subject Re: Multithreaded Reducer
Date Fri, 10 Apr 2009 19:31:03 GMT
Hi Sagar!

There is no reason for the body of your reduce method to do more than copy
the key and its values and queue them into an execution pool.

The close method will need to wait until all of the items finish
execution, and potentially keep the heartbeat up with the task tracker by
periodically reporting something. Sadly, right now the reporter has to be
grabbed from the reduce method, as configure and close are not passed one.

I believe the key and value objects are reused by the framework on the next
call to reduce, so making a copy before queuing them into your thread pool
is important.
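
A rough sketch of the pattern described above, written in plain Java so it
runs without Hadoop on the classpath. Everything named here is made up for
illustration: ProgressSink stands in for the framework's reporter,
PooledReducerSketch stands in for your reducer class, and StringBuilder plays
the role of a key or value object that the caller reuses between calls. Treat
it as an outline under those assumptions, not as the Hadoop API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the framework's reporter; not a Hadoop type.
interface ProgressSink {
    void progress();
}

// Hypothetical reducer-shaped class; the method names only mirror the idea.
class PooledReducerSketch {
    private final ExecutorService pool;
    // Results land here so the sketch is observable; a real job would emit
    // its output through whatever collector the framework provides.
    final ConcurrentLinkedQueue<String> output = new ConcurrentLinkedQueue<>();
    // Captured in reduce(), because close() is not handed a reporter.
    private volatile ProgressSink reporter;

    PooledReducerSketch(int threads) {
        pool = Executors.newFixedThreadPool(threads);
    }

    // Copy the key and values, queue the work, and return immediately.
    void reduce(StringBuilder key, Iterator<StringBuilder> values, ProgressSink r) {
        reporter = r;
        // The caller may reuse these objects on the next call, so snapshot them.
        final String keyCopy = key.toString();
        final List<String> valueCopies = new ArrayList<>();
        while (values.hasNext()) {
            valueCopies.add(values.next().toString());
        }
        pool.submit(() -> {
            // Placeholder for the real per-key work.
            output.add(keyCopy + "=" + valueCopies.size());
        });
    }

    // Wait for all queued items, reporting progress while waiting.
    void close() {
        pool.shutdown();
        try {
            while (!pool.awaitTermination(100, TimeUnit.MILLISECONDS)) {
                if (reporter != null) {
                    reporter.progress();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            pool.shutdownNow();
        }
    }
}
```

The 100 ms wait and the fixed-size pool are arbitrary choices for the sketch.
Since the worker threads finish in no particular order, the output order is
not deterministic, which is fine only because sorted output is not needed.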

On Fri, Apr 10, 2009 at 11:12 AM, Sagar Naik <snaik@attributor.com> wrote:

> Hi,
> I would like to implement a Multi-threaded reducer.
> As per my understanding , the system does not have one coz we expect the
> output to be sorted.
> However, in my case I dont need the output sorted.
> Can u pl point to me any other issues or it would be safe to do so
> -Sagar

Alpha Chapters of my book on Hadoop are available
