hadoop-common-user mailing list archives

From Jianmin Woo <jianmin_...@yahoo.com>
Subject Re: question about when shuffle/sort start working
Date Mon, 01 Jun 2009 06:30:48 GMT
Thanks for your quick response, Jothi.

Yes, actually I need each mapper process to handle several rounds of map/reduce work without
exiting. I checked ChainMapper/ChainReducer before; it only supports the
M(+)RM* pattern of mappers and reducers. Moreover, it seems that the number of mappers
must be specified when configuring the job. I am wondering whether the mapper itself can
determine how many rounds of map/reduce to run. Do you think this is feasible,
or is there some hack on the Hadoop framework to support it?

Thanks,
Jianmin




________________________________
From: Jothi Padmanabhan <jothipn@yahoo-inc.com>
To: core-user@hadoop.apache.org
Sent: Monday, June 1, 2009 2:03:13 PM
Subject: Re: question about when shuffle/sort start working


No, you cannot raise this event yourself; it is generated internally
by the framework.

I am guessing that what you probably want is to have a chain of MapReduce
Jobs where the output of one is automatically fed as input to another.  You
can look at these classes: JobControl and ChainMapper/ChainReducer.
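
To make the chaining idea concrete, here is a minimal sketch of a driver loop
in which the output of one round becomes the input of the next, and the driver
decides at run time whether another round is needed. It is plain Java with no
Hadoop calls: the class IterativeJobSketch, its methods, and the toy rule
(shorten each key by one character and sum the values) are all hypothetical.
In a real setup each call to runRound would presumably be replaced by
submitting one MapReduce job and waiting for it to finish.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; nothing here is part of the Hadoop API.
final class IterativeJobSketch {

    // One in-memory "job". Map: drop the last character of each key (keys of
    // length 1 pass through unchanged). Shuffle and reduce: values that land
    // on the same key are summed.
    static Map<String, Integer> runRound(Map<String, Integer> input) {
        Map<String, Integer> output = new TreeMap<>();
        for (Map.Entry<String, Integer> record : input.entrySet()) {
            String key = record.getKey();
            String mappedKey = key.length() > 1
                    ? key.substring(0, key.length() - 1)
                    : key;
            output.merge(mappedKey, record.getValue(), Integer::sum);
        }
        return output;
    }

    // Assumed stopping rule: another round is needed while any key is
    // longer than one character.
    static boolean needsAnotherRound(Map<String, Integer> data) {
        for (String key : data.keySet()) {
            if (key.length() > 1) {
                return true;
            }
        }
        return false;
    }

    // Driver: feeds each round's output into the next round and stops when
    // the data says so, or when maxRounds is reached. Returns the output of
    // every round, so the list size is the number of rounds actually run.
    static List<Map<String, Integer>> runUntilDone(Map<String, Integer> input,
                                                   int maxRounds) {
        List<Map<String, Integer>> outputs = new ArrayList<>();
        Map<String, Integer> current = input;
        while (outputs.size() < maxRounds && needsAnotherRound(current)) {
            current = runRound(current);
            outputs.add(current);
        }
        return outputs;
    }
}
```

Calling IterativeJobSketch.runUntilDone(input, 10) on a small map keeps running
rounds until every key has length 1. Note that the decision about the number
of rounds sits in the driver, between jobs, not inside a running map task.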

Jothi

On 6/1/09 11:00 AM, "Jianmin Woo" <jianmin_woo@yahoo.com> wrote:

> Thanks a lot for your explanation, Jothi.
> 
> So is this event generated by the Hadoop framework? Is there any API in the mapper to
> fire this event? Actually, I am thinking of implementing a mapper that emits
> some <key, value> pairs, then fires this event to let the reducer work; the
> same mapper task then emits some other <key, value> pairs and repeats. Do you
> think this logic is feasible with the current API?
> 
> Thanks,
> Jianmin
> 
> 
> 
> 
> 
> ________________________________
> From: Jothi Padmanabhan <jothipn@yahoo-inc.com>
> To: core-user@hadoop.apache.org
> Sent: Monday, June 1, 2009 12:26:31 PM
> Subject: Re: question about when shuffle/sort start working
> 
> When a Mapper completes, MapCompletionEvents are generated. Reducers try to
> fetch map outputs for a given map only on the receipt of such events.
> 
> Jothi
> 
> 
> On 5/30/09 10:00 AM, "Jianmin Woo" <jianmin_woo@yahoo.com> wrote:
> 
>> Hi,
>> I am confused by the protocol between mapper and reducer. When the mapper
>> has finished emitting its (key, value) pairs, does it send any signal to the
>> Hadoop framework to indicate that the map is done and the shuffle/sort can
>> begin for the reducer? If the protocol has no such signal, when does the
>> framework begin the shuffle/sort?
>> 
>> Thanks,
>> Jianmin
>> 
>> 
>> 
>>      
> 
> 
>      


      