hadoop-common-user mailing list archives

From Mike Kendall <mkend...@justin.tv>
Subject Re: Combiner phase question
Date Sat, 05 Dec 2009 00:29:15 GMT
from what i understand, the combiner runs when nodes are idle and
you're waiting on a few processes that are taking too long...  so the
cluster tries to optimize by putting these idle nodes to work by doing
optional preprocessing...

On Fri, Dec 4, 2009 at 2:02 PM, Raymond Jennings III
<raymondjiii@yahoo.com> wrote:
> I still would like to know how many times it will run given how many mappers run.  I
> realize it may never run but what determines how many times if any?
>
> --- On Fri, 12/4/09, Mike Kendall <mkendall@justin.tv> wrote:
>
>> From: Mike Kendall <mkendall@justin.tv>
>> Subject: Re: Combiner phase question
>> To: common-user@hadoop.apache.org
>> Date: Friday, December 4, 2009, 4:59 PM
>> are you sure it can be run in the reduce task?  if it does it's still
>> before the reducer is called though...  so the flow of your data will
>> still be: data -> mapper(s) -> optional reducer(s) -> reducer(s) ->
>> output_data
>>
>>
>>
>> On Fri, Dec 4, 2009 at 1:42 PM, Owen O'Malley <owen.omalley@gmail.com>
>> wrote:
>> > On Fri, Dec 4, 2009 at 12:32 PM, Raymond Jennings III
>> > <raymondjiii@yahoo.com> wrote:
>> >
>> >> Does the combiner run once per data node or once per map task?  (That it can
>> >> run multiple times on the same data node after each map task.)  Thanks.
>> >>
>> >
>> > The combiner can run 0, 1, or many times on each data value. It can run in
>> > both the map task and reduce task.
>> >
>> > -- Owen
>> >
>>
>
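
Owen's point in the thread (the combiner can run 0, 1, or many times on each
value) implies that a job's final output should not depend on how often the
combiner runs. The following is a minimal sketch of that idea in plain Java,
not Hadoop code: the class CombinerSketch, the methods sum, mean, and run, and
the pairwise grouping are all hypothetical names and choices made up for
illustration. It assumes the same function is used as both combiner and
reducer, and compares a sum (safe to combine repeatedly) with a naive mean
(not safe).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Plain-Java simulation for illustration only; no Hadoop classes are used.
class CombinerSketch {

    // Hypothetical aggregate: adds up all values for one key.
    static double sum(List<Double> values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // Hypothetical aggregate: naive average of all values for one key.
    static double mean(List<Double> values) {
        return sum(values) / values.size();
    }

    // Applies fn as a "combiner" the given number of times, each pass folding
    // adjacent pairs of values into one partial result, then applies fn once
    // more as the final "reducer".
    static double run(Function<List<Double>, Double> fn, List<Double> values, int passes) {
        List<Double> current = new ArrayList<>(values);
        for (int p = 0; p < passes; p++) {
            List<Double> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += 2) {
                int end = Math.min(i + 2, current.size());
                next.add(fn.apply(current.subList(i, end)));
            }
            current = next;
        }
        return fn.apply(current);
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0);
        for (int passes = 0; passes <= 3; passes++) {
            System.out.println(passes + " combiner pass(es): sum="
                    + run(CombinerSketch::sum, values, passes)
                    + " mean=" + run(CombinerSketch::mean, values, passes));
        }
    }
}
```

In this sketch the sum is the same for every number of combiner passes, while
the naive mean changes as soon as the combiner runs at all, because an average
of partial averages is not the overall average. That is one reason a function
used as a combiner is usually expected to be associative and commutative.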
