hadoop-common-dev mailing list archives

From Alejandro Abdelnur <t...@cloudera.com>
Subject Re: Combiner and MultipleOutputs in Mapreduce
Date Wed, 06 Oct 2010 04:54:55 GMT
Shi,
From the MultipleOutputs javadocs:

When named outputs are used within a Mapper implementation, key/values
written to a name output are not part of the reduce phase, only
key/values written to the job OutputCollector are part of the reduce
phase.
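
As a rough illustration of that rule, here is a minimal plain-Java sketch of the data flow. It does not use the Hadoop API: `NamedOutputFlowSketch`, `run`, `jobCollector` and `namedOutput` are hypothetical names, and the sketch assumes that the javadoc's statement about a Mapper applies equally to a combiner that writes only to a named output, which is what the counters in the quoted message suggest.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of the record flow; hypothetical names, not the Hadoop API. */
public class NamedOutputFlowSketch {

    /**
     * Simulates map -> combine -> reduce for a word count.
     * Returns {combineInputRecords, combineOutputRecords, reduceInputRecords}.
     */
    static int[] run(List<String> words, boolean combinerWritesToNamedOutput) {
        // Map phase: emit (word, 1) for every input word.
        List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
        for (String w : words) {
            mapOutput.add(new SimpleEntry<>(w, 1));
        }

        // Combine phase: sum the counts per key.
        Map<String, Integer> sums = new TreeMap<>();
        for (Map.Entry<String, Integer> e : mapOutput) {
            sums.merge(e.getKey(), e.getValue(), Integer::sum);
        }

        // The combiner writes either to the job collector (which feeds the
        // reduce phase) or to a named side output (which does not).
        List<Map.Entry<String, Integer>> jobCollector = new ArrayList<>();
        List<Map.Entry<String, Integer>> namedOutput = new ArrayList<>();
        List<Map.Entry<String, Integer>> target =
                combinerWritesToNamedOutput ? namedOutput : jobCollector;
        for (Map.Entry<String, Integer> e : sums.entrySet()) {
            target.add(new SimpleEntry<>(e.getKey(), e.getValue()));
        }

        // Reduce phase: only records in the job collector arrive here.
        int reduceInput = jobCollector.size();
        return new int[] { mapOutput.size(), jobCollector.size(), reduceInput };
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a");
        System.out.println(Arrays.toString(run(words, false))); // [3, 2, 2]
        System.out.println(Arrays.toString(run(words, true)));  // [3, 0, 0]
    }
}
```

In this model, sending the combiner's records to the named output leaves the combine output and reduce input counts at zero, matching the counters reported in the quoted message. Writing the same records to the job collector lets them reach the reduce phase.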

Hope this helps.

Alejandro

On Wed, Oct 6, 2010 at 12:44 PM, ShiYu <shiyu@uchicago.edu> wrote:
>
> Hi,
>
> Most of the example code I have read uses the following configuration (the
> same Reduce class as both the Combiner and the Reducer):
>
> conf.setCombinerClass(Reduce.class);
> conf.setReducerClass(Reduce.class);
>
> For the same class to serve as both the combiner and the reducer, must its
> input <K2,V2> and its output <K2,V2> be the same types? I guess the types
> would otherwise be incompatible. In the current version, is it possible to
> have a more complicated Combiner and Reducer, such as one that takes <K2,V2>
> as input and emits <K3,V3> as output?
>
> However, in some simple experiments with MultipleOutputs, I found that if
> the Combiner class is set, the Reducer is never invoked. I am using the
> Hadoop 0.19.2 package. It seems that the MultipleOutputs object takes the
> output of the combiner, so the Reducer gets no input. The job's default logs
> show "Reduce input records=0" and "Reduce output records=0"; moreover, the
> number of output files equals the number of input files. The Combiner also
> has input records but no output, hence "Combine output records=0". My
> question is: when using a MultipleOutputs object, how do I get data to flow
> from the Combiner to the Reducer?
>
> Thanks for any suggestion.
>
> Shi
>
> --
> View this message in context: http://old.nabble.com/Combiner-and-MultipleOutputs-in-Mapreduce-tp29893459p29893459.html
> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>
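
On the type question in the quoted message: in Hadoop, a combiner's output is fed to the reducer, and the framework may run the combiner zero, one, or several times, so a combiner must consume and emit the map output types <K2,V2>. Only the reducer is free to emit a different <K3,V3>. The following is a minimal plain-Java sketch of that constraint, not Hadoop code. `ReduceFn`, `COMBINER`, `REDUCER` and `process` are hypothetical names introduced for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Plain-Java model of combiner/reducer type signatures; not the Hadoop API. */
public class CombinerTypesSketch {

    /** Hypothetical reduce-style function: (KI, list of VI) -> (KO, VO). */
    interface ReduceFn<KI, VI, KO, VO> {
        Map.Entry<KO, VO> apply(KI key, List<VI> values);
    }

    // Combiner: <K2,V2> in and <K2,V2> out, because its output becomes reducer input.
    static final ReduceFn<String, Integer, String, Integer> COMBINER =
            (key, values) -> new SimpleEntry<>(key, sum(values));

    // Reducer: <K2,V2> in, but free to emit <K3,V3>; here the value becomes a String.
    static final ReduceFn<String, Integer, String, String> REDUCER =
            (key, values) -> new SimpleEntry<>(key, "count=" + sum(values));

    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    /** Combines each batch of map output for one key, then reduces the combined values. */
    static String process(String key, List<List<Integer>> batches) {
        List<Integer> combined = new ArrayList<>();
        for (List<Integer> batch : batches) {
            combined.add(COMBINER.apply(key, batch).getValue());
        }
        return REDUCER.apply(key, combined).getValue();
    }

    public static void main(String[] args) {
        List<List<Integer>> batches = Arrays.asList(Arrays.asList(1, 1), Arrays.asList(1));
        System.out.println(process("a", batches)); // count=3
    }
}
```

Using one class as both combiner and reducer, as in the quoted configuration, therefore only works when <K3,V3> is the same as <K2,V2>. Otherwise the two roles need separate classes.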
