hadoop-mapreduce-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: Could we use different output Format for the Mapper and Combiner?
Date Wed, 16 Feb 2011 11:47:27 GMT
The combiner must "have the same input and output key types and the
same input and output value types" (as per the docs for setting one.)

Combined outputs are treated as ordinary map outputs, so the reducer
must be able to consume them exactly as it would consume records
coming straight from the mappers. For this to work, your combiner
can't change the types the Reducer expects as input from the Mappers.
Perhaps making the map itself emit <Text, MapWritable> in some form
would let you keep your combiner (although at a little more expense).
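
To illustrate why the types have to line up, here is a minimal sketch
in plain Java with no Hadoop classes. It is only a model of the
contract, not Hadoop's implementation: String stands in for Text,
Map<String, Integer> stands in for MapWritable, and the class and
method names (CombinerTypeSketch, mapValue, combine, reduce) are made
up for the example. The framework may run the combiner zero, one, or
several times, so the reducer has to give the same answer whether or
not its input was combined first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CombinerTypeSketch {

    // Map step for one record: wrap a single field in a map, so the map
    // output value type is already the type the combiner produces.
    static Map<String, Integer> mapValue(String field) {
        Map<String, Integer> value = new TreeMap<>();
        value.put(field, 1);
        return value;
    }

    // Combine step: input and output value types are identical
    // (maps in, one merged map out), so it can be applied any number of times.
    static Map<String, Integer> combine(List<Map<String, Integer>> values) {
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> value : values) {
            for (Map.Entry<String, Integer> e : value.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    // Reduce step: consumes the map output value type. Only here may the
    // final output type differ (a String in this sketch).
    static String reduce(String key, List<Map<String, Integer>> values) {
        return key + "\t" + combine(values);
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> mapOutputs = Arrays.asList(
                mapValue("GET"), mapValue("POST"), mapValue("GET"));

        // Reducer fed directly by the mappers (combiner skipped).
        String withoutCombiner = reduce("host1", mapOutputs);

        // Reducer fed by a combiner that pre-merged the first two values.
        List<Map<String, Integer>> partiallyCombined = Arrays.asList(
                combine(mapOutputs.subList(0, 2)), mapOutputs.get(2));
        String withCombiner = reduce("host1", partiallyCombined);

        System.out.println(withoutCombiner);
        System.out.println(withCombiner.equals(withoutCombiner));
    }
}
```

In a real job the equivalent would be a Mapper whose output value is
MapWritable and a combiner declared with MapWritable as both its input
and output value type, with setMapOutputValueClass set to match.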

On Wed, Feb 16, 2011 at 4:32 PM, Stanley Xu <wenhao.xu@gmail.com> wrote:
> Dear all,
> I am writing a map-reduce job today, and I hope to use different
> output formats for the Mapper and the Combiner. I am using Text as the
> format of the Mapper and MapWritable as the format of the Combiner.
> But it looks like Hadoop doesn't support that yet?
> I have some code like the following:
> public class RawLogMapper extends Mapper<LongWritable, Text, Text, Text> {
> public class RawLogCombiner extends Reducer<Text, Text, Text, MapWritable> {
> job.setMapOutputKeyClass(Text.class);
> job.setMapOutputValueClass(Text.class);
> job.setOutputKeyClass(Text.class);
> job.setOutputValueClass(MapWritable.class);
> job.setOutputFormatClass(TextOutputFormat.class);
> But it failed, and the logs told me that there is a type mismatch. Is
> there any way I could use a different type for the VALUEOUT of the
> mapper and the combiner?
> Thanks
> Best wishes,
> Xu Wenhao

Harsh J
