Date: Tue, 17 Apr 2012 08:33:01 +0530
Subject: Re: map and reduce with different value classes
From: Bejoy Ks <bejoy.hadoop@gmail.com>
To: mapreduce-user@hadoop.apache.org

Hi Bryan,

You can set different key and value types with the following steps:
- Ensure that the map output key and value types match the reducer input
  key and value types.
- Specify them in your driver class as follows:

// set map output key/value types
job.setMapOutputKeyClass(theClass);
job.setMapOutputValueClass(theClass);

// set final/reduce output key/value types
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

If the map output and the reduce output key and value types are the same,
you only need to specify the final output types.
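
The type flow described above can be modelled in plain Java without a cluster. This is only a rough sketch under stated assumptions: the class ValueTypeFlow and its methods map, combine, reduce and run are hypothetical names made up for illustration and are not Hadoop APIs. String stands in for Text and Integer for IntWritable.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plain-Java model of the type flow; not Hadoop code.
public class ValueTypeFlow {

    // Map phase stand-in: emits (word, 1). Map output value type: Integer.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Combiner stand-in: consumes AND emits the map output value type.
    static Integer combine(Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Reduce phase stand-in: consumes the map output value type (Integer)
    // but emits a different final value type (String).
    static String reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return String.valueOf(sum);
    }

    // Runs map, groups values by key, combines, then reduces.
    static Map<String, String> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                       .add(kv.getValue());
            }
        }
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            // The combiner output is still an Integer, so reduce accepts it.
            List<Integer> combined =
                Collections.singletonList(combine(e.getValue()));
            result.put(e.getKey(), reduce(e.getKey(), combined));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a b a", "b c")));
        // prints {a=2, b=2, c=1}
    }
}
```

Note that combine keeps the map output value type on both sides. The same holds in Hadoop: whatever the combiner emits is fed to the reducer, so a reducer that emits a different value type than it consumes is generally not a safe choice for setCombinerClass.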

Regards
Bejoy KS


On Tue, Apr 17, 2012 at 7:14 AM, Bryan Yeung <bryeung@gmail.com> wrote:
>
> Hello Everyone,
>
> I'm relatively new to hadoop mapreduce and I'm trying to get this
> simple modification to the WordCount example to work.
>
> I'm using hadoop-1.0.2, and I've included both a convenient diff and
> also attached my new WordCount.java file.
>
> The thing I am trying to achieve is to have the value class that is
> output by the map phase be different than the value class output by
> the reduce phase.
>
> Any help would be greatly appreciated!
>
> Thanks,
>
> Bryan
>
> diff --git a/WordCount.java.orig b/WordCount.java
> index 81a6c21..6a768f7 100644
> --- a/WordCount.java.orig
> +++ b/WordCount.java
> @@ -33,8 +33,8 @@ public class WordCount {
>   }
>
>   public static class IntSumReducer
> -       extends Reducer<Text,IntWritable,Text,IntWritable> {
> -    private IntWritable result = new IntWritable();
> +       extends Reducer<Text,IntWritable,Text,Text> {
> +    private Text result = new Text();
>
>     public void reduce(Text key, Iterable<IntWritable> values,
>                        Context context
> @@ -43,7 +43,7 @@ public class WordCount {
>       for (IntWritable val : values) {
>         sum += val.get();
>       }
> -      result.set(sum);
> +      result.set("" + sum);
>       context.write(key, result);
>     }
>   }
> @@ -58,10 +58,11 @@ public class WordCount {
>     Job job = new Job(conf, "word count");
>     job.setJarByClass(WordCount.class);
>     job.setMapperClass(TokenizerMapper.class);
> +       job.setMapOutputValueClass(IntWritable.class);
>     job.setCombinerClass(IntSumReducer.class);
>     job.setReducerClass(IntSumReducer.class);
>     job.setOutputKeyClass(Text.class);
> -    job.setOutputValueClass(IntWritable.class);
> +    job.setOutputValueClass(Text.class);
>     FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
>     FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
>     System.exit(job.waitForCompletion(true) ? 0 : 1);