hadoop-mapreduce-user mailing list archives

From welman Lu <welmanwe...@gmail.com>
Subject Re: Writing Bytes from Map and Reduce functions
Date Wed, 31 Mar 2010 07:15:28 GMT
Hi, Saliya,

The data transformation in MapReduce is:

*map*    (k1, v1)       -> list(k2, v2)
*reduce* (k2, list(v2)) -> list(k3, v3)

The framework groups the map output by key and passes it to the reducer as
input, so your reduce function can only receive k2 and v2 as its input types.
In your case, the types should be:
k1 = Text | v1 = Text
k2 = Text | v2 = BytesWritable
k3 = Text | v3 = BytesWritable
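
To make the type flow concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It is an illustration under assumptions, not Hadoop code: String stands in for Text, byte[] stands in for BytesWritable, and the class name TypeFlowSketch and its methods are hypothetical. Each map call emits (k2, v2) pairs, the run method imitates the framework's grouping by k2, and each reduce call emits one (k3, v3) pair.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the MapReduce type flow (not the Hadoop API):
// String plays the role of Text, byte[] plays the role of BytesWritable.
class TypeFlowSketch {

    // map: (k1 = String, v1 = String) -> list of (k2 = String, v2 = byte[]).
    // This sketch emits exactly one pair per input record.
    static List<Map.Entry<String, byte[]>> map(String k1, String v1) {
        List<Map.Entry<String, byte[]>> out = new ArrayList<>();
        out.add(Map.entry(k1, v1.getBytes(StandardCharsets.UTF_8)));
        return out;
    }

    // reduce: (k2 = String, list of v2 = byte[]) -> (k3 = String, v3 = byte[]).
    // This sketch concatenates all byte arrays that share a key.
    static Map.Entry<String, byte[]> reduce(String k2, List<byte[]> values) {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (byte[] v2 : values) {
            joined.write(v2, 0, v2.length);
        }
        return Map.entry(k2, joined.toByteArray());
    }

    // Applies map to every record, groups the emitted pairs by k2
    // (imitating the shuffle step), then applies reduce to each group.
    static Map<String, byte[]> run(List<Map.Entry<String, String>> input) {
        Map<String, List<byte[]>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> record : input) {
            for (Map.Entry<String, byte[]> pair : map(record.getKey(), record.getValue())) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, byte[]> output = new TreeMap<>();
        for (Map.Entry<String, List<byte[]>> group : grouped.entrySet()) {
            Map.Entry<String, byte[]> result = reduce(group.getKey(), group.getValue());
            output.put(result.getKey(), result.getValue());
        }
        return output;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> input = List.of(
                Map.entry("a", "xy"), Map.entry("a", "z"), Map.entry("b", "q"));
        for (Map.Entry<String, byte[]> e : run(input).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().length + " bytes");
        }
    }
}
```

Note that the reduce method never sees k1 or v1; it only sees what map emitted. The sketch keeps values in input order within a key, which the real framework does not promise.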

So for your code, I think you can write the following in the job
configuration:
JobConf conf = new JobConf(YourClass.class);
conf.setOutputKeyClass(k3.class);
conf.setOutputValueClass(v3.class);

Then declare your map class as:
class YourMapClass extends MapReduceBase
    implements Mapper<k1, v1, k2, v2> {
....
}

If your map output types (k2, v2) differ from the final output types (k3, v3),
also set the following in the job configuration:
conf.setMapOutputKeyClass(k2.class);
conf.setMapOutputValueClass(v2.class);

Hope this helps!


Best Regards
Jiamin Lu
