hadoop-hdfs-dev mailing list archives

From "Rusia, Devansh" <dru...@paypal.com>
Subject FW: TupleWritable value in mapper Not getting cleaned up ( using CompositeInputFormat )
Date Fri, 22 Mar 2013 04:50:39 GMT
Hi,

I am trying to do an outer join on two input files.

But during the join, the TupleWritable value passed to the mapper is not getting cleaned up
between keys, so a key that is missing from one input picks up the value left over from a
previous, different key.

The code I used is below ('plist' contains the set of paths to be taken as input):

jobConf.setInputFormat(CompositeInputFormat.class);
jobConf.set("mapred.join.expr", CompositeInputFormat.compose(op, inputFormatClass, plist.toArray(new Path[0])));
jobConf.setOutputFormat(outputFormatClass);

inp1:

anil1     10
anil2     20
anil3     30
dev1     40
dev2     50

inp2:

anil1     100
dev1     400
dev2     500
dev3     600


outer join output:

anil1     10,100
anil2     20,100
anil3     30,100
dev1     40,400
dev2     50,500
dev3     50,600

Shouldn't it actually be the following?

anil1     10,100
anil2     20
anil3     30
dev1     40,400
dev2     50,500
dev3     600
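
The difference between the two outputs can be modelled without Hadoop. The following is a minimal sketch, not the actual Hadoop implementation: it assumes the join reuses a single tuple object across keys and only resets a per-slot "written" flag, and that the mapper reads every slot without checking that flag. The names OuterJoinSketch, ReusedTuple and checkPresence are hypothetical and exist only for this illustration. Under those assumptions, reading without the check reproduces the output reported above, while checking the flag first gives the expected output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Stand-alone model (no Hadoop classes) of an outer join that reuses one tuple object. */
final class OuterJoinSketch {

    /** Hypothetical stand-in for a reused tuple: slots keep old data, flags mark valid slots. */
    static final class ReusedTuple {
        private final String[] slots;
        private final boolean[] written;

        ReusedTuple(int size) {
            slots = new String[size];
            written = new boolean[size];
        }

        /** Resets only the flags; the slot contents are deliberately left in place. */
        void clearWritten() { Arrays.fill(written, false); }
        void set(int i, String value) { slots[i] = value; written[i] = true; }
        boolean has(int i) { return written[i]; }
        String get(int i) { return slots[i]; }
        int size() { return slots.length; }
    }

    /** Joins the inputs on key; checkPresence decides whether stale slots are skipped. */
    static List<String> outerJoin(List<Map<String, String>> inputs, boolean checkPresence) {
        TreeSet<String> keys = new TreeSet<>();
        for (Map<String, String> input : inputs) {
            keys.addAll(input.keySet());
        }
        ReusedTuple tuple = new ReusedTuple(inputs.size()); // one instance for all keys
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            tuple.clearWritten();
            for (int i = 0; i < inputs.size(); i++) {
                String value = inputs.get(i).get(key);
                if (value != null) {
                    tuple.set(i, value);
                }
            }
            List<String> fields = new ArrayList<>();
            for (int i = 0; i < tuple.size(); i++) {
                // Without the presence check, a slot not written for this key
                // still returns whatever an earlier key left there.
                if (!checkPresence || tuple.has(i)) {
                    fields.add(tuple.get(i));
                }
            }
            out.add(key + "\t" + String.join(",", fields));
        }
        return out;
    }

    /** Builds a sorted input from alternating key/value arguments. */
    static Map<String, String> input(String... keyValues) {
        Map<String, String> m = new TreeMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            m.put(keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    static Map<String, String> inp1() {
        return input("anil1", "10", "anil2", "20", "anil3", "30", "dev1", "40", "dev2", "50");
    }

    static Map<String, String> inp2() {
        return input("anil1", "100", "dev1", "400", "dev2", "500", "dev3", "600");
    }

    public static void main(String[] args) {
        List<Map<String, String>> inputs = Arrays.asList(inp1(), inp2());
        System.out.println("Without presence check:");
        outerJoin(inputs, false).forEach(System.out::println);
        System.out.println("With presence check:");
        outerJoin(inputs, true).forEach(System.out::println);
    }
}
```

In a real mapper the corresponding guard would be a call to TupleWritable.has(i) before each TupleWritable.get(i). Whether that is the cause here depends on how the mapper reads the tuple, which the message does not show.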

Regards,
Devansh Rusia
