hadoop-common-user mailing list archives

From M B <machac...@gmail.com>
Subject Reducer-side join example
Date Mon, 05 Apr 2010 21:10:35 GMT
Hi, I need a good Java example to get me started with some joins we need
to do. Any examples would be appreciated.

File A:
Field1  Field2
A        12
B        13
C        22
A        24

File B:
Field1  Field2   Field3
A        Car       ...
B        Truck    ...
B        SUV     ...
B        Van      ...

So, we need to first join File A and B on Field1 (say both are string
fields).  The result would just be:
A   12   Car   ...
A   24   Car   ...
B   13   Truck   ...
B   13   SUV   ...
B   13   Van   ...
and so on, with all the fields from both files returned.
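
A minimal sketch of the join described above, written in plain Java with no Hadoop dependency so the logic can be run on its own. It assumes whitespace-separated records keyed on Field1; the class name, the "A|"/"B|" tagging scheme and the tab-separated output are illustrative choices, not anything from the original message. In an actual job the tagging step would live in a Mapper and the per-key cross product in a Reducer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Stdlib-only simulation of a reduce-side join; this is not Hadoop API code.
class ReduceSideJoinSketch {

    // "Map" + "shuffle": tag each record with its source file and group by Field1.
    // Values are stored as "A|rest" or "B|rest" (hypothetical tagging scheme).
    static Map<String, List<String>> mapAndShuffle(List<String> fileA, List<String> fileB) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : fileA) emit(grouped, line, "A");
        for (String line : fileB) emit(grouped, line, "B");
        return grouped;
    }

    private static void emit(Map<String, List<String>> grouped, String line, String tag) {
        // Split off the first field (the join key); keep the rest of the record as-is.
        String[] parts = line.trim().split("\\s+", 2);
        String rest = parts.length > 1 ? parts[1] : "";
        grouped.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(tag + "|" + rest);
    }

    // "Reduce": for one key, separate values by tag, then emit every A x B pairing.
    static List<String> reduce(String key, List<String> values) {
        List<String> fromA = new ArrayList<>();
        List<String> fromB = new ArrayList<>();
        for (String v : values) {
            if (v.startsWith("A|")) fromA.add(v.substring(2));
            else fromB.add(v.substring(2));
        }
        List<String> out = new ArrayList<>();
        for (String a : fromA) {
            for (String b : fromB) {
                out.add(key + "\t" + a + "\t" + b);
            }
        }
        return out;
    }

    // Drives both phases; keys with no match on one side produce no rows (inner join).
    static List<String> join(List<String> fileA, List<String> fileB) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : mapAndShuffle(fileA, fileB).entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("A 12", "B 13", "C 22", "A 24");
        List<String> b = Arrays.asList("A Car", "B Truck", "B SUV", "B Van");
        for (String row : join(a, b)) System.out.println(row);
    }
}
```

With the sample data (Field3 left out), this yields the two A rows and three B rows listed above; C is dropped because File B has no matching key.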

Once we have that, we sometimes need to then transform it so we have a
single record per key (Field1):
A (12,Car) (24,Car)
B (13,Truck) (13,SUV) (13,Van)
--however it looks, basically tuples for each key (we'll modify this later
to return a concatenated set of fields from B, etc.)
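
The one-record-per-key shape could be produced roughly as follows. This sketch assumes the joined rows arrive as tab-separated "Field1, A.Field2, B.Field2" strings with exactly three fields each; the class and method names are made up for illustration. In a reducer the same concatenation could be done while iterating the values for a key, without materialising the joined rows first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Collapses joined rows into one line per key, e.g. "A (12,Car) (24,Car)".
class GroupByKeySketch {

    // Assumes each row is "key<TAB>aField<TAB>bField"; malformed rows are not handled.
    static List<String> collapse(List<String> joinedRows) {
        // LinkedHashMap keeps keys in first-seen order.
        Map<String, StringBuilder> perKey = new LinkedHashMap<>();
        for (String row : joinedRows) {
            String[] f = row.split("\t", -1);
            perKey.computeIfAbsent(f[0], k -> new StringBuilder(k))
                  .append(" (").append(f[1]).append(',').append(f[2]).append(')');
        }
        List<String> out = new ArrayList<>();
        for (StringBuilder sb : perKey.values()) out.add(sb.toString());
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "A\t12\tCar", "A\t24\tCar", "B\t13\tTruck", "B\t13\tSUV", "B\t13\tVan");
        for (String line : collapse(rows)) System.out.println(line);
    }
}
```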

At other times, instead of transforming to a single row, we just need to
modify rows based on values.  So if B.Field2 equals "Van", we need to set
Output.Field2 = whatever then output to file ...
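
The value-based rewrite might look like the sketch below. The message leaves both the replacement value and the exact output column open, so this assumes tab-separated joined rows where the third column holds B.Field2 and the second column is the one to overwrite; the replacement string is passed in as a parameter. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Rewrites one output column when the B-side value matches a given string.
class ConditionalRewriteSketch {

    // Assumed row layout: "Field1<TAB>A.Field2<TAB>B.Field2".
    // When B.Field2 equals `match`, the second column is replaced with `replacement`.
    static List<String> rewrite(List<String> joinedRows, String match, String replacement) {
        List<String> out = new ArrayList<>();
        for (String row : joinedRows) {
            String[] f = row.split("\t", -1);
            if (f.length >= 3 && f[2].equals(match)) {
                f[1] = replacement;
            }
            out.add(String.join("\t", f));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("B\t13\tSUV", "B\t13\tVan");
        // "REPLACED" is a placeholder for whatever value the job needs to set.
        for (String line : rewrite(rows, "Van", "REPLACED")) System.out.println(line);
    }
}
```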

Are there any good examples of this in native Java (we can't use

