hadoop-user mailing list archives

From Devaraj k <devara...@huawei.com>
Subject RE: CompositeInputFormat
Date Fri, 12 Jul 2013 01:21:41 GMT
Hi Andrew,

You could use the Hadoop data join classes to perform the join, or refer to those
classes for a better idea of how to perform the join yourself.


Devaraj k

From: Botelho, Andrew [mailto:Andrew.Botelho@emc.com]
Sent: 12 July 2013 03:33
To: user@hadoop.apache.org
Subject: RE: CompositeInputFormat

Sorry, I should've specified that I need an example of CompositeInputFormat that uses the new
API. The example linked below uses old API objects like JobConf.

Any known examples of CompositeInputFormat using the new API?

Thanks in advance,


From: Jay Vyas [mailto:jayunit100@gmail.com]
Sent: Thursday, July 11, 2013 5:10 PM
To: common-user@hadoop.apache.org
Subject: Re: CompositeInputFormat

Map-side joins use CompositeInputFormat.  They are only really worth doing if
one data set is small and the other is large.
This is a good example: http://www.congiu.com/joins-in-hadoop-using-compositeinputformat/
The trick is to google for CompositeInputFormat.compose() .... :)
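
For readers following the thread: a map-side join of this kind depends on both inputs already being sorted by the join key, so matching records can be paired in a single pass without a shuffle. The sketch below is plain Java, not Hadoop API code, and only illustrates that merge step. The class name MergeJoinSketch, the method innerJoin, and the sample data are hypothetical, and it assumes each key appears at most once per input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: none of these names come from Hadoop.
final class MergeJoinSketch {

    // Inner-joins two inputs given as {key, value} pairs.
    // Assumption: both inputs are sorted by key, and each key
    // appears at most once per input.
    static List<String> innerJoin(List<String[]> left, List<String[]> right) {
        List<String> joined = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i)[0].compareTo(right.get(j)[0]);
            if (cmp == 0) {
                // Keys match: emit key, left value, right value.
                joined.add(left.get(i)[0] + "\t" + left.get(i)[1] + "\t" + right.get(j)[1]);
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // left key is smaller, so it has no partner on the right
            } else {
                j++; // right key is smaller, so it has no partner on the left
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String[]> users = Arrays.asList(
            new String[] {"u1", "alice"},
            new String[] {"u2", "bob"},
            new String[] {"u4", "dana"});
        List<String[]> orders = Arrays.asList(
            new String[] {"u2", "book"},
            new String[] {"u3", "lamp"},
            new String[] {"u4", "desk"});
        // Only u2 and u4 appear in both inputs, so two rows are printed.
        for (String row : innerJoin(users, orders)) {
            System.out.println(row);
        }
    }
}
```

Because each input is read once, in order, neither side has to be held in memory in full. That only works if the sort order of the two inputs agrees.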

On Thu, Jul 11, 2013 at 5:02 PM, Botelho, Andrew <Andrew.Botelho@emc.com> wrote:

I want to perform a JOIN on two sets of data with Hadoop.  I read that the class CompositeInputFormat
can be used to perform joins on data, but I can't find any examples of how to do it.
Could someone help me out? It would be much appreciated. :)

Thanks in advance,


Jay Vyas
