hadoop-common-user mailing list archives

From Amar Kamat <ama...@yahoo-inc.com>
Subject Re: reading input for a map function from 2 different files?
Date Mon, 10 Nov 2008 09:22:28 GMT
Amar Kamat wrote:
> some speed wrote:
>> I was wondering if it was possible to read the input for a map 
>> function from
>> 2 different files:
>>   1st file ---> user-input file from a particular location (path)
Is the input/user file sorted? If yes, then you can use a "map-side join" 
for performance reasons. See org.apache.hadoop.mapred.join for more details.
>> 2nd file ---> A resultant file (has just one <key,value> pair) from a
>> previous MapReduce job. (I am implementing a chain MapReduce function)
Can you explain in more detail the contents of 2nd file?
>> Now, for every <key,value> pair in the user-input file, I would like 
>> to use
>> the same <key,value> pair from the 2nd file for some calculations.
Can you explain this in more detail? Can you give an abstracted 
example of what file1 and file2 look like and what operation/processing 
you want to do?
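
If the 2nd file really holds just one <key,value> pair, a full join may 
not be needed: the pair could be read once and then reused for every 
record of the user-input file. Below is a minimal plain-Java sketch of 
that idea, with no Hadoop dependency. The class name, the method name and 
the "multiply by the side value" calculation are all made up for 
illustration. In a real job the pair would presumably be loaded once per 
map task (for example in the mapper's configure() method) rather than 
passed in as an argument.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SideDataSketch {

    // Applies the single value from the "2nd file" to every record of the
    // user input. The multiplication is a placeholder (an assumption) for
    // whatever calculation the job actually needs.
    static List<String> mapAll(Map<String, Double> userInput, double sideValue) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Double> record : userInput.entrySet()) {
            out.add(record.getKey() + "\t" + (record.getValue() * sideValue));
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for the user-input file: two <key,value> records.
        Map<String, Double> input = new LinkedHashMap<>();
        input.put("a", 1.0);
        input.put("b", 2.5);

        // Stand-in for the value of the single pair produced by the previous job.
        double sideValue = 2.0;

        for (String line : mapAll(input, sideValue)) {
            System.out.println(line);
        }
    }
}
```

The side value is read once and shared by all records, so no sorting or 
partitioning of either file is required for this case.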
> I guess you might need to do some kind of join on the 2 files. Look at 
> contrib/data_join for more details.
> Amar
>> Is it possible for me to do so? Can someone guide me in the right 
>> direction
>> please?
>> Thanks!
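
For the sorted case, the reason a map-side join can be cheap is that two 
inputs sorted on the same key can be joined in a single pass. Here is a 
small plain-Java sketch of that merge step only. It is not the 
org.apache.hadoop.mapred.join API; the class name, method name and output 
format are hypothetical, and it assumes keys are unique within each 
input. As far as I know, the Hadoop join package additionally expects the 
inputs to be partitioned the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SortedMergeJoinSketch {

    // Inner-joins two lists of {key, value} pairs. Both lists must already
    // be sorted by key (String order) and contain each key at most once.
    static List<String> join(List<String[]> left, List<String[]> right) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i)[0].compareTo(right.get(j)[0]);
            if (cmp == 0) {
                // Same key on both sides: emit key, left value, right value.
                out.add(left.get(i)[0] + "\t" + left.get(i)[1] + "," + right.get(j)[1]);
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // left key is smaller, so it has no match on the right
            } else {
                j++; // right key is smaller, so it has no match on the left
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> file1 = Arrays.asList(
                new String[] {"a", "1"}, new String[] {"b", "2"}, new String[] {"d", "4"});
        List<String[]> file2 = Arrays.asList(
                new String[] {"b", "x"}, new String[] {"c", "y"}, new String[] {"d", "z"});
        for (String line : join(file1, file2)) {
            System.out.println(line);
        }
    }
}
```

Each input is scanned once and nothing is buffered or shuffled, which is 
why this only works when the sort order is already in place.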
