hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re:
Date Wed, 18 Jul 2007 15:26:07 GMT


Would the MapFile class help?
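
A minimal sketch of the underlying pattern, in plain Java rather than the Hadoop API: the first job persists its keys to a file that every node can read, and each map task of the second job loads that file once before it processes any input. Hadoop's MapFile is a sorted, indexed on-disk key/value file that supports lookup by key, so it could fill the same role without holding every key in memory. The class name, method names, and the tab-separated input format below are hypothetical and chosen only for illustration; on a cluster the key file would live on the shared file system instead of a local temp file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the two-job pattern; not Hadoop code.
class SharedKeyLookup {

    // Job 1 stand-in: persist the keys the reduce would otherwise keep in a
    // static Set. Keys are de-duplicated and written sorted, one per line.
    static void writeKeys(Path keyFile, Iterable<String> keys) throws IOException {
        TreeSet<String> sorted = new TreeSet<>();
        for (String k : keys) {
            sorted.add(k);
        }
        Files.write(keyFile, sorted, StandardCharsets.UTF_8);
    }

    // Job 2 stand-in: each map task would call this once, before reading any
    // input, to rebuild the lookup Set from the shared file.
    static Set<String> loadKeys(Path keyFile) throws IOException {
        return new HashSet<>(Files.readAllLines(keyFile, StandardCharsets.UTF_8));
    }

    // Map step stand-in: keep only the lines whose first tab-separated field
    // is one of the keys produced by the first job (assumed input format).
    static List<String> filterLines(List<String> lines, Set<String> keys) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String key = line.split("\t", 2)[0];
            if (keys.contains(key)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        Path keyFile = Files.createTempFile("keys", ".txt");
        try {
            writeKeys(keyFile, Arrays.asList("b", "a", "b"));
            Set<String> keys = loadKeys(keyFile);
            List<String> input = Arrays.asList("a\t1", "c\t2", "b\t3");
            System.out.println(filterLines(input, keys));
        } finally {
            Files.deleteIfExists(keyFile);
        }
    }
}
```

Loading the whole file into a Set only works while the keys fit in each task's memory; beyond that, a per-key lookup against a sorted, indexed file is the more likely fit.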


On 7/17/07 10:05 PM, "Sandhya E" <sandhyabhaskar@gmail.com> wrote:

> Hi
> 
> I have two MapReduce jobs that run sequentially to accomplish a task. I first
> ran the jobs locally on a single machine.
> The first MapReduce job produces a set of keys, which were stored in memory in a
> Set instead of being passed to output.collect in the reduce. The second MapReduce
> job, working on different input files, looked up the keys in the Set to act on
> the input lines. Now I want to run the jobs on a small cluster, where in-memory
> storage will not work. How can the second map, running on various
> machines, load all the keys from the first MapReduce job before it starts
> working on the input files? Any ideas?
> 
> Many Thanks
> Sandhya

