hadoop-common-user mailing list archives

From Pankil Doshi <forpan...@gmail.com>
Subject Re: Modeling WordCount in a different way
Date Tue, 14 Apr 2009 12:59:08 GMT

I am running complex queries on Hadoop that need more than one job to
produce the final result. Job 1 handles some of the joins in the query, and
I want to pass its results as input to job 2 for further processing to get
the final results. The queries are such that I can't do all of the joins
and filtering in job 1, so I need two jobs.

Right now I write the results of job 1 to HDFS and read them back for job 2,
but that costs unnecessary I/O time. So I was looking for a way to keep the
results of job 1 in memory and use them as input for job 2.
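
The two-job pattern described above can be sketched roughly as follows. This
is only a toy model in plain Java, not Hadoop API code: a local temp file
stands in for the HDFS output directory, and the class and method names
(TwoStageSketch, stageOne, stageTwo, runPipeline) are made up for
illustration. Stage one is a word count, and stage two is an arbitrary second
pass (grouping words by their count) that reads stage one's output.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: a local temp file stands in for HDFS, and plain methods
// stand in for MapReduce jobs. All names here are hypothetical.
public class TwoStageSketch {

    // Stage 1 (stand-in for job 1): count words, write "word<TAB>count" lines.
    static void stageOne(List<String> lines, Path out) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        List<String> records = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            records.add(e.getKey() + "\t" + e.getValue());
        }
        Files.write(out, records);
    }

    // Stage 2 (stand-in for job 2): read stage 1's file, group words by count.
    static Map<Integer, List<String>> stageTwo(Path in) throws IOException {
        Map<Integer, List<String>> byCount = new TreeMap<>();
        for (String record : Files.readAllLines(in)) {
            String[] parts = record.split("\t");
            int count = Integer.parseInt(parts[1]);
            byCount.computeIfAbsent(count, k -> new ArrayList<>()).add(parts[0]);
        }
        return byCount;
    }

    // Driver: run stage 1, hand its output file to stage 2, then clean up.
    static Map<Integer, List<String>> runPipeline(List<String> lines) throws IOException {
        Path intermediate = Files.createTempFile("job1-output", ".tsv");
        try {
            stageOne(lines, intermediate);
            return stageTwo(intermediate);
        } finally {
            Files.deleteIfExists(intermediate);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(runPipeline(List.of("a b a", "b c a")));
    }
}
```

The intermediate file is the part that costs the extra write and read between
the two stages.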

Do let me know if you need any more details.

On Mon, Apr 13, 2009 at 9:51 PM, sharad agarwal <sharadag@yahoo-inc.com> wrote:

> Pankil Doshi wrote:
>> Hey
>> Did u find any class or way out for storing results of Job1 map/reduce in
>> memory and using that as an input to job2 map/Reduce? I am facing a
>> situation where I need to do similar thing. If anyone can help me out..
>
> Normally you would write the job output to a file and input that to the
> next job.
> Any reason why you want to store the map reduce output in memory ? If you
> can describe your problem, perhaps it could be solved in more mapreduce-ish
> way.
> - Sharad
