hadoop-common-user mailing list archives

From Amar Kamat <ama...@yahoo-inc.com>
Subject Re: Hadoop: Multiple map reduce or some better way
Date Thu, 27 Mar 2008 04:14:06 GMT
On Wed, 26 Mar 2008, Aayush Garg wrote:

> Hi,
> I am developing a simple inverted index program with Hadoop. My map
> function has the output:
> <word, doc>
> and the reducer has:
> <word, list(docs)>
> Now I want to use one more mapreduce to remove stop and scrub words from
Use the distributed cache, as Arun mentioned.
> this output. Also in the next stage I would like to have short summary
Whether to use a separate MR job depends on what exactly you mean by
summary. If it is something like a window of words around the current word,
then you can probably do it in the same job.
> associated with every word. How should I design my program from this stage?
> I mean how would I apply multiple mapreduce to this? What would be the
> better way to perform this?
> Thanks,
> Regards,
> -
> Aayush Garg,
> Phone: +41 76 482 240
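
To make the single-job idea concrete, here is a rough sketch in plain Java with no Hadoop dependency. It simulates the map and reduce steps in memory: the map step drops stop words and emits each remaining word with its document id and a small window of surrounding words, and the reduce step groups the postings per word. The class name, the method names, the WINDOW radius and the sample stop-word set are all made up for illustration. In a real job the stop-word set would presumably be loaded from a file shipped through the distributed cache rather than hard-coded, and the grouping would be done by the framework.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// In-memory sketch only; this is not Hadoop API code.
class InvertedIndexSketch {

    // Hypothetical window radius: number of words kept on each side of the hit.
    static final int WINDOW = 2;

    // "Map" step: emit (word, "docId: snippet") pairs, skipping stop words.
    static List<String[]> map(String docId, String text, Set<String> stopWords) {
        String[] tokens = text.toLowerCase().split("\\W+");
        List<String[]> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            String word = tokens[i];
            if (word.isEmpty() || stopWords.contains(word)) {
                continue; // stop words are never indexed
            }
            // The "summary" here is just a window of words around position i.
            int from = Math.max(0, i - WINDOW);
            int to = Math.min(tokens.length, i + WINDOW + 1);
            String snippet = String.join(" ", Arrays.copyOfRange(tokens, from, to));
            out.add(new String[] {word, docId + ": " + snippet});
        }
        return out;
    }

    // "Reduce" step: collect all postings for the same word into one list.
    static Map<String, List<String>> buildIndex(Map<String, String> docs,
                                                Set<String> stopWords) {
        Map<String, List<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            for (String[] pair : map(doc.getKey(), doc.getValue(), stopWords)) {
                index.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Sample stop words; a real list would come from a side file.
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a", "is", "of"));
        Map<String, String> docs = new TreeMap<>();
        docs.put("doc1", "Hadoop is a framework");
        docs.put("doc2", "The index of Hadoop");
        buildIndex(docs, stopWords)
            .forEach((word, postings) -> System.out.println(word + " -> " + postings));
    }
}
```

Note that the stop words are filtered out as index keys but still appear inside the snippets, which is probably what you want for a readable summary. If the summary needs information from the whole posting list instead of the text around each hit, a second job over this output is the more natural fit.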
