hadoop-common-user mailing list archives

From Boyu Zhang <boyuzhan...@gmail.com>
Subject Re: How To Run Multiple Map & Reduce Functions In One Job
Date Fri, 04 Sep 2009 19:22:02 GMT
Yes, the output of the first iteration is the input of the second iteration.
Actually, I am trying the PageRank problem. In that algorithm you have to run
several iterations, each using the output of the previous iteration as input
and producing the input for the next.

It is not a real-life application; I just want to try some applications with
iterations. Thanks a lot!
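The iteration structure described above — each pass's output feeding the next pass's input — can be illustrated with a small, self-contained Python simulation of a PageRank map/reduce pass. This is a toy stand-in for the real Hadoop jobs, using a made-up four-page link graph (the graph, damping factor, and function names are illustrative assumptions, not anything from this thread):

```python
from collections import defaultdict

# Toy link graph (hypothetical data): page -> pages it links to.
GRAPH = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
DAMPING = 0.85
N = len(GRAPH)

def map_phase(ranks):
    """Map: each page emits an equal share of its rank to every outlink."""
    for page, outlinks in GRAPH.items():
        share = ranks[page] / len(outlinks)
        for target in outlinks:
            yield target, share

def reduce_phase(mapped):
    """Reduce: sum the incoming shares per page, apply the damping factor."""
    totals = defaultdict(float)
    for page, share in mapped:
        totals[page] += share
    return {page: (1 - DAMPING) / N + DAMPING * totals[page]
            for page in GRAPH}

def pagerank(iterations=10):
    # The output of each pass is the input of the next pass --
    # exactly the chaining the thread is about.
    ranks = {page: 1.0 / N for page in GRAPH}
    for _ in range(iterations):
        ranks = reduce_phase(map_phase(ranks))
    return ranks

ranks = pagerank()
print(ranks)
```

In the real Hadoop version, each loop iteration would be one submitted job and `ranks` would live in HDFS files rather than a dictionary; the loop structure is the same.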

Boyu

On Fri, Sep 4, 2009 at 2:51 PM, Amandeep Khurana <amansk@gmail.com> wrote:

> Wait... why are you using the same mapper and reducer and calling it 10
> times? Is the output of the first iteration being input into the second
> one?
> What are these jobs doing? Tell me a bit more about that. There might be a
> way by which you can club some jobs together into one job and reduce the
> overheads...
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Fri, Sep 4, 2009 at 11:48 AM, Boyu Zhang <boyuzhang35@gmail.com> wrote:
>
> > Dear Amandeep,
> >
> > Thanks for the fast reply. I will try the method you mentioned.
> >
> > In my understanding, when a job is submitted, there will be a separate
> > Java process in the jobtracker responsible for that job, and there will
> > be an initialization and cleanup cost for each job. If every iteration is
> > a new job, the jobs will be created sequentially by the jobtracker. Say
> > there are 10 iterations in my code; then 10 jobs will be submitted to the
> > jobtracker. I am just wondering whether there is a way to submit 1 job
> > but run 10 iterations, since they all use the same mapper and reducer
> > classes. That is basically why I think they are costly; maybe there is
> > something that I misunderstood. I hope you will correct me if I am wrong.
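The driver-side bookkeeping described above — a sequence of jobs where each one reads the previous job's output directory — can be sketched generically. This is a hypothetical helper with an invented path-naming scheme, not Hadoop API; each yielded pair would become one submitted job in a real driver:

```python
def iteration_paths(base_input, n_iterations):
    """Yield one (input, output) path pair per map-reduce pass,
    chaining each pass's output directory into the next pass's input."""
    current_input = base_input
    for i in range(n_iterations):
        output = f"iter-{i:02d}"  # hypothetical output-directory naming
        yield current_input, output
        current_input = output

pairs = list(iteration_paths("input", 3))
print(pairs)
# -> [('input', 'iter-00'), ('iter-00', 'iter-01'), ('iter-01', 'iter-02')]
```

The per-job setup and cleanup cost Boyu mentions is paid once for every pair in this list, which is why collapsing the chain into fewer jobs reduces overhead.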
> >
> > Again, thanks a lot for replying!
> >
> > Boyu
> >
> > On Fri, Sep 4, 2009 at 2:39 PM, Amandeep Khurana <amansk@gmail.com> wrote:
> >
> > > You can create different mapper and reducer classes and create
> > > separate job configs for them. You can pass these different configs to
> > > the Tool object in the same parent class... but they will essentially
> > > be different jobs being called together from inside the same Java
> > > parent class.
> > >
> > > Why do you say it costs a lot? What's the issue?
> > >
> > >
> > > Amandeep Khurana
> > > Computer Science Graduate Student
> > > University of California, Santa Cruz
> > >
> > >
> > > On Fri, Sep 4, 2009 at 11:36 AM, Boyu Zhang <boyuzhang35@gmail.com> wrote:
> > >
> > > > Dear All,
> > > >
> > > > I am using Hadoop 0.20.0. I have an application that needs to run
> > > > map-reduce functions iteratively. Right now, the way I am doing this
> > > > is to create a new Job for each pass of the map-reduce. That seems to
> > > > cost a lot. Is there any way to run map-reduce functions iteratively
> > > > in one Job?
> > > >
> > > > Thanks a lot for your time!
> > > >
> > > > Boyu Zhang(Emma)
> > > >
> > >
> >
>
