hadoop-mapreduce-dev mailing list archives

From Guo Leitao <leitao....@gmail.com>
Subject Re: Why hadoop jobs need setup and cleanup phases which would consume a lot of time ?
Date Thu, 11 Mar 2010 13:56:05 GMT
From our test of hadoop-0.20.1 on 10 nodes, we find the setup period grows
longer as more jobs are submitted. I don't understand why a map task is
needed for setup; why not have the JobTracker, or a single thread, take over
this work?
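For context (this sketch is not from the thread): in 0.20.x the setup and cleanup work that Jeff mentions below is delegated to the job's OutputCommitter, and the default FileOutputCommitter's job setup roughly amounts to creating a `_temporary` scratch directory under the output path, with cleanup removing it. A minimal local-filesystem stand-in, assuming that behavior (the class here is a toy and does not depend on Hadoop; only the method names mirror the real API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Toy stand-in for what FileOutputCommitter's job setup/cleanup roughly do
// in hadoop-0.20.x: create a _temporary scratch directory on setup, and
// delete it on cleanup. (Promotion of task outputs is omitted for brevity.)
public class ToyOutputCommitter {
    static final String TEMP_DIR = "_temporary";
    private final Path outputPath;

    public ToyOutputCommitter(Path outputPath) {
        this.outputPath = outputPath;
    }

    // Analogous to OutputCommitter.setupJob(JobContext)
    public void setupJob() throws IOException {
        Files.createDirectories(outputPath.resolve(TEMP_DIR));
    }

    // Analogous to OutputCommitter.cleanupJob(JobContext) in 0.20.x
    public void cleanupJob() throws IOException {
        Path tmp = outputPath.resolve(TEMP_DIR);
        if (Files.exists(tmp)) {
            try (Stream<Path> walk = Files.walk(tmp)) {
                // delete children before parents
                walk.sorted(Comparator.reverseOrder())
                    .forEach(p -> p.toFile().delete());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("job-output");
        ToyOutputCommitter committer = new ToyOutputCommitter(out);
        committer.setupJob();
        System.out.println(Files.exists(out.resolve(TEMP_DIR)));  // true
        committer.cleanupJob();
        System.out.println(Files.exists(out.resolve(TEMP_DIR)));  // false
    }
}
```

The filesystem work itself is cheap; the observed latency comes from scheduling the setup/cleanup work as map tasks on the cluster rather than from what the committer does.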

2010/3/11 Jeff Zhang <zjffdu@gmail.com>

> Hi Zhou,
>
> I looked at the source code; it seems it is the JobTracker that initiates
> the setup and cleanup tasks.
> Also, why do you think the setup and cleanup phases consume a lot of time?
> The actual time cost depends on the OutputCommitter.
>
> On Thu, Mar 11, 2010 at 11:04 AM, Min Zhou <coderplay@gmail.com> wrote:
>
> > Hi all,
> >
> > Why do Hadoop jobs need setup and cleanup phases, which can consume a
> > lot of time? Why couldn't we achieve this the way a distributed RDBMS
> > does, with a master process coordinating all slave nodes through
> > sockets? I think that would save plenty of time if there were no setups
> > and cleanups. What is Hadoop's philosophy on this?
> >
> > Thanks,
> > Min
> > --
> > My research interests are distributed systems, parallel computing and
> > bytecode based virtual machine.
> >
> > My profile:
> > http://www.linkedin.com/in/coderplay
> > My blog:
> > http://coderplay.javaeye.com
> >
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
