hadoop-common-user mailing list archives

From "Daniel Blaisdell" <lunk.dj...@gmail.com>
Subject Re: Percent progress of map/reduce in JobClient
Date Wed, 04 Jun 2008 22:53:34 GMT
Is the map progress indicator computed as a percentage of maps completed?

-Daniel

On Wed, Jun 4, 2008 at 6:51 PM, Tanton Gibbs <tanton.gibbs@gmail.com> wrote:

> From what I've read, there are three reduce phases: 1. copy, 2. sort,
> 3. reduce.
> From 0 to 33% is the copy phase.  I guess if a job doesn't need that
> phase, it could be skipped completely.
> After 33%, the reduce task waits until it is done sorting before
> reporting status again at 66%, then it updates regularly during the
> reduce phase up to 100%.  This has been my experience, at least.
>
> Tanton
>
> On Wed, Jun 4, 2008 at 4:19 PM, Stuart Sierra <mail@stuartsierra.com>
> wrote:
> > How does Hadoop decide when to update the "percent complete" for
> > map/reduce tasks?  I've been running a small job (~150 MB) on a
> > pseudo-distributed cluster.  "bin/hadoop jar" prints:
> >
> > 08/06/04 17:02:16 INFO mapred.JobClient:  map 0% reduce 0%
> > 08/06/04 17:05:52 INFO mapred.JobClient:  map 100% reduce 0%
> > 08/06/04 17:06:05 INFO mapred.JobClient:  map 100% reduce 66%
> > 08/06/04 17:06:10 INFO mapred.JobClient:  map 100% reduce 67%
> > 08/06/04 17:06:17 INFO mapred.JobClient:  map 100% reduce 68%
> >
> > And so on until the job completes.  What seems odd is that I don't get
> > any feedback at all on the progress of the map task until it reaches
> > 100%, and I get no feedback on the reduce task until it reaches 66%.
> > After that, I get updates every few seconds.  The TaskTracker shows
> > the same thing.  What might cause this?
> >
> > This is Hadoop 0.17.  The input and output are both text, both ~140MB,
> > gzip-compressed down to ~12MB.
> >
> > Thanks,
> > -Stuart
> >
>
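
As a rough illustration of the three-phase accounting Tanton describes
above, here is a minimal Java sketch. It assumes each phase (copy, sort,
reduce) contributes an equal third of the reported reduce percentage, which
is what the thread reports from experience rather than something confirmed
against the Hadoop source. The class and method names (ReduceProgressSketch,
overall, percent) are hypothetical and are not part of the Hadoop API.

```java
// Hypothetical sketch, NOT Hadoop code: models reduce progress as three
// equally weighted phases (copy, sort, reduce), as described in the thread.
class ReduceProgressSketch {

    // Clamp a per-phase fraction into the range [0.0, 1.0].
    static double clamp(double fraction) {
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    // Overall progress in [0.0, 1.0]: the average of the three phase fractions.
    // Assumption: each phase counts for exactly one third.
    static double overall(double copy, double sort, double reduce) {
        return (clamp(copy) + clamp(sort) + clamp(reduce)) / 3.0;
    }

    // Whole-number percentage, truncated toward zero (assumed display rule).
    static int percent(double copy, double sort, double reduce) {
        return (int) (overall(copy, sort, reduce) * 100.0);
    }

    public static void main(String[] args) {
        // Copy finished, sort and reduce not started.
        System.out.println("reduce " + percent(1.0, 0.0, 0.0) + "%");
        // Copy and sort finished, reduce not started.
        System.out.println("reduce " + percent(1.0, 1.0, 0.0) + "%");
        // All three phases finished.
        System.out.println("reduce " + percent(1.0, 1.0, 1.0) + "%");
    }
}
```

Under these assumptions, a reduce task that has finished copying and
sorting but has not yet processed any records reports 66%, which is the
first non-zero reduce value in Stuart's log.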
