hadoop-mapreduce-user mailing list archives

From Sudharsan Sampath <sudha...@gmail.com>
Subject Re: No Mapper but Reducer
Date Wed, 07 Sep 2011 09:39:26 GMT
This is true, and it took us by surprise in the recent past. It also had
quite an impact on our job cycles, where the size of the input is totally
random and can be zero at times.

In one of our cycles, we run a lot of jobs. Say a job has no input, and let:

X -> Number of reducers configured for the job

Y -> Number of tasktrackers in the cluster

H -> Time interval for the heartbeat response, in seconds

With the cdh2 version, the job takes

(X / Y) * H seconds

to complete without doing any work, since only one reduce task is assigned
per heartbeat.
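
As a rough illustration of the estimate above, here is a minimal Python sketch. The function name and the sample numbers (100 reducers, 10 tasktrackers, a 3-second heartbeat) are hypothetical, chosen only to show the arithmetic; they are not measurements from any real cluster.

```python
def idle_seconds(num_reducers, num_tasktrackers, heartbeat_seconds):
    """Estimate (X / Y) * H: the time an empty-input job spends
    just getting its reduce tasks scheduled."""
    if num_tasktrackers <= 0:
        raise ValueError("need at least one tasktracker")
    return (num_reducers / num_tasktrackers) * heartbeat_seconds


# Hypothetical values, for illustration only.
print(idle_seconds(100, 10, 3))
```

With these made-up values the estimate comes to 30 seconds per empty job, which is then multiplied by however many such jobs a cycle contains.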


If the cycle contains many such jobs, the total time that the cluster
spends doing nothing adds up.

I was thinking of raising this as a JIRA but was not sure. Should we raise and
fix this as a JIRA request? Could the number of reducers set by the client be
overridden when the number of mappers is 0?

We have a workaround: we verify the existence of the input path to the
map phase ourselves. But I thought it would be more intuitive for the
framework to handle this itself.
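
A minimal sketch of this kind of pre-check, assuming the `hadoop` CLI is on the PATH and using `hadoop fs -test -e`, which exits with 0 when the path exists. The function names and the `submit` callback are hypothetical and only illustrate the idea; this is not our actual script. Note that an existing but empty directory (as in the quoted example below) would still pass an existence-only check.

```python
import subprocess


def input_exists(path, run=subprocess.call):
    """Return True if `hadoop fs -test -e <path>` exits with status 0."""
    return run(["hadoop", "fs", "-test", "-e", path]) == 0


def submit_if_input(path, submit, run=subprocess.call):
    """Call submit() only when the input path exists.

    `submit` is a hypothetical callback that launches the real job.
    Returns True if the job was submitted, False if it was skipped.
    """
    if not input_exists(path, run):
        return False
    submit()
    return True
```

The `run` parameter is there only so the check can be exercised without a cluster; by default it shells out to the real `hadoop` command.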

-Sudhan S

On Wed, Sep 7, 2011 at 2:25 PM, Harsh J <harsh@cloudera.com> wrote:

> Oh boy are you in for a surprise. Reducers _can_ run with 0 mappers in a
> job ;-)
>
> /me puts his troll-mask on.
>
> ➜  ~HADOOP_HOME  hadoop fs -mkdir abc
> ➜  ~HADOOP_HOME  hadoop jar hadoop-examples-0.20.2-cdh3u1.jar wordcount abc out
> 11/09/07 14:24:14 INFO input.FileInputFormat: Total input paths to process : 0
> 11/09/07 14:24:14 INFO mapred.JobClient: Running job: job_201109071413_0001
> 11/09/07 14:24:15 INFO mapred.JobClient:  map 0% reduce 0%
> 11/09/07 14:24:21 INFO mapred.JobClient:  map 0% reduce 100%
> 11/09/07 14:24:22 INFO mapred.JobClient: Job complete: job_201109071413_0001
> 11/09/07 14:24:22 INFO mapred.JobClient: Counters: 13
> 11/09/07 14:24:22 INFO mapred.JobClient:   Job Counters
> 11/09/07 14:24:22 INFO mapred.JobClient:     Launched reduce tasks=1
> 11/09/07 14:24:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2209
> 11/09/07 14:24:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> 11/09/07 14:24:22 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> 11/09/07 14:24:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=3113
> 11/09/07 14:24:22 INFO mapred.JobClient:   FileSystemCounters
> 11/09/07 14:24:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=59220
> 11/09/07 14:24:22 INFO mapred.JobClient:   Map-Reduce Framework
> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce input groups=0
> 11/09/07 14:24:22 INFO mapred.JobClient:     Combine output records=0
> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce shuffle bytes=0
> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce output records=0
> 11/09/07 14:24:22 INFO mapred.JobClient:     Spilled Records=0
> 11/09/07 14:24:22 INFO mapred.JobClient:     Combine input records=0
> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce input records=0
>
> /me takes off troll mask.
>
> On Wed, Sep 7, 2011 at 1:30 PM, Bejoy KS <bejoy.hadoop@gmail.com> wrote:
> > Thanks Sonal. I was just thinking of some weird design and wanted to make
> > sure whether there is a possibility like that: no maps and all reducers.
> >
> > On Wed, Sep 7, 2011 at 1:22 PM, Sonal Goyal <sonalgoyal4@gmail.com> wrote:
> >>
> >> I don't think that is possible. Can you explain in what scenario you want
> >> to have no mappers, only reducers?
> >> Best Regards,
> >> Sonal
> >> Crux: Reporting for HBase
> >> Nube Technologies
> >>
> >> On Wed, Sep 7, 2011 at 1:18 PM, Bejoy KS <bejoy.hadoop@gmail.com> wrote:
> >>>
> >>> Hi
> >>>           I have a query here. Is it possible to have no mappers but
> >>> reducers alone? AFAIK, if we need to avoid the triggering of reducers we can
> >>> set numReduceTasks to zero, but such a setting on the mapper won't work. So how
> >>> can it be achieved, if possible?
> >>>
> >>> Thank You
> >>>
> >>> Regards
> >>> Bejoy.K.S
> >>
> >
> >
>
>
>
> --
> Harsh J
>
