hadoop-mapreduce-user mailing list archives

From Bejoy KS <bejoy.had...@gmail.com>
Subject Re: No Mapper but Reducer
Date Wed, 07 Sep 2011 11:21:21 GMT
Thank you all. I too noticed this strange behavior some time back.
Now my initial concern still remains. If I provide an empty input
directory, then yes, the map tasks won't be executed. But my reducer needs input
to do the processing/aggregation. In such a scenario, is there an option to
provide input just to the reducer?

Regards
Bejoy.K.S

On Wed, Sep 7, 2011 at 3:09 PM, Sudharsan Sampath <sudhan65@gmail.com> wrote:

> This is true, and it took us by surprise in the recent past. Also, it had
> quite some impact on our job cycles, where the size of the input is totally
> random and could also be zero at times.
>
> In one of our cycles, we run a lot of jobs. Say we configure X as the num
> of reducers for a job which does not have any input.
>
> Y -> No of tasktrackers in the cluster
>
> H -> Time Interval for Heartbeat response
>
> With the cdh2 version, the job takes,
>
> ( X / Y) * H seconds to complete without doing any work since we assign
> only one reduce task per heartbeat
>
>
> If the number of such jobs in the cycle is large, then the total time that
> the cluster spends doing nothing accumulates.
>
> I was thinking of raising this as a JIRA but am not sure. Should we raise and
> fix this as a JIRA request? Could the number of reducers set by the client be
> overridden if the number of mappers is 0?
>
> We have a workaround, verifying the existence of the input path to the
> Map phase ourselves, but thought it would be more intuitive for the
> framework to handle this itself.
>
> -Sudhan S
>
> On Wed, Sep 7, 2011 at 2:25 PM, Harsh J <harsh@cloudera.com> wrote:
>
>> Oh boy are you in for a surprise. Reducers _can_ run with 0 mappers in a
>> job ;-)
>>
>> /me puts his troll-mask on.
>>
>> ➜  ~HADOOP_HOME  hadoop fs -mkdir abc
>> ➜  ~HADOOP_HOME  hadoop jar hadoop-examples-0.20.2-cdh3u1.jar wordcount abc out
>> 11/09/07 14:24:14 INFO input.FileInputFormat: Total input paths to process : 0
>> 11/09/07 14:24:14 INFO mapred.JobClient: Running job: job_201109071413_0001
>> 11/09/07 14:24:15 INFO mapred.JobClient:  map 0% reduce 0%
>> 11/09/07 14:24:21 INFO mapred.JobClient:  map 0% reduce 100%
>> 11/09/07 14:24:22 INFO mapred.JobClient: Job complete: job_201109071413_0001
>> 11/09/07 14:24:22 INFO mapred.JobClient: Counters: 13
>> 11/09/07 14:24:22 INFO mapred.JobClient:   Job Counters
>> 11/09/07 14:24:22 INFO mapred.JobClient:     Launched reduce tasks=1
>> 11/09/07 14:24:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2209
>> 11/09/07 14:24:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
>> 11/09/07 14:24:22 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
>> 11/09/07 14:24:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=3113
>> 11/09/07 14:24:22 INFO mapred.JobClient:   FileSystemCounters
>> 11/09/07 14:24:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=59220
>> 11/09/07 14:24:22 INFO mapred.JobClient:   Map-Reduce Framework
>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce input groups=0
>> 11/09/07 14:24:22 INFO mapred.JobClient:     Combine output records=0
>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce shuffle bytes=0
>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce output records=0
>> 11/09/07 14:24:22 INFO mapred.JobClient:     Spilled Records=0
>> 11/09/07 14:24:22 INFO mapred.JobClient:     Combine input records=0
>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce input records=0
>>
>> /me takes off troll mask.
>>
>> On Wed, Sep 7, 2011 at 1:30 PM, Bejoy KS <bejoy.hadoop@gmail.com> wrote:
>> > Thanks Sonal. I was just thinking of some weird design and wanted to make
>> > sure whether there is a possibility like that: no maps and all reducers.
>> >
>> > On Wed, Sep 7, 2011 at 1:22 PM, Sonal Goyal <sonalgoyal4@gmail.com> wrote:
>> >>
>> >> I don't think that is possible. Can you explain in what scenario you want
>> >> to have no mappers, only reducers?
>> >> Best Regards,
>> >> Sonal
>> >> Crux: Reporting for HBase
>> >> Nube Technologies
>> >>
>> >> On Wed, Sep 7, 2011 at 1:18 PM, Bejoy KS <bejoy.hadoop@gmail.com> wrote:
>> >>>
>> >>> Hi
>> >>> I have a query here. Is it possible to have no mappers but
>> >>> reducers alone? AFAIK, if we need to avoid the triggering of reducers we can
>> >>> set numReduceTasks to zero, but such a setting on the mapper won't work. So how
>> >>> can it be achieved, if possible?
>> >>>
>> >>> Thank You
>> >>>
>> >>> Regards
>> >>> Bejoy.K.S
>> >>
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>

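Sudhan's estimate of (X / Y) * H seconds for a job with no input can be worked through with a small sketch. The function names and the sample numbers below are hypothetical, chosen only for illustration, and the cycle total assumes the empty jobs run one after another, which the thread does not state.

```python
def idle_seconds(num_reducers, num_tasktrackers, heartbeat_seconds):
    """Estimate from the thread: (X / Y) * H seconds for a job with no input.

    X = configured reducers, Y = tasktrackers in the cluster,
    H = heartbeat response interval in seconds.
    """
    if num_tasktrackers <= 0:
        raise ValueError("need at least one tasktracker")
    return (num_reducers / num_tasktrackers) * heartbeat_seconds


def cycle_idle_seconds(num_empty_jobs, num_reducers, num_tasktrackers,
                       heartbeat_seconds):
    """Total idle time for a cycle, assuming the empty jobs run sequentially."""
    per_job = idle_seconds(num_reducers, num_tasktrackers, heartbeat_seconds)
    return num_empty_jobs * per_job


if __name__ == "__main__":
    # Illustrative values only, not taken from the thread.
    print(idle_seconds(100, 10, 3))            # one empty job
    print(cycle_idle_seconds(50, 100, 10, 3))  # fifty such jobs in a cycle
```

With these made-up numbers, a single empty job would hold the cluster for about 30 seconds, and fifty of them for about 25 minutes, which is the accumulation Sudhan describes.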