hbase-user mailing list archives

From "Jean-Daniel Cryans" <jdcry...@apache.org>
Subject Re: Map and Reduce tasks not restricted by setNumMapTasks and setNumReduceTasks in JobConfiguration class - JobConf class related?
Date Fri, 29 Aug 2008 20:42:59 GMT
Andy,

Yes, it's the total number of map tasks launched over the whole MR job. If
the job was mapping a table, that implies the table has 600+ regions.
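
To make the distinction concrete: the 84 reported by cs.getMaxMapTasks() is, as far as I understand it, the number of map tasks the cluster can run at the same time, while "Launched map tasks" is a running total for the whole job. A rough sketch of the arithmetic, using the numbers from the quoted output (the class and method names here are made up for illustration and are not part of Hadoop):

```java
public class TaskWaves {
    /**
     * Minimum number of scheduling "waves" needed to run totalTasks
     * when at most slotCapacity tasks can run concurrently.
     * This is plain ceiling division; real scheduling is not this tidy.
     */
    public static int minWaves(int totalTasks, int slotCapacity) {
        if (totalTasks < 0 || slotCapacity <= 0) {
            throw new IllegalArgumentException("totalTasks must be >= 0 and slotCapacity > 0");
        }
        return (totalTasks + slotCapacity - 1) / slotCapacity;
    }

    public static void main(String[] args) {
        int launchedMaps = 629; // "Launched map tasks" from the quoted job output
        int maxMapSlots = 84;   // cs.getMaxMapTasks() from the quoted job output

        // 629 tasks cannot all run at once on 84 slots, so they run in batches.
        System.out.println("Minimum map waves: " + minWaves(launchedMaps, maxMapSlots));
        // prints "Minimum map waves: 8"
    }
}
```

So a launched count above the cluster maximum is not a contradiction; the tasks simply run a batch at a time. Also note that, per the JobConf documentation, setNumMapTasks is only a hint to the framework, since the number of maps usually follows the number of input splits.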

J-D

On Fri, Aug 29, 2008 at 4:38 PM, Andy Li <annndy.lee@gmail.com> wrote:

> To whom it may concern,
>
> I invoked JobClient to fetch the maximum number of Mappers and Reducers
> supported by the cluster, but when I launched the MR job,
> I saw much larger numbers in the MR output, as shown below.
>
> Invoked command.....
> ----
> -m = 20
> -r = 5
> -pi = /test/data/20080828/22/
> -po = /test/data/sample_out2/
> -jm = Xmx64m
> -srcID = 1
> Cluster support max Mappers: 84  (from cs.getMaxMapTasks())
> Cluster support max Reducers: 84 (from cs.getMaxReduceTasks())
> 08/08/29 16:18:06 INFO mapred.FileInputFormat: Total input paths to process : 30
> 08/08/29 16:18:07 INFO mapred.JobClient: Running job: job_200808221900_0133
> 08/08/29 16:18:08 INFO mapred.JobClient:  map 0% reduce 0%
> 08/08/29 16:18:16 INFO mapred.JobClient:  map 2% reduce 0%
> 08/08/29 16:18:17 INFO mapred.JobClient:  map 3% reduce 0%
> 08/08/29 16:18:19 INFO mapred.JobClient:  map 4% reduce 0%
> .....
> 08/08/29 16:20:42 INFO mapred.JobClient:  map 100% reduce 73%
> 08/08/29 16:20:43 INFO mapred.JobClient:  map 100% reduce 86%
> 08/08/29 16:20:46 INFO mapred.JobClient:  map 100% reduce 100%
> 08/08/29 16:20:47 INFO mapred.JobClient: Job complete: job_200808221900_0133
> 08/08/29 16:20:47 INFO mapred.JobClient: Counters: 17
> 08/08/29 16:20:47 INFO mapred.JobClient:   File Systems
> 08/08/29 16:20:47 INFO mapred.JobClient:     Local bytes read=53303668
> 08/08/29 16:20:47 INFO mapred.JobClient:     Local bytes written=107722772
> 08/08/29 16:20:47 INFO mapred.JobClient:     HDFS bytes read=5091952163
> 08/08/29 16:20:47 INFO mapred.JobClient:     HDFS bytes written=6132111
> 08/08/29 16:20:47 INFO mapred.JobClient:   Job Counters
> 08/08/29 16:20:47 INFO mapred.JobClient:     Launched map tasks=629
> 08/08/29 16:20:47 INFO mapred.JobClient:     Launched reduce tasks=7
> 08/08/29 16:20:47 INFO mapred.JobClient:     Data-local map tasks=606
> 08/08/29 16:20:47 INFO mapred.JobClient:     Rack-local map tasks=11
> 08/08/29 16:20:47 INFO mapred.JobClient:   Map-Reduce Framework
> 08/08/29 16:20:47 INFO mapred.JobClient:     Map input records=29976603
> 08/08/29 16:20:47 INFO mapred.JobClient:     Map output records=29974115
> 08/08/29 16:20:47 INFO mapred.JobClient:     Map input bytes=5089555416
> 08/08/29 16:20:47 INFO mapred.JobClient:     Map output bytes=421953213
> 08/08/29 16:20:47 INFO mapred.JobClient:     Combine input records=29974115
> 08/08/29 16:20:47 INFO mapred.JobClient:     Combine output records=2199921
> 08/08/29 16:20:47 INFO mapred.JobClient:     Reduce input groups=347890
> 08/08/29 16:20:47 INFO mapred.JobClient:     Reduce input records=2199921
> 08/08/29 16:20:47 INFO mapred.JobClient:     Reduce output records=347890
>
> Does anyone know why the numbers for "Launched map tasks" and "Launched
> reduce tasks" do not match the limits I expected? The map count is far
> higher than the cluster's supported Mapper number.
>
> When I invoke:
>
> JobClient job = new JobClient(jconf);
> ClusterStatus cs = job.getClusterStatus();
> int max_map_num = cs.getMaxMapTasks();
> int max_red_num = cs.getMaxReduceTasks();
>
> max_map_num and max_red_num both show only 84, unlike the numbers from the
> console.
>
> Any idea? Does "Launched map tasks" mean the total number of maps launched
> during the entire job?
> For example, forked 10 maps, killed 10 maps, forked 20 maps, killed 20
> maps,
> forked 35 maps, killed 35 maps, which gives a total of 65 maps launched?
>
> Thanks,
> Andy
>
