hadoop-common-user mailing list archives

From Adam Kawa <kawa.a...@gmail.com>
Subject Re: The number of simultaneous map tasks is unexpected.
Date Tue, 08 Jul 2014 22:56:24 GMT
When you run an application (e.g. a MapReduce job) on a YARN cluster, the
ApplicationMaster is started first on one of the slave nodes to coordinate the
execution of all tasks within the job. The ApplicationMaster and the tasks that
belong to its application run in containers controlled by the
NodeManagers.

Maybe you are simply running 8 containers on your YARN cluster: 1 container is
consumed by the MapReduce AppMaster and 7 containers are consumed by map tasks.
But that does not seem to be the root cause of your problem, because according to
your settings you should be able to run at most 16 containers.

Another possibility is that you are bottlenecked by the amount of memory on
the cluster (each container consumes memory), so despite having vcores
available, you cannot launch new tasks. When you go to the ResourceManager
Web UI, do you see that the whole cluster memory is in use?
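
To make the arithmetic concrete, here is a rough sketch of how memory can cap
the number of containers below what the vcores would allow. The node count and
vcore figures come from this thread; the memory figures are hypothetical and
only stand in for whatever yarn.nodemanager.resource.memory-mb and
mapreduce.map.memory.mb are set to on your cluster. The class and method names
are made up for illustration, and the sketch assumes the AppMaster container
is the same size as a map container. Depending on the scheduler
configuration, vcores may not be taken into account at all, in which case
memory alone decides.

```java
class ContainerEstimate {

    /** Containers one node can host: whichever resource runs out first sets the limit. */
    static int maxContainersPerNode(int nodeMemoryMb, int nodeVcores,
                                    int containerMemoryMb, int containerVcores) {
        int byMemory = nodeMemoryMb / containerMemoryMb;
        int byVcores = nodeVcores / containerVcores;
        return Math.min(byMemory, byVcores);
    }

    /** Map tasks that can run at once: one container goes to the ApplicationMaster. */
    static int concurrentMapTasks(int nodes, int containersPerNode) {
        return Math.max(0, nodes * containersPerNode - 1);
    }

    public static void main(String[] args) {
        int nodes = 4;                 // from the thread: 4 slave nodes
        int nodeVcores = 4;            // from the thread: 4 vcores per node
        int containerVcores = 1;       // from the thread: minimum allocation of 1 vcore
        int nodeMemoryMb = 2048;       // hypothetical: memory offered to YARN per node
        int containerMemoryMb = 1024;  // hypothetical: memory requested per container

        int perNode = maxContainersPerNode(nodeMemoryMb, nodeVcores,
                                           containerMemoryMb, containerVcores);
        System.out.println("containers per node: " + perNode);
        System.out.println("concurrent map tasks: " + concurrentMapTasks(nodes, perNode));
    }
}
```

With these assumed memory values each node fits only 2 containers, so the
cluster runs 8 containers in total: 1 AppMaster and 7 map tasks. If memory
were not the limit, the same calculation would give 16 containers.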



2014-07-08 21:06 GMT+02:00 Tomasz Guziałek <tomasz@guzialek.info>:

> I was not precise when describing my cluster. I have 4 slave nodes and a
> separate master node. The master has ResourceManager role (along with
> JobHistory role) and the rest have NodeManager roles. If this really is an
> ApplicationMaster, is it possible to schedule it on the master node? This
> single waiting map task is doubling my execution time.
>
> Pozdrawiam / Regards / Med venlig hilsen
> Tomasz Guziałek
>
>
> 2014-07-08 18:42 GMT+02:00 Adam Kawa <kawa.adam@gmail.com>:
>
>> Is not your MapReduce AppMaster occupying one slot?
>>
>> Sent from my iPhone
>>
>> > On 8 Jul 2014, at 13:01, Tomasz Guziałek <tomaszguzialek@gmail.com> wrote:
>> >
>> > Hello all,
>> >
>> > I am running a 4-node CDH5 cluster on Amazon EC2. The instances used
>> > are m1.large, so I have 4 cores (2 cores x 2 units) per node. My HBase table
>> > has 8 regions, so I expected at least 8 (if not 16) mapper tasks to run
>> > simultaneously. However, only 7 are running and 1 is waiting for an empty
>> > slot. Why did this surprising number come up? I have checked that the regions
>> > are equally distributed across the region servers (2 per node).
>> >
>> > My properties in the job:
>> > Configuration mapReduceConfiguration = HBaseConfiguration.create();
>> > mapReduceConfiguration.set("hbase.client.max.perregion.tasks", "4");
>> > mapReduceConfiguration.set("mapreduce.tasktracker.map.tasks.maximum", "16");
>> >
>> > My properties in the CDH:
>> > yarn.scheduler.minimum-allocation-vcores = 1
>> > yarn.scheduler.maximum-allocation-vcores = 4
>> >
>> > Am I missing some property? Please share your experience.
>> >
>> > Best regards
>> > Tomasz
>>
>
>
