hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Re: About slots of tasktracker and number of map tasks
Date Mon, 12 Dec 2011 05:03:02 GMT

As an admin, I can choose to configure even 100 slots on a 4-core
node, if I feel like burning the box. There is no hardware
auto-detection; the slot limit is controlled entirely by the
mapred-site.xml on that TaskTracker.
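
For illustration, a minimal sketch of what such a mapred-site.xml
fragment could look like, assuming the Hadoop 0.20/1.x property names
mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum. The values 8 and 4 are only
the example figures quoted further down for a 12-core machine, not a
recommendation:

```xml
<!-- mapred-site.xml on a single TaskTracker node.
     Assumes Hadoop 0.20/1.x property names; values are illustrative. -->
<configuration>
  <!-- Maximum number of map tasks this TaskTracker runs at once (default 2) -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
  </property>
  <!-- Maximum number of reduce tasks this TaskTracker runs at once (default 2) -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

These are per-node settings, so each TaskTracker can carry different
values to match its own hardware. A TaskTracker normally has to be
restarted before it picks up a change.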

The book is merely telling you that you need to choose these maximum
slot settings based on what you know about the hardware of each node --
TaskTrackers do nothing of that sort on their own.

There are some CPU/memory considerations taken into account by a
variety of non-default schedulers in the JobTracker, but your slot
limits per TaskTracker are controlled entirely by configuration.

2011/12/12 Tan Jun <tanjun_2525@163.com>:
> Hi Harsh,
> Now I know that the number of maps and reduces that run simultaneously is
> set by the administrator in mapred-site.xml, with a default value of 2.
> But I can't get the point about the number of slots.
> My understanding so far is that
> the number of slots is decided by the hardware and the administrator cannot
> change it.
> Is that right?
> ________________________________
> Tan Jun
> From: Harsh J
> Date: 2011-12-12 12:22
> To: mapreduce-user
> Subject: Re: About slots of tasktracker and number of map tasks
> Hi Tan,
> On 12-Dec-2011, at 8:48 AM, Tan Jun wrote:
> Hi,
> I don't really understand the meaning of these sentences in "The Definitive
> Guide" (page 155):
> Tasktrackers have a fixed number of slots for map tasks and for reduce tasks: for example,
> a tasktracker may be able to run two map tasks and two reduce tasks simultaneously.
> (The precise number depends on the number of cores and the amount of
> memory on the tasktracker; see “Memory” on page 254.)
> Does that mean the number of slots is fixed and the number of maps that run
> simultaneously is set by the user?
> Not by the user, but by the administrator. Each tasktracker is configured in
> production with a 'task slot' upper limit - say, 8 maps and 4 reducers for a
> 12-core machine. This is not auto-configured (unless you use auto cluster
> setup+configuration tools that determine it for you [0]), and has to be set
> when configuring Hadoop daemons.
> The book means to imply that you need to set these, based on the memory and
> CPU configuration of your machines. By default, tasktrackers have limits of
> 2+2.
> See http://wiki.apache.org/hadoop/LimitingTaskSlotUsage
> [0] - http://www.cloudera.com/products-services/tools/ is one.

Harsh J
