hadoop-mapreduce-user mailing list archives

From "Tan Jun" <tanjun_2...@163.com>
Subject Re: Re: About slots of tasktracker and munber of map taskers
Date Mon, 12 Dec 2011 05:29:20 GMT
Harsh,
Sorry for my poor English.
There is one more question.
As an administrator, I can set the maximum number of maps/reduces that run on a datanode,
so what do I set the number of slots for?
What is the difference between these attributes?
In my opinion, the number of slots depends on hardware, while the number of maps/reduces depends on software.
Assume that only one job is running, for example the PI computation used for benchmarking.
Thanks!




Tan Jun

From: Harsh J
Date: 2011-12-12 13:33
To: mapreduce-user; tanjun_2525
Subject: Re: Re: About slots of tasktracker and munber of map taskers
Tan,

As an admin, I can even choose to configure 100 slots
on a 4-core node, if I feel like burning the box. There is no hardware
auto-detection, and the slot limit is entirely controlled by the
mapred-site.xml for that TaskTracker.

The book is merely trying to say that you need to set these maximum slot
settings based on your knowledge of the hardware on each node -- TaskTrackers
do nothing of that sort on their own.

There are some CPU/memory considerations taken into account by a
variety of non-default schedulers in the JobTracker, but your slot limits
per TaskTracker are entirely controlled by configuration.
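
To make the point above concrete, here is a minimal sketch of the slot settings in a TaskTracker's mapred-site.xml. It assumes the Hadoop 0.20/1.x property names, and the values are only the illustrative "8 maps and 4 reducers for a 12-core machine" figures mentioned in this thread, not a recommendation for any particular hardware.

```xml
<!-- mapred-site.xml on one TaskTracker node (assumed: Hadoop 0.20/1.x property names) -->
<configuration>
  <!-- Upper limit on map tasks this TaskTracker runs at once (map slots); default is 2 -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
  </property>
  <!-- Upper limit on reduce tasks this TaskTracker runs at once (reduce slots); default is 2 -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

In this sketch the "number of slots" and the "maximum number of simultaneous maps/reduces" are the same setting, which is why there is no separate slot attribute to configure.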

2011/12/12 Tan Jun <tanjun_2525@163.com>:
> Hi Harsh,
> Now I know the number of maps and reduces that run simultaneously is set by the
> administrator in mapred-site.xml, with a default value of 2.
> But I can't get the point about the number of slots.
> My understanding so far is that
> the number of slots is decided by the hardware, and the administrator cannot
> change it.
> Is that right?
>
> ________________________________
> Tan Jun
>
> From: Harsh J
> Date: 2011-12-12 ?2:22
> To: mapreduce-user
> Subject: Re: About slots of tasktracker and munber of map taskers
> Hi Tan,
>
> On 12-Dec-2011, at 8:48 AM, Tan Jun wrote:
>
> Hi,
> I don't really understand the meaning of these sentences in "The Definitive
> Guide" (page 155):
>
> Tasktrackers have a fixed number of slots for map tasks and for reduce tasks: for example,
> a tasktracker may be able to run two map tasks and two reduce tasks simultaneously.
> (The precise number depends on the number of cores and the amount of
> memory on the tasktracker; see “Memory” on page?54.)
>
> Does that mean the number of slots is fixed and the number of maps run
> simultaneously is set by user?
>
>
> Not by the user, but by the administrator. Each tasktracker is configured in
> production with a 'task slot' upper limit - say, 8 maps and 4 reducers for a
> 12-core machine. This is not auto-configured (unless you use auto cluster
> setup+configuration tools that determine it for you [0]), and has to be set
> when configuring Hadoop daemons.
>
> The book means to imply that you need to set these, based on the memory and
> CPU configuration of your machines. By default, tasktrackers have limits of
> 2+2.
>
> See http://wiki.apache.org/hadoop/LimitingTaskSlotUsage
>
> [0] - http://www.cloudera.com/products-services/tools/ is one.



-- 
Harsh J