hadoop-general mailing list archives

From Xavier Stevens <xstev...@mozilla.com>
Subject Re: Hadoop - how exactly is a slot defined
Date Mon, 22 Nov 2010 16:41:40 GMT
Hi Robert,

Task slots are a way of constraining the amount of resources that will
be used on a given physical machine.  I believe the defaults set map
task slots to 2 and reduce task slots to 1.  Many of us use higher
numbers than that, depending on our applications.  When picking a
number, keep in mind how many CPU cores and how much memory the
physical machine has.  If you set the child java opts to use 512MB of
memory, that number is per task.  So let's say you have 4 map task slots
and 2 reduce task slots.  Conceivably, at some point all 6 slots will be
running tasks simultaneously, which means you need at least 6 x 512MB of
memory, plus a little extra for other processes (TaskTracker, DataNode,
etc.) and the OS.
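
For reference, those knobs live in mapred-site.xml on each TaskTracker.
A rough sketch of the example above (0.20-era property names, values
purely illustrative, not a recommendation) would look something like:

  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>        <!-- 4 map slots on this node -->
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>        <!-- 2 reduce slots on this node -->
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value> <!-- heap per child task: 6 x 512MB if all slots are busy -->
  </property>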

A good starting point is to work backwards from how much physical memory
you have and decide how much of it you would like to make available to
tasks.  Or, if you aren't worried about being memory heavy, pick the
number of task slots to be close to the number of cores.
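
To make that concrete with purely made-up numbers: on a box with 8 cores
and 16GB of RAM, reserving roughly 2GB for the TaskTracker, DataNode and
OS leaves about 14GB for tasks; at 1GB per child (-Xmx1024m) that supports
about 14 slots total, which you might split as 10 map and 4 reduce.  If
you'd rather stay close to the core count, you'd cap the total at or
near 8 instead.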

Hope this helps!

Cheers,


-Xavier

On 11/22/10 8:32 AM, Grandl Robert wrote:
> Hi all,
>
> I have trouble understanding what exactly a slot is.  We are always talking about
> tasks assigned to slots, but I have not found anywhere what exactly a slot is.  I assume it
> represents some allocation of RAM as well as some computation power.
>
> However, can somebody explain to me what exactly a slot means (in terms of the resources
> allocated for a slot) and how this mapping (between a slot and physical resources) is done
> in Hadoop?  Or give me some hints about the files in Hadoop where it might be defined?
>
> Thanks a lot,
> Robert
>
