hadoop-common-user mailing list archives

From patrick sang <silvianhad...@gmail.com>
Subject capacity scheduler
Date Sat, 15 Oct 2011 23:45:39 GMT
Hi Hadoopers,

How's your weekend going?
I have run out of ideas about the behavior of the capacity scheduler; I have
been stuck on this for a day and a night.

I referred to this doc:
http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u0/capacity_scheduler.html#Configuring+properties+for+queues

Details of the environment setup are at the bottom of this email.

1. Created 3 queues: orange 40%, apple 40%, default 20%
(e.g. mapred.capacity-scheduler.queue.orange.capacity)
2. For the orange queue only:
mapred.capacity-scheduler.queue.orange.maximum-capacity = 50

At http://job:50030/, the cluster summary shows:
- map task capacity = 32
- reduce task capacity = 12

At http://job:50030/scheduler it shows:
- 3 queues: orange, apple, default
- orange and apple queues: each queue gets 12 map task slots and 4 reduce
task slots
- default queue: gets 6 map and 2 reduce task slots
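
For reference, the slot counts above are consistent with simply taking each percentage of the cluster totals and dropping the fraction. The sketch below only reproduces that arithmetic; the truncation rule is an assumption inferred from the numbers shown, not something taken from the scheduler documentation, and `slots` is a made-up helper name.

```python
def slots(total_slots, percent):
    """Return the whole number of slots for a percentage of the cluster.

    Assumption: fractional slots are truncated (12.8 -> 12). This is inferred
    from the numbers in the web UI, not confirmed from the scheduler source.
    """
    return total_slots * percent // 100


MAP_SLOTS = 32     # "map task capacity" from the cluster summary
REDUCE_SLOTS = 12  # "reduce task capacity" from the cluster summary

# Queues configured at 40% of the cluster
print(slots(MAP_SLOTS, 40), slots(REDUCE_SLOTS, 40))
# Queue configured at 20% of the cluster
print(slots(MAP_SLOTS, 20), slots(REDUCE_SLOTS, 20))
# orange maximum-capacity = 50
print(slots(MAP_SLOTS, 50), slots(REDUCE_SLOTS, 50))
```

Under that assumption the 40% queues come out at 12 map and 4 reduce slots, the 20% queue at 6 and 2, and the 50% maximum for orange at 16 and 6, which matches what the scheduler page reports.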

Everything seems right up to this point.

Here the fun part begins.
1. I submitted 4 wordcount jobs to the orange queue.
(Each wordcount job uses 4 map tasks, so the 4 jobs need 16 map tasks in
total.)

2. The default and apple queues have no jobs running.

3. 3 of the 4 wordcount jobs started their map tasks right away, while the map
tasks of the 4th wordcount job did not start.

4. From the web UI, the scheduling information of the orange queue:

It says "Used capacity: 12 (100.0% of Capacity)"
while the next line says "Maximum capacity: 16 slots".
So what's going on with the other 4 slots? Why are they not being used?

Isn't the capacity scheduler supposed to keep using extra slots until it hits
the maximum capacity
(set by mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity)?
(There are no other jobs in the cluster at all.)
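
To make the question concrete, here is the arithmetic behind what I expected versus what I see. This is only a sketch of slot counting with the numbers from this email; it does not model the scheduler itself, and `jobs_that_fit` is a made-up helper name.

```python
def jobs_that_fit(slot_limit, maps_per_job):
    """How many jobs can have all their map tasks running under a slot limit."""
    return slot_limit // maps_per_job


MAPS_PER_JOB = 4     # each wordcount job uses 4 map tasks
SUBMITTED_JOBS = 4   # jobs submitted to the orange queue
CAPACITY = 12        # orange "Capacity" in map slots
MAX_CAPACITY = 16    # orange "Maximum capacity" in map slots

# What the web UI shows: usage stops at the queue capacity.
observed = min(SUBMITTED_JOBS, jobs_that_fit(CAPACITY, MAPS_PER_JOB))
# What I expected: usage grows up to maximum-capacity on an idle cluster.
expected = min(SUBMITTED_JOBS, jobs_that_fit(MAX_CAPACITY, MAPS_PER_JOB))
unused_slots = MAX_CAPACITY - observed * MAPS_PER_JOB

print(observed, expected, unused_slots)
```

With these numbers, 3 jobs fit under the capacity, 4 would fit under the maximum capacity, and 4 slots stay unused, which are exactly the 4 slots I am asking about.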

I am really thankful that you have read up to this point.
I truly hope someone can shed some light on this.


P

#########
## Settings
#########
- 4 machines running CentOS 5.6, 8 GB, 250 GB HDD
(1 NameNode + JobTracker, 1 SecondaryNameNode, 2 DataNodes)
- CDH3u0
- java version "1.6.0_25"

#########
## Full WebUI of all queues
#########
ORANGE

Queue configuration
Capacity Percentage: 40.0%
User Limit: 100%
Priority Supported: YES
-------------
Map tasks
Capacity: 12 slots
Maximum capacity: 16 slots
Used capacity: 12 (100.0% of Capacity)
Running tasks: 12
Active users:
User 'apps': 12 (100.0% of used capacity)
-------------
Reduce tasks
Capacity: 4 slots
Maximum capacity: 6 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
-------------
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 1

APPLE

Queue configuration
Capacity Percentage: 20.0%
User Limit: 100%
Priority Supported: NO
-------------
Map tasks
Capacity: 6 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
-------------
Reduce tasks
Capacity: 2 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
-------------
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 0

DEFAULT
Queue configuration
Capacity Percentage: 40.0%
User Limit: 100%
Priority Supported: YES
-------------
Map tasks
Capacity: 12 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
-------------
Reduce tasks
Capacity: 4 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
-------------
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 0

#########
## capacity-scheduler.xml
#########
<configuration>
<!-- Global config -->
  <property>
    <name>mapred.capacity-scheduler.init-poll-interval</name>
    <value>5000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.init-worker-threads</name>
    <value>20</value>
  </property>


<!-- Queue: default -->
  <property>
    <name>mapred.capacity-scheduler.queue.default.capacity</name>
    <value>20</value>
  </property>

<!-- Queue: orange -->
  <property>
    <name>mapred.capacity-scheduler.queue.orange.capacity</name>
    <value>40</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.orange.maximum-capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.orange.maximum-initialized-jobs-per-user</name>
    <value>5000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.orange.supports-priority</name>
    <value>true</value>
  </property>

<!-- Queue: apple -->
  <property>
    <name>mapred.capacity-scheduler.queue.apple.capacity</name>
    <value>40</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.orange.maximum-initialized-jobs-per-user</name>
    <value>5000</value>
  </property>
 <!-- <property>
    <name>mapred.capacity-scheduler.queue.apple.maximum-capacity</name>
    <value>-1</value>
  </property> -->
  <property>
    <name>mapred.capacity-scheduler.queue.apple.supports-priority</name>
    <value>true</value>
  </property>
</configuration>
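
A quick way to sanity-check a file like the one above is to parse it and confirm that the queue capacities add up to 100 and that no property name appears twice. The following is a minimal sketch using only the Python standard library; `check_capacity_config` and `SAMPLE` are illustrative names, and `SAMPLE` is an abbreviated copy of the properties above with a closing `</configuration>` tag so that it parses.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Matches only "...queue.<name>.capacity", not "...queue.<name>.maximum-capacity"
CAPACITY_RE = re.compile(r"^mapred\.capacity-scheduler\.queue\.([^.]+)\.capacity$")


def check_capacity_config(xml_text):
    """Return ({queue: capacity percent}, [property names defined more than once])."""
    root = ET.fromstring(xml_text)
    capacities = {}
    seen = Counter()
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        seen[name] += 1
        match = CAPACITY_RE.match(name)
        if match:
            capacities[match.group(1)] = float(value)
    duplicates = sorted(n for n, count in seen.items() if count > 1)
    return capacities, duplicates


# Abbreviated copy of the configuration above (assumption: closing tag added).
SAMPLE = """<configuration>
  <property><name>mapred.capacity-scheduler.queue.default.capacity</name><value>20</value></property>
  <property><name>mapred.capacity-scheduler.queue.orange.capacity</name><value>40</value></property>
  <property><name>mapred.capacity-scheduler.queue.orange.maximum-capacity</name><value>50</value></property>
  <property><name>mapred.capacity-scheduler.queue.orange.maximum-initialized-jobs-per-user</name><value>5000</value></property>
  <property><name>mapred.capacity-scheduler.queue.apple.capacity</name><value>40</value></property>
  <property><name>mapred.capacity-scheduler.queue.orange.maximum-initialized-jobs-per-user</name><value>5000</value></property>
</configuration>"""

capacities, duplicates = check_capacity_config(SAMPLE)
print("capacities:", capacities, "sum:", sum(capacities.values()))
print("duplicate property names:", duplicates)
```

On this sample the capacities sum to 100, and the check flags the orange `maximum-initialized-jobs-per-user` property, which appears a second time under the apple section and may be a copy-paste slip. Whether that has anything to do with the unused slots is not something this check can tell.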
