hadoop-common-user mailing list archives

From Vasilis Liaskovitis <vlias...@gmail.com>
Subject default job scheduler behaviour
Date Sun, 27 Sep 2009 01:37:34 GMT

Given a single cluster running with the default job scheduler: is only
one job executing on the cluster at a time, regardless of how many
map/reduce task slots it can keep busy?
In other words, if a job does not use all task slots, would the
default scheduler consider scheduling map/reduce tasks from other jobs
that have already been submitted to the system?
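
To make the question concrete, here is a toy sketch of the two
behaviours I am trying to tell apart. This is not Hadoop code: the
function name, the job names, and the 16-slot figure are all made up
for illustration, and it only models a single scheduling round.

```python
def assign_slots(pending, free_slots, share_leftover):
    """Toy model of one scheduling round (hypothetical, not Hadoop's API).

    pending: list of (job_name, tasks_wanted) in submission order.
    share_leftover: if False, only the job at the head of the queue
    gets slots; if True, leftover slots go to later jobs in order.
    Returns a dict mapping job name to the number of slots it received.
    """
    assigned = {}
    for job, wanted in pending:
        if free_slots == 0:
            break
        take = min(wanted, free_slots)
        assigned[job] = take
        free_slots -= take
        if not share_leftover:
            break  # strict behaviour: stop after the first job
    return assigned


# Hypothetical queue: a small job submitted ahead of two bigger ones.
jobs = [("small-sort", 4), ("medium-scan", 10), ("large-join", 40)]

# Behaviour A: one job at a time, 12 of the 16 slots stay idle.
print(assign_slots(jobs, 16, share_leftover=False))
# {'small-sort': 4}

# Behaviour B: leftover slots are handed to the jobs queued behind it.
print(assign_slots(jobs, 16, share_leftover=True))
# {'small-sort': 4, 'medium-scan': 10, 'large-join': 2}
```

What I see in the web UI looks like behaviour A, and I would like to
know whether that is really what the default scheduler does.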

I am using an 8-node cluster to run some test jobs based on gridmix
(the synthetic benchmark found in the hadoop distribution under
src/benchmarks/gridmix). The gridmix workload submits many different
jobs in parallel: 5 different kinds of jobs, each in three sizes
(small, medium, large). While running, I notice that at any given
time only one job is making progress, at least according to the
jobtracker web UI. I think this is happening even for small jobs,
which don't take up all the slots on the cluster's tasktrackers.

If the default scheduler is not capable of scheduling tasks from
multiple jobs at the same time, would I have to use the capacity
scheduler? Or something else?

Thanks for any help,

- Vasilis
