hive-user mailing list archives

From Sreekanth Ramakrishnan
Subject Re: Using capacity scheduler
Date Fri, 29 Apr 2011 03:09:14 GMT

Currently, CapacityScheduler does not support pre-emption. So basically, once Job1's tasks start
finishing and freeing up slots, Job2's tasks will start getting scheduled. One way to make sure
that queue capacities are not elastic is to set max task limits on the queues. That way
your Job1 will never exceed the first queue's capacity.
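
To illustrate the effect of such a limit, here is a toy model in Python. It is only a sketch under
simple assumptions (one pool of slots, two queues, no pre-emption), not CapacityScheduler code;
the names slots_granted, simulate and max_limit are made up for this example.

```python
def slots_granted(free_slots, demand, in_use, max_limit=None):
    """Slots a queue can take right now in this toy model (no pre-emption)."""
    grant = min(free_slots, demand)
    if max_limit is not None:
        # A hard per-queue limit stops the queue from growing past max_limit.
        grant = min(grant, max(0, max_limit - in_use))
    return grant


def simulate(total_slots, job1_demand, job2_demand, max_limit=None):
    """Job1 arrives first in queue 1, then Job2 arrives in queue 2."""
    free = total_slots
    job1 = slots_granted(free, job1_demand, 0, max_limit)
    free -= job1
    job2 = slots_granted(free, job2_demand, 0, max_limit)
    return job1, job2


# Elastic queues: Job1 grabs every slot and Job2 waits for tasks to finish.
print(simulate(10, 10, 10))               # (10, 0)
# Limit of 5 slots per queue: Job1 stops at 5, so Job2 can start right away.
print(simulate(10, 10, 10, max_limit=5))  # (5, 5)
```

The trade-off is that with a hard limit, slots in an idle queue stay unused even when another
queue has work waiting.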

On 4/28/11 11:48 PM, "Rosanna Man" wrote:

Hi all,

We are using the capacity scheduler to schedule resources among different queues for one user
(hadoop) only. We have set the queues to have an equal share of the resources. However, when the
first task starts in the first queue and consumes all the resources, the second task, started in
the second queue, is starved of reducers until the first task finishes. A lot of processing gets
stuck while a large query is executing.

We are using 0.20.2 Hive on Amazon AWS. We tried the Fair Scheduler before, but it gives
an error when the mapper produces no output (which is fine in our use cases).

Can anyone give us some advice?


Sreekanth Ramakrishnan
