hadoop-mapreduce-user mailing list archives

From John Lilley <john.lil...@redpoint.net>
Subject RE: Scheduler question
Date Fri, 13 Sep 2013 12:22:43 GMT
Thanks!  That makes perfect sense.
john

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Monday, September 09, 2013 4:17 AM
To: user@hadoop.apache.org
Subject: Re: Scheduler question

Hi John,

YARN schedulers handle this with the concept of "reservations".  Scheduling decisions occur
on node heartbeats.  When a node that is full heartbeats, the next application that should
be able to place a container on it gets to place a "reservation" on it.  Each node has space
for a single reservation.  Containers for other applications will not be placed on the node
until a reservation is fulfilled.

If you are using the Fair Scheduler (Capacity Scheduler works similarly, but I'm not sure
on the specifics), this means that app B would get its container long before app A completed,
but not right away either.  After app A gets its 20 containers, it would get reservations as well
on the nodes. After one of app A's containers finishes on a node, it would get to place another
container on that node to fulfill its reservation.  Then app B would get a reservation on
that node.  Then no containers would be placed on that node until app B is able to place one,
which would be after both of app A's containers on that node finish.
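
The sequence above can be traced with a small single-node simulation. This is only an illustrative sketch, not Fair Scheduler code: the `Request`, `Node`, `heartbeat`, and `run_scenario` names are hypothetical, and the fairness rule (the app using the least memory on the node goes first) is a deliberate simplification. It assumes one 8GB node that starts with two of app A's 4GB containers running and app A already holding the node's reservation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    app: str
    size_gb: int


@dataclass
class Node:
    capacity_gb: int
    running: list = field(default_factory=list)   # containers currently running
    reservation: Optional[Request] = None         # at most one reservation per node

    def free_gb(self) -> int:
        return self.capacity_gb - sum(c.size_gb for c in self.running)

    def used_by(self, app: str) -> int:
        return sum(c.size_gb for c in self.running if c.app == app)

    def finish_one(self, app: str) -> None:
        """Complete one running container that belongs to `app`."""
        done = next(c for c in self.running if c.app == app)
        self.running.remove(done)


def heartbeat(node: Node, pending: list) -> str:
    """Make one scheduling decision for `node` (toy model, see assumptions above)."""
    # A held reservation blocks every other application on this node.
    if node.reservation is not None:
        req = node.reservation
        if node.free_gb() < req.size_gb:
            return f"blocked: waiting on {req.app}'s {req.size_gb}GB reservation"
        node.running.append(req)
        node.reservation = None
        return f"{req.app} fulfilled its reservation ({req.size_gb}GB)"
    if not pending:
        return "idle"
    # Simplified fairness: the app using the least memory on this node goes first.
    req = min(pending, key=lambda r: node.used_by(r.app))
    pending.remove(req)
    if node.free_gb() >= req.size_gb:
        node.running.append(req)
        return f"{req.app} placed a {req.size_gb}GB container"
    node.reservation = req
    return f"{req.app} reserved the node for {req.size_gb}GB"


def run_scenario():
    node = Node(
        capacity_gb=8,
        running=[Request("A", 4), Request("A", 4)],
        reservation=Request("A", 4),
    )
    pending = [Request("A", 4) for _ in range(3)] + [Request("B", 8)]
    log = []
    node.finish_one("A")                  # one of A's containers completes
    log.append(heartbeat(node, pending))  # A fulfills its reservation
    log.append(heartbeat(node, pending))  # node is full again, B reserves it
    node.finish_one("A")
    log.append(heartbeat(node, pending))  # only 4GB free, B still waits
    node.finish_one("A")
    log.append(heartbeat(node, pending))  # 8GB free, B gets its container
    return log, node


if __name__ == "__main__":
    for line in run_scenario()[0]:
        print(line)
```

In this toy model the node sits with 4GB idle between the third and fourth heartbeats, which is the cost of holding the reservation for app B.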

It's also possible to configure the schedulers to use preemption to make this kind of thing
go a lot faster.

Does that make some sense?

-Sandy

On Mon, Sep 9, 2013 at 7:21 AM, John Lilley <john.lilley@redpoint.net> wrote:
Do the Hadoop 2.0 YARN scheduler(s) deal with situations like the following?
Hadoop cluster of 10 nodes, with 8GB each available for containers.  There is only one queue.
Application A requests 100 4GB containers.  It initially, or after a little while, gets 20
containers.
Later, application B requests 1 8GB container.
Suppose that App-A's containers each take a few minutes.  At some point one will complete.
When that happens, will the scheduler immediately allocate another 4GB container to App-A?
If so, will App-B have to wait for its container until App-A is almost done?
Thanks
John


