From: John Lilley
To: "user@hadoop.apache.org"
Subject: RE: Scheduler question
Date: Fri, 13 Sep 2013 12:22:43 +0000

Thanks!  That makes perfect sense.

john

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Monday, September 09, 2013 4:17 AM
To: user@hadoop.apache.org
Subject: Re: Scheduler question

 

Hi John,

 

YARN schedulers handle this with the concept of "reservations".  Scheduling decisions occur on node heartbeats.  When a node that is full heartbeats, the next application that should be able to place a container on it gets to place a "reservation" on it.  Each node has space for a single reservation.  Containers for other applications will not be placed on the node until a reservation is fulfilled.

 

If you are using the Fair Scheduler (Capacity Scheduler works similarly, but I'm not sure on the specifics), this means that app B would get containers far before app A completed, but not soon either.  After app A gets its 20 containers, it would get reservations as well on the nodes.  After one of app A's containers finishes on a node, it would get to place another container on that node to fulfill its reservation.  Then app B would get a reservation on that node.  Then no containers would be placed on that node until app B is able to place one, which would be after both of app A's containers finish.
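
The sequence described above can be traced with a toy model of a single 8GB node. This is only a sketch under stated assumptions, not the Fair Scheduler's actual code: the `Node` class, the `heartbeat` function, the one-action-per-heartbeat rule, and the pre-ordered `pending` list are all hypothetical simplifications made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Request = Tuple[str, int]  # (application name, container size in GB)


@dataclass
class Node:
    capacity_gb: int
    running: List[Request] = field(default_factory=list)
    reserved: Optional[Request] = None  # a node holds at most one reservation

    def free_gb(self) -> int:
        return self.capacity_gb - sum(size for _, size in self.running)


def heartbeat(node: Node, pending: List[Request]) -> str:
    """Handle one heartbeat; assumes at most one scheduling action per heartbeat."""
    if node.reserved is not None:
        # An outstanding reservation blocks every other request on this node.
        app, size = node.reserved
        if node.free_gb() < size:
            return f"waiting on {app} {size}GB"
        node.running.append(node.reserved)
        node.reserved = None
        return f"fulfilled {app} {size}GB"
    if not pending:
        return "idle"
    # Assumption: 'pending' is already in the order the scheduler would pick.
    app, size = pending.pop(0)
    if node.free_gb() >= size:
        node.running.append((app, size))
        return f"placed {app} {size}GB"
    node.reserved = (app, size)
    return f"reserved {app} {size}GB"


def run_scenario() -> List[str]:
    # The node starts full with two of app A's 4GB containers.
    node = Node(capacity_gb=8, running=[("A", 4), ("A", 4)])
    pending = [("A", 4), ("B", 8)]
    events = [heartbeat(node, pending)]      # node is full: A reserves
    node.running.remove(("A", 4))            # one A container finishes
    events.append(heartbeat(node, pending))  # A fulfills its reservation
    events.append(heartbeat(node, pending))  # node is full again: B reserves
    node.running.remove(("A", 4))            # 4GB free, not enough for B
    events.append(heartbeat(node, pending))  # B keeps waiting, A is blocked
    node.running.remove(("A", 4))            # 8GB free
    events.append(heartbeat(node, pending))  # B fulfills its reservation
    return events


if __name__ == "__main__":
    for event in run_scenario():
        print(event)
```

In this model the fourth heartbeat is the interesting one: 4GB is free, which would fit another app A container, but the node stays idle because app B's 8GB reservation is outstanding.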

 

It's also possible to configure the schedulers to use preemption to make this kind of thing go a lot faster.

 

Does that make some sense?

 

-Sandy

 

On Mon, Sep 9, 2013 at 7:21 AM, John Lilley <john.lilley@redpoint.net> wrote:

Do the Hadoop 2.0 YARN scheduler(s) deal with situations like the following?

Hadoop cluster of 10 nodes, with 8GB each available for containers.  There is only one queue.

Application A requests 100 4GB containers.  It initially, or after a little while, gets 20 containers.

Later, application B requests 1 8GB container.

Suppose that App-A's containers each take a few minutes.  At some point one will complete.  When that happens, will the scheduler immediately allocate another 4GB container to App-A?  If so, will App-B have to wait for its container until App-A is almost done?

Thanks

John

 

 
