Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (athena.apache.org: domain of john.lilley@redpoint.net
 designates 206.225.164.221 as permitted sender)
From: John Lilley <john.lilley@redpoint.net>
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: RE: Scheduler question
Thread-Topic: Scheduler question
Thread-Index: Ac6s8gwublIdJcwLTs6CzLhbd/A57wAjlzaAAL7glWA=
Date: Fri, 13 Sep 2013 12:22:43 +0000
Message-ID: 
 <869970D71E26D7498BDAC4E1CA92226B86D2E6E6@MBX021-E3-NJ-2.exch021.domain.local>
References: 
 <869970D71E26D7498BDAC4E1CA92226B837B7965@MBX021-E3-NJ-1.exch021.domain.local>
 <CACBYxKKAbPKr7MS_PK6Yc42PdumpPKRxUV7LMmojHnSD5imFSQ@mail.gmail.com>
In-Reply-To: 
 <CACBYxKKAbPKr7MS_PK6Yc42PdumpPKRxUV7LMmojHnSD5imFSQ@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
Content-Type: multipart/alternative;
	boundary="_000_869970D71E26D7498BDAC4E1CA92226B86D2E6E6MBX021E3NJ2exch_"
MIME-Version: 1.0

--_000_869970D71E26D7498BDAC4E1CA92226B86D2E6E6MBX021E3NJ2exch_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Thanks!  That makes perfect sense.
john

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Monday, September 09, 2013 4:17 AM
To: user@hadoop.apache.org
Subject: Re: Scheduler question

Hi John,

YARN schedulers handle this with the concept of "reservations".
Scheduling decisions occur on node heartbeats.  When a full node
heartbeats, the next application in line to place a container on it
gets to place a "reservation" on it instead.  Each node has space for a
single reservation.  Containers for other applications will not be
placed on the node until that reservation is fulfilled.

If you are using the Fair Scheduler (Capacity Scheduler works
similarly, but I'm not sure of the specifics), this means that app B
would get its container well before app A completed, but not
immediately either.  After app A gets its 20 containers, it would get
reservations on the nodes as well.  After one of app A's containers
finishes on a node, app A would get to place another container on that
node to fulfill its reservation.  Then app B would get a reservation on
that node.  After that, no containers would be placed on that node
until app B is able to place one, which would be after both of app A's
containers on that node finish.
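
To make that sequence concrete, here is a toy single-node simulation of
the steps above.  It is only a sketch of the reservation idea, not
YARN's actual scheduler code: the App and Node classes, the heartbeat()
function, and the "least memory held goes first" ordering are all made
up for illustration, and it places at most one container per heartbeat.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class App:
    name: str
    container_gb: int      # size of each container this app asks for
    pending: int           # containers still wanted
    running_gb: int = 0    # memory currently held; crude fairness key


@dataclass
class Node:
    capacity_gb: int
    running: List[Tuple[App, int]] = field(default_factory=list)
    reservation: Optional[App] = None   # at most one reservation per node

    def free_gb(self) -> int:
        return self.capacity_gb - sum(size for _, size in self.running)


def place(node, app, log):
    """Start one container of `app` on `node`."""
    node.running.append((app, app.container_gb))
    app.pending -= 1
    app.running_gb += app.container_gb
    log.append(f"{app.name} places {app.container_gb}GB")


def finish(node, app):
    """Complete one of `app`'s running containers on `node`."""
    for i, (owner, size) in enumerate(node.running):
        if owner is app:
            del node.running[i]
            app.running_gb -= size
            return
    raise ValueError(f"{app.name} has no container on this node")


def heartbeat(node, apps, log):
    """One scheduling pass for a single node (hypothetical model)."""
    # An outstanding reservation blocks every other app on this node.
    if node.reservation is not None:
        app = node.reservation
        if app.container_gb <= node.free_gb():
            node.reservation = None
            place(node, app, log)
        return
    # Otherwise offer the node to the waiting app that holds the least memory.
    waiting = sorted((a for a in apps if a.pending > 0),
                     key=lambda a: a.running_gb)
    if not waiting:
        return
    app = waiting[0]
    if app.container_gb <= node.free_gb():
        place(node, app, log)
    else:
        node.reservation = app
        log.append(f"{app.name} reserves")


def run_scenario():
    log = []
    node = Node(capacity_gb=8)
    app_a = App("A", container_gb=4, pending=10)
    app_b = App("B", container_gb=8, pending=1)

    apps = [app_a]
    for _ in range(3):            # A fills the node, then reserves
        heartbeat(node, apps, log)
    apps.append(app_b)            # B shows up later
    heartbeat(node, apps, log)    # node full; A's reservation blocks B
    finish(node, app_a)
    heartbeat(node, apps, log)    # A fulfills its reservation
    heartbeat(node, apps, log)    # node full again; B reserves
    finish(node, app_a)
    heartbeat(node, apps, log)    # only 4GB free; B keeps waiting
    finish(node, app_a)
    heartbeat(node, apps, log)    # 8GB free; B's reservation is fulfilled
    return log


if __name__ == "__main__":
    print("\n".join(run_scenario()))
```

In this model app B's reservation keeps the node from being refilled
with 4GB containers, so B only has to wait for the two app A containers
already running there rather than for app A as a whole.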

It's also possible to configure the schedulers to use preemption to
make this kind of thing go a lot faster.

Does that make some sense?

-Sandy

On Mon, Sep 9, 2013 at 7:21 AM, John Lilley <john.lilley@redpoint.net> wrote:
Do the Hadoop 2.0 YARN scheduler(s) deal with situations like the
following?
Hadoop cluster of 10 nodes, with 8GB each available for containers.
There is only one queue.
Application A requests 100 4GB containers.  It initially, or after a
little while, gets 20 containers.
Later, application B requests one 8GB container.
Suppose that App-A's containers each take a few minutes.  At some point
one will complete.  When that happens, will the scheduler immediately
allocate another 4GB container to App-A?  If so, will App-B have to wait
for its container until App-A is almost done?
Thanks
John


--_000_869970D71E26D7498BDAC4E1CA92226B86D2E6E6MBX021E3NJ2exch_--