From: German Florez-Larrahondo <german.fl@samsung.com>
To: user@hadoop.apache.org
Subject: RE: expressing job anti-affinity in Yarn.
Date: Thu, 09 Jan 2014 11:23:15 -0600

Ted,

You could try with the FairScheduler as well. See a comment I made a few hours ago on the same subject:

 

From: German Florez-Larrahondo [mailto:german.fl@samsung.com]
Sent: Thursday, January 09, 2014 8:23 AM
To: user@hadoop.apache.org
Subject: RE: Distributing the code to multiple nodes

 

Ashish,

Could this be related to the scheduler you are using and its settings?

 

In lab environments, when running a single type of job, I often use the FairScheduler (the YARN default in 2.2.0 is the CapacityScheduler), and it does a good job of distributing the load.

 

You could give that a try (https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html).

 

I think just changing yarn-site.xml as follows could demonstrate this theory (note that how jobs are scheduled depends on resources such as memory on the nodes, so you would need to set up yarn-site.xml accordingly):

=  

<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
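
If you later want per-queue control on top of that, the FairScheduler can also read a queue allocation file. An untested sketch only (the file path and queue layout below are examples, not from this thread):

<property>
  <name>yarn.scheduler.fair.allocation.file</name>
  <value>/etc/hadoop/fair-scheduler.xml</value>
</property>

and a matching /etc/hadoop/fair-scheduler.xml:

<?xml version="1.0"?>
<allocations>
  <!-- A single queue with fair sharing; jobs land here by default. -->
  <queue name="default">
    <weight>1.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
  </queue>
</allocations>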

 

Regards

./g

 

 

 

From: Ted Yu [mailto:yuzhihong@gmail.com]
Sent: Thursday, January 09, 2014 11:00 AM
To: common-user@hadoop.apache.org
Subject: Re: expressing job anti-affinity in Yarn.

 

See:

YARN-1042: add ability to specify affinity/anti-affinity in container requests
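
Until YARN-1042 lands (it was still open as of Hadoop 2.2), one rough workaround at the ApplicationMaster level is to request one container per node with locality relaxation turned off, so the RM cannot satisfy every request from a single host. An untested sketch against the AMRMClient API; the class name, method name, and resource sizes are invented for illustration, and it assumes an AMRMClient that is already started and registered:

import java.util.List;

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.util.Records;

public class SpreadContainers {

  // Ask for one container on each named node, with locality relaxation
  // disabled, so the scheduler cannot pile all tasks onto one machine.
  public static void requestOnePerNode(AMRMClient<ContainerRequest> amrmClient,
                                       List<String> nodeHostnames) {
    Resource capability = Records.newRecord(Resource.class);
    capability.setMemory(1024);    // 1 GB per task; size this to the job
    capability.setVirtualCores(1);

    Priority priority = Records.newRecord(Priority.class);
    priority.setPriority(0);

    for (String host : nodeHostnames) {
      // relaxLocality=false pins this request to the named node.
      ContainerRequest request = new ContainerRequest(
          capability,
          new String[] { host },  // nodes
          null,                   // racks
          priority,
          false);                 // relaxLocality
      amrmClient.addContainerRequest(request);
    }
  }
}

All requests sit at the same priority and all disable relaxation, which keeps the requests consistent from the scheduler's point of view; the trade-off is that a request for a busy node waits rather than falling back to another host.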

 

On Thu, Jan 9, 2014 at 8:48 AM, ricky l <rickylee0815@gmail.com> wrote:

Hi all,

 

Is it possible to express job anti-affinity in YARN-based Hadoop? I have a job that is very I/O-intensive, and I want to spread its tasks across all available machines. With the default YARN RM scheduler, it seems many tasks are scheduled on one machine while others sit idle.

 

thanks.

 
