hadoop-hdfs-user mailing list archives

From German Florez-Larrahondo <german...@samsung.com>
Subject RE: expressing job anti-affinity in Yarn.
Date Thu, 09 Jan 2014 17:23:15 GMT

You could try the FairScheduler as well. See the comment I made a few
hours ago on the same subject:


From: German Florez-Larrahondo [mailto:german.fl@samsung.com] 
Sent: Thursday, January 09, 2014 8:23 AM
To: user@hadoop.apache.org
Subject: RE: Distributing the code to multiple nodes



Could this be related to the scheduler you are using and its settings?


In lab environments, when running a single type of job, I often use
FairScheduler (the YARN default in 2.2.0 is CapacityScheduler) and it does a
good job distributing the load.


You could give that a try.


I think just changing yarn-site.xml as follows could demonstrate this
theory (note that how the jobs are scheduled depends on resources such as
memory on the nodes, and you would need to set up yarn-site.xml accordingly).
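
The configuration that accompanied this message is not preserved in the
archive. As a minimal sketch of what such a change might look like (not the
original attachment), the scheduler can be switched with the
yarn.resourcemanager.scheduler.class property. The memory value shown is
purely illustrative and is an assumption; it would need to match the actual
nodes.

```xml
<!-- yarn-site.xml fragment: a sketch, not the original poster's config -->
<configuration>
  <!-- Use FairScheduler instead of the 2.2.0 default (CapacityScheduler) -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <!-- Memory each NodeManager offers to containers; 8192 is an assumed
       example value, adjust it to the physical memory of your nodes -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>
</configuration>
```

The ResourceManager has to be restarted for a scheduler change to take
effect.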


From: Ted Yu [mailto:yuzhihong@gmail.com] 
Sent: Thursday, January 09, 2014 11:00 AM
To: common-user@hadoop.apache.org
Subject: Re: expressing job anti-affinity in Yarn.



YARN-1042 add ability to specify affinity/anti-affinity in container


On Thu, Jan 9, 2014 at 8:48 AM, ricky l <rickylee0815@gmail.com> wrote:

Hi all,


Is it possible to express job anti-affinity in YARN-based Hadoop? I
have a job that is very IO-intensive, and I want to spread the tasks across
all available machines. With the default YARN RM scheduler, it seems many
tasks are scheduled on one machine while the others sit idle.
