From: German Florez-Larrahondo <german.fl@samsung.com>
To: user@hadoop.apache.org
Subject: RE: expressing job anti-affinity in Yarn.
Date: Thu, 09 Jan 2014 11:23:15 -0600

Ted,

You could try with the FairScheduler as well. See a comment I made a few hours ago on the same subject:

 

From: German Florez-Larrahondo [mailto:german.fl@samsung.com]
Sent: Thursday, January 09, 2014 8:23 AM
To: user@hadoop.apache.org
Subject: RE: Distributing the code to multiple nodes

 

Ashish,

Could this be related to the scheduler you are using and its settings?

 

In lab environments, when running a single type of job, I often use the FairScheduler (the YARN default in 2.2.0 is the CapacityScheduler), and it does a good job of distributing the load.

 

You could give that a try (https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html).

 

I think just changing yarn-site.xml as follows could demonstrate this theory (note that how jobs are scheduled depends on resources such as memory on the nodes, so you would need to set up yarn-site.xml accordingly):

=  

<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
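
If you later want per-queue control on top of that, the FairScheduler can also read a queue allocation file. An untested sketch only (the file path and queue layout below are examples, not from this thread):

<property>
  <name>yarn.scheduler.fair.allocation.file</name>
  <value>/etc/hadoop/fair-scheduler.xml</value>
</property>

and a matching /etc/hadoop/fair-scheduler.xml:

<?xml version="1.0"?>
<allocations>
  <!-- A single queue with fair sharing; jobs land here by default. -->
  <queue name="default">
    <weight>1.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
  </queue>
</allocations>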

 

Regards

./g

 

 

 

From: Ted Yu [mailto:yuzhihong@gmail.com]
Sent: Thursday, January 09, 2014 11:00 AM
To: common-user@hadoop.apache.org
Subject: Re: expressing job anti-affinity in Yarn.

 

See:

YARN-1042: add ability to specify affinity/anti-affinity in container requests
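
Until YARN-1042 lands (it was still open as of Hadoop 2.2), one rough workaround at the ApplicationMaster level is to request one container per node with locality relaxation turned off, so the RM cannot satisfy every request from a single host. An untested sketch against the AMRMClient API; the class name, method name, and resource sizes are invented for illustration, and it assumes an AMRMClient that is already started and registered:

import java.util.List;

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.util.Records;

public class SpreadContainers {

  // Ask for one container on each named node, with locality relaxation
  // disabled, so the scheduler cannot pile all tasks onto one machine.
  public static void requestOnePerNode(AMRMClient<ContainerRequest> amrmClient,
                                       List<String> nodeHostnames) {
    Resource capability = Records.newRecord(Resource.class);
    capability.setMemory(1024);    // 1 GB per task; size this to the job
    capability.setVirtualCores(1);

    Priority priority = Records.newRecord(Priority.class);
    priority.setPriority(0);

    for (String host : nodeHostnames) {
      // relaxLocality=false pins this request to the named node.
      ContainerRequest request = new ContainerRequest(
          capability,
          new String[] { host },  // nodes
          null,                   // racks
          priority,
          false);                 // relaxLocality
      amrmClient.addContainerRequest(request);
    }
  }
}

All requests sit at the same priority and all disable relaxation, which keeps the requests consistent from the scheduler's point of view; the trade-off is that a request for a busy node waits rather than falling back to another host.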

 

On Thu, Jan 9, 2014 at 8:48 AM, ricky l <rickylee0815@gmail.com> wrote:

Hi all,

 

Is it possible to express job anti-affinity in YARN-based Hadoop? I have a job that is very I/O-intensive, and I want to spread its tasks across all available machines. With the default YARN RM scheduler, it seems many tasks are scheduled on one machine while others sit idle.

 

thanks.

 
