From: Nicolae Marasoiu
To: user@hadoop.apache.org
Date: Mon, 23 Nov 2015 15:58:43 +0000
Subject: yarn does not allocate enough tasks/containers to my available node

Hi,


Tasks are allocated to my nodes by memory.
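
As a rough illustration of what memory-based allocation implies, here is a minimal sketch. All figures are hypothetical and not taken from this cluster, and the function name is made up for the example. It assumes the scheduler rounds each request up to a multiple of its minimum allocation (yarn.scheduler.minimum-allocation-mb) and that a node offers yarn.nodemanager.resource.memory-mb to containers; CPU is ignored entirely.

```python
import math


def containers_per_node(node_memory_mb: int, request_mb: int, min_allocation_mb: int) -> int:
    """Estimate how many equally sized containers fit on one node, by memory only.

    Hypothetical helper for illustration; it is not part of any YARN API.
    """
    # Assumption: the request is rounded up to a multiple of the minimum allocation.
    granted_mb = math.ceil(request_mb / min_allocation_mb) * min_allocation_mb
    # Only whole containers count, so use integer division.
    return node_memory_mb // granted_mb


# Hypothetical node: 8192 MB for containers, 1536 MB per task, 1024 MB minimum.
print(containers_per_node(8192, 1536, 1024))  # prints 4
```

With these made-up numbers an idle node has room for 4 containers, so the number it actually runs is bounded by how quickly containers are handed out, not by memory.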

Initially they are allocated ok across the cluster.

After a while, one of the nodes does not receive new tasks fast enough: it drops to 0 tasks, and from time to time I see it holding 1 task, which it finishes in seconds.


It is true that I currently have a problem of many small input files.

The fact that the nodes are oversubscribed in CPU by a factor of 2-3 (according to load average) is probably not helping either.


But 1. why is YARN not able to bulk-allocate some 4 tasks on the idle node at once (rather than one by one), and 2. why is YARN slow in allocating tasks? (I understand that allocating a new task/container in a few seconds may or may not be considered slow.)


Please advise,

Nicu
