From: Robert Evans <evans@yahoo-inc.com>
To: mapreduce-user@hadoop.apache.org
Date: Thu, 10 May 2012 08:29:55 -0500
Subject: Re: max 1 mapper per node

Yes, adding more resources to the scheduling request would be the ideal solution to the problem, but sadly that is not a trivial change. The initial solution I suggested is an ugly hack and will not work for the cases you have described. If you feel this is important work, please feel free to file a JIRA for it; we can continue the discussion on that JIRA about the details of how to add this type of functionality. I am very interested in the scheduler and would be happy to help out, but sadly my time right now is very limited.

--Bobby Evans

On 5/10/12 6:56 AM, "Radim Kolar" <hsn@filez.com> wrote:

> We've been against these 'features' since it leads to very bad
> behaviour across the cluster with multiple apps/users etc.
It's not a new feature; it's an extension of the existing resource scheduling, which works well enough only for RAM. There are two other resources - CPU cores and network IO - that need to be considered.

We have a job that does a lot of network IO in its mappers, and it is desirable to run those mappers on different nodes even if reading blocks from HDFS will then not be local. Our second job burns all the CPU cores on a machine while doing its computations, so it is important that its mappers do not land on the same node.
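For later readers of this thread: Hadoop 2.x YARN did eventually add a CPU dimension (vcores) next to memory in the container request, which covers the CPU-bound case described above. Below is a minimal sketch, not the mechanism discussed in this thread, of what such a multi-resource request looks like from an application master using the Hadoop 2.x AMRMClient API; the memory and vcore numbers are made up for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class MultiResourceRequestSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new YarnConfiguration();

        // Client an application master uses to ask the ResourceManager for containers.
        AMRMClient<ContainerRequest> amrmClient = AMRMClient.createAMRMClient();
        amrmClient.init(conf);
        amrmClient.start();

        // The "more resources in the scheduling request" idea: each container asks
        // for memory *and* CPU, so a CPU-hungry mapper claims enough vcores that the
        // scheduler cannot pack many of them onto one node. Numbers are illustrative.
        Resource capability = Resource.newInstance(2048 /* MB */, 4 /* vcores */);
        Priority priority = Priority.newInstance(0);

        // nodes/racks left null: no hard locality constraint, matching the
        // network-IO job above where spreading matters more than block locality.
        ContainerRequest request = new ContainerRequest(capability, null, null, priority);
        amrmClient.addContainerRequest(request);

        // A real AM would call registerApplicationMaster(...) first and then loop on
        // allocate() to receive containers and launch tasks; omitted to keep this short.
        amrmClient.stop();
    }
}

Network IO, the other resource mentioned above, is not covered by this; that part would still need the kind of scheduler work Bobby suggests tracking in a JIRA.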