From: java8964 <java8964@hotmail.com>
To: user@hadoop.apache.org
Subject: RE: Force one mapper per machine (not core)?
Date: Wed, 29 Jan 2014 08:18:37 -0500

Or you can implement your own InputSplit and InputFormat, which let you control which node each task is sent to and how many tasks run per node.

You can find some detailed examples in Chapter 4 of the book "Professional Hadoop Solutions".

Yong

> Subject: Re: Force one mapper per machine (not core)?
> From: kwiley@keithwiley.com
> Date: Tue, 28 Jan 2014 15:41:22 -0800
> To: user@hadoop.apache.org
>
> Yeah, it isn't, not even remotely, but thanks.
>
> On Jan 28, 2014, at 14:06, Bryan Beaudreault wrote:
>
> > If this cluster is being used exclusively for this goal, you could just set mapred.tasktracker.map.tasks.maximum to 1.
> >
> > On Tue, Jan 28, 2014 at 5:00 PM, Keith Wiley <kwiley@keithwiley.com> wrote:
> > I'm running a program which in the streaming layer automatically multithreads and does so by automatically detecting the number of cores on the machine. I realize this model is somewhat in conflict with Hadoop, but nonetheless, that's what I'm doing. Thus, for even resource utilization, it would be nice to not only assign one mapper per core, but only one mapper per machine. I realize that if I saturate the cluster none of this really matters, but consider the following example for clarity: 4-core nodes, 10-node cluster, thus 40 slots, fully configured across mappers and reducers (40 slots of each). Say I run this program with just two mappers. It would run much more efficiently (in essentially half the time) if I could force the two mappers to go to slots on two separate machines instead of running the risk that Hadoop may assign them both to the same machine.
> >
> > Can this be done?
> >
> > Thanks.
>
> ________________________________________________________________________________
> Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com
>
> "Luminous beings are we, not this crude matter."
>                                            --  Yoda
> ________________________________________________________________________________
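
For illustration, the custom InputSplit/InputFormat idea in the reply above can be sketched roughly as follows. This is only a minimal model under stated assumptions, not code from the thread or from the book: it uses no Hadoop classes, and the names OneSplitPerHost, PinnedSplit, planSplits and the node01..node03 host names are all hypothetical. In a real job, a custom InputFormat would return splits like these from getSplits(), and each split would report its single host from InputSplit.getLocations(). Note that those locations are locality hints, so the scheduler may still place a task elsewhere.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OneSplitPerHost {

    /** Hypothetical stand-in for a Hadoop InputSplit: it carries only its preferred hosts. */
    static final class PinnedSplit {
        private final String[] locations;

        PinnedSplit(String host) {
            // Exactly one preferred host per split.
            this.locations = new String[] { host };
        }

        // Plays the role of InputSplit.getLocations(): a locality hint, not a guarantee.
        String[] getLocations() {
            return locations.clone();
        }
    }

    /** Gives split i the single preferred host hosts.get(i % hosts.size()). */
    static List<PinnedSplit> planSplits(int numSplits, List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("need at least one host");
        }
        List<PinnedSplit> splits = new ArrayList<>();
        for (int i = 0; i < numSplits; i++) {
            // Round-robin: splits land on distinct hosts until the host list is exhausted.
            splits.add(new PinnedSplit(hosts.get(i % hosts.size())));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed host names, for illustration only.
        List<String> hosts = Arrays.asList("node01", "node02", "node03");
        for (PinnedSplit split : planSplits(2, hosts)) {
            System.out.println(String.join(",", split.getLocations()));
        }
    }
}
```

With two splits and at least two hosts, each split names a different machine, which matches the two-mapper case described in the original question.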