From: Michael Segel
To: common-user@hadoop.apache.org
Subject: RE: program running faster on single node than cluster
Date: Thu, 18 Nov 2010 10:26:55 -0600
Well... There could be a couple of reasons for this...

If you have a small data set, say it fits into a single block, and you have 12 nodes each with 10 mappers, that's 120 potential mappers going against the same block, right? That's assuming you're splitting the file into 120 pieces.

Tuning: How big are your nodes? How many cores and how much memory?
While some tune to the number of virtual cores, I tend to be conservative and tune to the actual number of cores. The reason is that you have to consider how much memory you have on the box. Since we run HBase along with the TaskTracker (TT) and DataNode (DN), memory gets used pretty quickly. If your nodes are swapping... that could hurt too.

Just an example... suppose you have 8 GB on an 8-core box. You set up the TT and DN each with 1 GB, so you have 6 GB left. Assuming 1 GB per mapper/reducer, you're looking at 4 mappers / 2 reducers per node. Again, YMMV (your mileage may vary); these are just rough numbers. It's always safer to start with a lower number and monitor your system via Ganglia to see how to tune up. ('Cause you really, really, ... really don't want to swap.)

All IMHO
-Mike

> Date: Wed, 17 Nov 2010 14:53:12 +0530
> Subject: Re: program running faster on single node than cluster
> From: hsreekumar@clickable.com
> To: common-user@hadoop.apache.org
>
> Are all the nodes being used? Go to port 50030 on the JobTracker web interface
> after starting the job, and check whether the tasks are progressing together
> on all nodes or not.
>
> hari
>
> On Wed, Nov 17, 2010 at 9:14 AM, Cornelio Iñigo wrote:
>
> > Hi
> >
> > I have a question for you:
> >
> > I developed a program using Hadoop. It has one map function and one reduce
> > function (like WordCount), and in the map function I do all the processing of
> > my data.
> > When I run this program on a single-node machine it takes about 7 minutes
> > (it's a small dataset), and on a pseudo-distributed machine it takes about 7 minutes
> > too, but when I run it on a
> > fully distributed cluster (12 nodes) it takes much longer, like an hour!!
> >
> > I tried changing the mapred.tasktracker.map.tasks.maximum and
> > mapred.tasktracker.reduce.tasks.maximum variables (2 and 2 like the default, 10
> > and 2, 2 and 10, 5 and 5) and the results are the same.
> > Am I missing something?
> > Is this a cluster configuration issue, or is it in my program?
> >
> > Thanks
> >
> > --
> > *Cornelio*
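[Editor's note: the slot arithmetic in Mike's example above can be sketched as below. All numbers are illustrative only, taken from his hypothetical 8 GB / 8-core node; the helper function is not part of Hadoop, just a back-of-the-envelope check.]

```python
# Rough slot-count arithmetic from the example above: an 8 GB, 8-core node
# running a TaskTracker and DataNode at 1 GB each, with ~1 GB budgeted per
# map/reduce task. (Hypothetical numbers, not a recommendation.)

def max_task_slots(total_gb, daemon_gb, per_task_gb):
    """How many tasks fit in the memory left after the daemons."""
    return int((total_gb - daemon_gb) // per_task_gb)

# TT (1 GB) + DN (1 GB) = 2 GB of daemons, 1 GB per task:
slots = max_task_slots(total_gb=8, daemon_gb=2, per_task_gb=1)
print(slots)  # 6 slots total, e.g. split as 4 mappers / 2 reducers
```

That 4 / 2 split would then go into mapred-site.xml as mapred.tasktracker.map.tasks.maximum = 4 and mapred.tasktracker.reduce.tasks.maximum = 2 — the same per-tasktracker properties the original poster was already tuning.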