From: Hari Sreekumar
To: common-user@hadoop.apache.org
Subject: Re: program running faster on single node than cluster
Date: Wed, 17 Nov 2010 14:53:12 +0530

Are all the nodes being used? Go to :50030 on the web interface after starting the job, and check whether the tasks are progressing together on all nodes or not.

hari

On Wed, Nov 17, 2010 at 9:14 AM, Cornelio Iñigo wrote:
> Hi
>
> I have a question for you:
>
> I developed a program using Hadoop. It has one map function and one reduce
> function (like WordCount), and in the map function I do all the processing
> of my data.
> When I run this program on a single-node machine it takes about 7 minutes
> (it's a small dataset); on a pseudo-distributed machine it takes about
> 7 minutes too. But when I run it on a fully distributed cluster (12 nodes)
> it takes much longer, about an hour!
>
> I tried changing the mapred.tasktracker.map.tasks.maximum and
> mapred.tasktracker.reduce.tasks.maximum variables (2 and 2 as by default,
> 10 and 2, 2 and 10, 5 and 5) and the results are the same.
> Am I missing something?
> Is this a cluster configuration issue, or is it in my program?
>
> Thanks
>
> --
> *Cornelio*
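
One thing worth checking alongside the web interface is how many map tasks the job can have in the first place. The sketch below is only a rough back-of-the-envelope estimate, not Hadoop code: it assumes file-based input that is split roughly once per block, a 64 MB split size, and hypothetical helper names (estimate_map_tasks, map_slots, concurrent_maps) that are not part of any Hadoop API. Under those assumptions, a small dataset yields very few map tasks, so raising mapred.tasktracker.map.tasks.maximum adds slots that stay empty.

```python
import math

MB = 1024 * 1024


def estimate_map_tasks(file_sizes_bytes, split_size_bytes=64 * MB):
    """Rough estimate: each input file yields at least one split, and
    about one split per split_size_bytes of data (assumed 64 MB here).
    Real input formats may differ slightly at the boundaries."""
    return sum(
        max(1, math.ceil(size / split_size_bytes))
        for size in file_sizes_bytes
    )


def map_slots(nodes, map_tasks_maximum):
    """Map slots in the cluster: nodes times the per-node value of
    mapred.tasktracker.map.tasks.maximum."""
    return nodes * map_tasks_maximum


def concurrent_maps(file_sizes_bytes, nodes, map_tasks_maximum,
                    split_size_bytes=64 * MB):
    """Map tasks that can run at once: limited by whichever is smaller,
    the number of tasks or the number of slots."""
    tasks = estimate_map_tasks(file_sizes_bytes, split_size_bytes)
    return min(tasks, map_slots(nodes, map_tasks_maximum))


if __name__ == "__main__":
    small_input = [20 * MB]  # hypothetical single 20 MB input file
    for maximum in (2, 5, 10):
        print(maximum, concurrent_maps(small_input, nodes=12,
                                       map_tasks_maximum=maximum))
```

If the estimate comes out as one or two map tasks, the slot settings cannot change the run time, and the long run on the cluster has to come from somewhere else, which is what the per-node task view on the web interface should help narrow down.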