Subject: Re: wordcount getting slower with more mappers and reducers?
From: Sandy <snickerdoodle08@gmail.com>
To: core-user@hadoop.apache.org
Date: Thu, 5 Mar 2009 12:22:22 -0600

Arun,

How can I check the number of slots per tasktracker? Which parameter
controls that?

Thanks,
-SM

On Thu, Mar 5, 2009 at 12:14 PM, Arun C Murthy wrote:

> I assume you have only 2 map and 2 reduce slots per tasktracker, which
> totals 2 maps/reduces for your cluster. This means that with more
> maps/reduces they are serialized to 2 at a time.
>
> Also, the -m is only a hint to the JobTracker; you might see fewer or
> more maps than the number you have specified on the command line.
> The -r, however, is followed faithfully.
>
> Arun
>
> On Mar 4, 2009, at 2:46 PM, Sandy wrote:
>
>> Hello all,
>>
>> For the sake of benchmarking, I ran the standard hadoop wordcount
>> example on an input file using 2, 4, and 8 mappers and reducers for
>> my job. In other words, I do:
>>
>> time -p bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -m 2 -r 2
>> sample.txt output
>> time -p bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -m 4 -r 4
>> sample.txt output2
>> time -p bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -m 8 -r 8
>> sample.txt output3
>>
>> Strangely enough, this increase in mappers and reducers results in
>> slower running times:
>> - on 2 mappers and reducers it ran for 40 seconds
>> - on 4 mappers and reducers it ran for 60 seconds
>> - on 8 mappers and reducers it ran for 90 seconds!
>>
>> Please note that the "sample.txt" file is identical in each of these
>> runs.
>>
>> I have the following questions:
>> - Shouldn't wordcount get -faster- with additional mappers and
>> reducers, instead of slower?
>> - If it does get faster for other people, why does it become slower
>> for me? I am running Hadoop in pseudo-distributed mode on a single
>> 64-bit Mac Pro with 2 quad-core processors, 16 GB of RAM, and 4 1TB
>> HDs.
>>
>> I would greatly appreciate it if someone could explain this behavior
>> to me, and tell me if I'm running this wrong. How can I change my
>> settings (if at all) to get wordcount running faster when I increase
>> the number of maps and reduces?
>>
>> Thanks,
>> -SM
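
The slot counts Arun refers to are controlled per tasktracker by the
mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum properties, both of which
default to 2 in the 0.18 line. A minimal sketch of raising them in
conf/hadoop-site.xml for an 8-core machine follows; the value of 8 is an
assumption for illustration, not a tuned recommendation:

<!-- conf/hadoop-site.xml (Hadoop 0.18): per-tasktracker slot limits.
     The values below are illustrative only. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>   <!-- allow up to 8 concurrent map tasks on this node -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>8</value>   <!-- allow up to 8 concurrent reduce tasks on this node -->
</property>

The tasktracker reads these settings at startup, so the daemons need a
restart after editing. To confirm how many tasks actually run
concurrently, watch the JobTracker web UI (http://localhost:50030 by
default), which shows running maps/reduces against the cluster's slot
capacity.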