From: Arun C Murthy
To: core-user@hadoop.apache.org
Subject: Re: Performance question
Date: Mon, 20 Apr 2009 20:54:46 +0530
Mailing-List: core-user@hadoop.apache.org
On Apr 20, 2009, at 9:56 AM, Mark Kerzner wrote:

> Hi,
>
> I ran a Hadoop MapReduce task in the local mode, reading and writing from
> HDFS, and it took 2.5 minutes. Essentially the same operations on the local
> file system without MapReduce took 1/2 minute. Is this to be expected?

Hmm... some overhead is expected, but this seems too much. What version of Hadoop are you running?

It's hard to help without more details about your application, configuration etc., but I'll try...

> It seemed that the system lost most of the time in the MapReduce operation,
> such as after these messages
>
> 09/04/19 23:23:01 INFO mapred.LocalJobRunner: reduce > reduce
> 09/04/19 23:23:01 INFO mapred.JobClient:  map 100% reduce 92%
> 09/04/19 23:23:04 INFO mapred.LocalJobRunner: reduce > reduce
>
> it waited for a long time. The final output lines were

It could either be the reduce-side merge or the hdfs-write. Can you check your task-logs and data-node logs?

> 09/04/19 23:24:13 INFO mapred.JobClient:   Combine input records=185
> 09/04/19 23:24:13 INFO mapred.JobClient:   Combine output records=185

That shows that the combiner is useless for this app; turn it off - it adds unnecessary overhead.

> 09/04/19 23:24:13 INFO mapred.JobClient: File Systems
> 09/04/19 23:24:13 INFO mapred.JobClient:   HDFS bytes read=138103444
> 09/04/19 23:24:13 INFO mapred.JobClient:   HDFS bytes written=107357785
> 09/04/19 23:24:13 INFO mapred.JobClient:   Local bytes read=282509133
> 09/04/19 23:24:13 INFO mapred.JobClient:   Local bytes written=376697552

For the amount of data you are processing, you are doing far too much local-disk i/o.
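(A back-of-the-envelope sketch of why, not Hadoop's actual accounting: when the map-side sort buffer is small relative to the map output, the output is spilled to disk in many pieces and then re-read and re-written during merging, multiplying the local bytes. The knob names io.sort.mb and io.sort.factor are the real Hadoop parameters; the arithmetic below is a simplification.)

```python
import math

# Rough model (a simplification, NOT Hadoop's real bookkeeping) of the
# local bytes one map task writes when its sort buffer is too small.
def map_side_local_io_mb(map_output_mb, io_sort_mb, io_sort_factor,
                         spill_threshold=0.8):
    """Estimated MB written to local disk by a single map task."""
    # The in-memory buffer spills to disk once it is ~80% full.
    spill_size = io_sort_mb * spill_threshold
    spills = max(1, math.ceil(map_output_mb / spill_size))
    # Spill files are merged io_sort_factor at a time; each extra merge
    # pass re-writes the entire map output once more.
    merge_passes = 0
    files = spills
    while files > 1:
        files = math.ceil(files / io_sort_factor)
        merge_passes += 1
    return map_output_mb * (1 + merge_passes)

# 91 MB of map output: a roomy buffer writes it to disk exactly once,
# while a tiny buffer causes a dozen spills plus two merge passes,
# i.e. roughly 3x the local i/o.
print(map_side_local_io_mb(91, io_sort_mb=128, io_sort_factor=10))  # 91
print(map_side_local_io_mb(91, io_sort_mb=10, io_sort_factor=10))   # 273
```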
'Local bytes written' should be _very_ close to the 'Map output bytes', i.e. 91M for maps and zero for reduces. (See the counters table on the job-details web-ui.)

There are a few knobs you need to tweak to get closer to optimal performance; the good news is that it's doable - the bad news is that one _has_ to get his/her fingers dirty...

Some knobs you will be interested in are:

Map-side:
* io.sort.mb
* io.sort.factor
* io.sort.record.percent
* io.sort.spill.percent

Reduce-side:
* mapred.reduce.parallel.copies
* mapred.reduce.copy.backoff
* mapred.job.shuffle.input.buffer.percent
* mapred.job.shuffle.merge.percent
* mapred.inmem.merge.threshold
* mapred.job.reduce.input.buffer.percent

Check the description of each of them in hadoop-default.xml or mapred-default.xml (depending on the version of Hadoop you are running).

Some more details are available here:
http://wiki.apache.org/hadoop-data/attachments/HadoopPresentations/attachments/TuningAndDebuggingMapReduce_ApacheConEU09.pdf

hth,
Arun
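P.S.: For concreteness, overrides for a few of these knobs might look like the following in a 0.19/0.20-era hadoop-site.xml or mapred-site.xml. The values are illustrative starting points only, not recommendations - tune against your own job's counters.

```xml
<!-- Illustrative values only; tune against your own job's counters. -->
<property>
  <name>io.sort.mb</name>
  <value>200</value>   <!-- map-side sort buffer, in MB -->
</property>
<property>
  <name>io.sort.factor</name>
  <value>100</value>   <!-- max number of spill files merged at once -->
</property>
<property>
  <name>mapred.job.reduce.input.buffer.percent</name>
  <value>0.7</value>   <!-- keep reduce inputs in memory when they fit -->
</property>
```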