Date: Tue, 2 Oct 2012 22:33:48 +0530
Subject: Re: How to lower the total number of map tasks
From: Bejoy Ks <bejoy.hadoop@gmail.com>
To: user@hadoop.apache.org, Shing Hing Man <matmsh@yahoo.com>

Sorry for the typo, the property name is mapred.max.split.size.

Also, just to change the number of map tasks, you don't need to modify the HDFS block size.

On Tue, Oct 2, 2012 at 10:31 PM, Bejoy Ks <bejoy.hadoop@gmail.com> wrote:

> Hi
>
> You need to alter the value of mapred.max.split size to a value larger
> than your block size to have fewer map tasks than the default.
>
> On Tue, Oct 2, 2012 at 10:04 PM, Shing Hing Man <matmsh@yahoo.com> wrote:
>
>> I am running Hadoop 1.0.3 in pseudo-distributed mode.
>> When I submit a map/reduce job to process a file of about 16 GB,
>> job.xml contains the following:
>>
>> mapred.map.tasks = 242
>> mapred.min.split.size = 0
>> dfs.block.size = 67108864
>>
>> I would like to reduce mapred.map.tasks to see if it improves
>> performance. I have tried doubling dfs.block.size, but
>> mapred.map.tasks remains unchanged.
>> Is there a way to reduce mapred.map.tasks?
>>
>> Thanks in advance for any assistance!
>> Shing
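
For reference, the split-size arithmetic behind the numbers quoted above can be sketched roughly as follows. This is only an illustration, not Hadoop code: it assumes the rule used by the new-API FileInputFormat, splitSize = max(minSize, min(maxSize, blockSize)), and it assumes a hypothetical file size of exactly 242 blocks of 67108864 bytes so that it matches the 242 map tasks reported. The function names are made up for this sketch, and the task count ignores Hadoop's handling of a small trailing split.

```python
# Rough sketch of FileInputFormat-style split sizing (illustrative only).
# Assumption: splitSize = max(minSize, min(maxSize, blockSize)).

BLOCK_SIZE = 67108864              # dfs.block.size from the job.xml quoted above
FILE_SIZE = 242 * BLOCK_SIZE       # hypothetical size chosen to match 242 map tasks
LONG_MAX = 2**63 - 1               # stand-in for an unset maximum split size


def compute_split_size(block_size: int, min_size: int, max_size: int) -> int:
    """Return the split size under the assumed max/min rule."""
    return max(min_size, min(max_size, block_size))


def estimate_map_tasks(file_size: int, split_size: int) -> int:
    """Approximate the number of map tasks as ceil(file_size / split_size)."""
    return -(-file_size // split_size)


if __name__ == "__main__":
    # Settings as reported: minimum 0, maximum unset.
    default_split = compute_split_size(BLOCK_SIZE, 0, LONG_MAX)
    # Only the maximum raised to twice the block size.
    max_raised = compute_split_size(BLOCK_SIZE, 0, 2 * BLOCK_SIZE)
    # Only the minimum raised to twice the block size.
    min_raised = compute_split_size(BLOCK_SIZE, 2 * BLOCK_SIZE, LONG_MAX)

    for label, split in [("default", default_split),
                         ("max raised", max_raised),
                         ("min raised", min_raised)]:
        print(label, split, estimate_map_tasks(FILE_SIZE, split))
```

Under that assumed rule, a maximum split size above the block size leaves the split size at the block size, whereas a minimum split size (mapred.min.split.size) above the block size raises it and so lowers the map count. It may also be worth noting that dfs.block.size applies when a file is written, so changing it does not re-block a file already stored in HDFS, which would be consistent with mapred.map.tasks staying at 242.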