From: anil gupta
Date: Wed, 26 Dec 2012 10:41:43 -0800
Subject: Re: Setting number of mappers in Teragen
To: user@hadoop.apache.org

Hi Harsh,

Fixed it. I was putting -Dmapred.map.tasks=20 after specifying the input directory. I completely forgot about this trick of Hadoop's GenericOptionsParser. Thanks a lot. :)

On Wed, Dec 26, 2012 at 10:33 AM, Harsh J <harsh@cloudera.com> wrote:
In MR1, the number of teragen mappers depends on the total number of rows and
the requested number of maps.

How exactly are you passing -Dmapred.map.tasks=20 (no spaces)? All
generic options must come before any other options, so it should
appear right after the word "teragen" in your command.
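
For illustration, here is a minimal sketch of the ordering described above. The jar name, row count, and output directory are placeholder assumptions, not values from this thread. The script only prints the command so it can be reviewed before it is run on a real cluster.

```shell
# Hypothetical values: adjust the jar path, row count, and output
# directory for your own cluster.
EXAMPLES_JAR="hadoop-examples.jar"
NUM_ROWS=10000000
OUTPUT_DIR="/tmp/teragen-out"

# Generic options (-D...) go right after the program name "teragen",
# before the positional arguments, with no spaces around "=".
# Not: teragen $NUM_ROWS $OUTPUT_DIR -Dmapred.map.tasks=20
TERAGEN_CMD="hadoop jar $EXAMPLES_JAR teragen -Dmapred.map.tasks=20 $NUM_ROWS $OUTPUT_DIR"

# Print the command for review; to run it: eval "$TERAGEN_CMD"
echo "$TERAGEN_CMD"
```

Keeping the command in a variable makes it easy to check the argument order before submitting the job.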

On Wed, Dec 26, 2012 at 11:49 PM, anil gupta <anilgupta84@gmail.com> wrote:
> Hi All,
>
> I have 5 worker nodes with 4 map slots per node, so I have 20 map
> slots in my cluster. But when I start my Teragen job, it spawns only 2
> mappers for the entire job. I have even tried using the option
> -Dmapred.map.tasks = 20 . Can anyone tell me how to force teragen to use 20
> mappers for generating the data? I am using cdh4.1.2 with MapReduce v1 (Hadoop
> 0.20.2).
> --
> Thanks & Regards,
> Anil Gupta



--
Harsh J



--
Thanks & Regards,
Anil Gupta