From: Harsh J
Date: Fri, 1 Mar 2013 09:45:23 +0530
Subject: Re: How to make a MapReduce job with no input?
Reply-To: user@hadoop.apache.org

The default # of map tasks is set to 2 (via mapred.map.tasks from
mapred-default.xml) - which explains your 2-map run for even one line
of text.

For running with no inputs, take a look at Sleep Job's EmptySplits
technique on trunk:
http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/SleepJob.java?view=markup
(~line 70)

On Fri, Mar 1, 2013 at 2:46 AM, Mike Spreitzer wrote:
> I am using the mapred API of Hadoop 1.0. I want to make a job that does not
> really depend on any input (the job conf supplies all the info needed in
> Mapper). What is a good way to do this?
>
> What I have done so far is write a job in which MyMapper.configure(..) reads
> all the real input from the JobConf, and MyMapper.map(..) ignores the given
> key and value, writing the output implied by the JobConf. I set the
> InputFormat to TextInputFormat and the input paths to be a list of one
> filename; the named file contains one line of text (the word "one"),
> terminated by a newline. When I run this job (on Linux, hadoop-1.0.0), I
> find it has two map tasks --- one reads the first two bytes of my non-input
> file, and other reads the last two bytes of my non-input file! How can I
> make a job with just one map task?
>
> Thanks,
> Mike

--
Harsh J
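
To make the 2-map behaviour described above concrete, here is a rough sketch of how a map-task hint of 2 can turn a 4-byte file ("one" plus a newline) into two 2-byte splits. It is a simplified model written for illustration, not Hadoop's actual code: the class name SplitSketch, the method splits, the 1.1 slop factor, the 64 MB block size, and the minimum split size of 1 are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, assumed model of how a file-based InputFormat in the old
// mapred API sizes its splits from the map-task hint. Not Hadoop code.
class SplitSketch {
    // Assumed slop factor: a tail up to 10% larger than splitSize is kept whole.
    static final double SPLIT_SLOP = 1.1;

    // Returns {offset, length} pairs for one file.
    static List<long[]> splits(long fileLength, int numSplitsHint,
                               long blockSize, long minSize) {
        // Target size per split: total bytes divided by the requested map count.
        long goalSize = fileLength / (numSplitsHint == 0 ? 1 : numSplitsHint);
        // Clamp the target between the minimum split size and the block size.
        long splitSize = Math.max(minSize, Math.min(goalSize, blockSize));

        List<long[]> result = new ArrayList<>();
        long remaining = fileLength;
        // Cut full-size splits while clearly more than one split's worth remains.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            result.add(new long[] {fileLength - remaining, splitSize});
            remaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (remaining != 0) {
            result.add(new long[] {fileLength - remaining, remaining});
        }
        return result;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // assumed block size
        for (int hint : new int[] {2, 1}) {
            List<long[]> s = splits(4, hint, blockSize, 1);
            System.out.println("hint=" + hint + " -> " + s.size() + " split(s)");
        }
    }
}
```

Under this model a hint of 2 gives splits (0, 2) and (2, 2), which matches the two maps that each read two bytes, while a hint of 1 gives a single split covering all four bytes. If the real split logic behaves the same way, lowering the hint (for example with JobConf.setNumMapTasks(1)) would likely give one map for the one-line file; the EmptySplits approach avoids needing an input file at all.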