From: Jeff Kubina <jeff.kubina@gmail.com>
To: user@hadoop.apache.org
Date: Thu, 28 Feb 2013 20:41:05 -0500
Subject: Re: How to make a MapReduce job with no input?

Mike,

To do this for the more general case of creating N map tasks, with each task receiving the single record <i, N> where i ranges from 0 to N-1, I wrote custom InputFormat, InputSplit, and RecordReader Hadoop classes. The sample code is here. I think I wrote those for Hadoop 0.19, so they may need some tweaking for subsequent versions.
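In case the link goes stale, here is a rough sketch of the idea against the old org.apache.hadoop.mapred API that you are using. This is not the linked sample itself; the class name NSplitInputFormat and the "nsplit.count" property are invented for illustration.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapred.*;

public class NSplitInputFormat implements InputFormat<IntWritable, IntWritable> {

  /** A split that carries only its index; it maps to no file and no host. */
  public static class IndexSplit implements InputSplit {
    private int index;
    public IndexSplit() { }                        // needed for reflection
    public IndexSplit(int index) { this.index = index; }
    public int getIndex() { return index; }
    public long getLength() { return 0; }          // no bytes to read
    public String[] getLocations() { return new String[0]; } // no locality
    public void write(DataOutput out) throws IOException { out.writeInt(index); }
    public void readFields(DataInput in) throws IOException { index = in.readInt(); }
  }

  public InputSplit[] getSplits(JobConf job, int numSplits) {
    // Ignore the framework's hint; fabricate exactly N splits, no input files.
    int n = job.getInt("nsplit.count", 1);
    InputSplit[] splits = new InputSplit[n];
    for (int i = 0; i < n; i++) splits[i] = new IndexSplit(i);
    return splits;
  }

  public RecordReader<IntWritable, IntWritable> getRecordReader(
      InputSplit split, JobConf job, Reporter reporter) {
    final int i = ((IndexSplit) split).getIndex();
    final int n = job.getInt("nsplit.count", 1);
    return new RecordReader<IntWritable, IntWritable>() {
      private boolean done = false;
      public boolean next(IntWritable key, IntWritable value) {
        if (done) return false;                    // each task sees one record
        key.set(i);
        value.set(n);
        done = true;
        return true;
      }
      public IntWritable createKey() { return new IntWritable(); }
      public IntWritable createValue() { return new IntWritable(); }
      public long getPos() { return done ? 1 : 0; }
      public float getProgress() { return done ? 1.0f : 0.0f; }
      public void close() { }
    };
  }
}

Set "nsplit.count" to N in the JobConf and the job runs exactly N map tasks, each seeing the one record <i, N>, with no input files at all.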
Jeff

On Thu, Feb 28, 2013 at 4:25 PM, Mike Spreitzer <mspreitz@us.ibm.com> wrote:

> On closer inspection, I see that of my two tasks, the first processes 1
> input record and the other processes 0 input records. So I think this
> solution is correct. But perhaps it is not the most direct way to get the
> job done?
>
> From: Mike Spreitzer/Watson/IBM@IBMUS
> To: user@hadoop.apache.org
> Date: 02/28/2013 04:18 PM
> Subject: How to make a MapReduce job with no input?
>
> I am using the mapred API of Hadoop 1.0. I want to make a job that does
> not really depend on any input (the job conf supplies all the info needed
> in the Mapper). What is a good way to do this?
>
> What I have done so far is write a job in which MyMapper.configure(..)
> reads all the real input from the JobConf, and MyMapper.map(..) ignores
> the given key and value, writing the output implied by the JobConf. I set
> the InputFormat to TextInputFormat and the input paths to a list of one
> filename; the named file contains one line of text (the word "one"),
> terminated by a newline. When I run this job (on Linux, hadoop-1.0.0), I
> find it has two map tasks: one reads the first two bytes of my non-input
> file, and the other reads the last two bytes of my non-input file! How
> can I make a job with just one map task?
>
> Thanks,
> Mike
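P.S. The two map tasks are most likely FileInputFormat's doing: in Hadoop 1.0 it aims for mapred.map.tasks splits (default 2), so your 4-byte file gets cut into two 2-byte splits, which matches exactly what you saw. If you stick with your one-line-file workaround, asking for a single map task should collapse it to one split. A minimal driver sketch, assuming your existing MyMapper and with placeholder paths:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

public class OneMapTaskDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(OneMapTaskDriver.class);
    conf.setJobName("no-real-input");
    conf.setInputFormat(TextInputFormat.class);
    FileInputFormat.setInputPaths(conf, new Path("one-line-file.txt")); // placeholder
    FileOutputFormat.setOutputPath(conf, new Path("out"));              // placeholder
    // With the default hint of 2 map tasks, FileInputFormat computes
    // goalSize = totalSize / 2 = 2 bytes and produces two 2-byte splits.
    // A hint of 1 makes goalSize the whole file, yielding a single split.
    conf.setNumMapTasks(1);
    // conf.setMapperClass(MyMapper.class);  // your mapper goes here
    JobClient.runJob(conf);
  }
}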