From: Robert Evans
To: "mapreduce-user@hadoop.apache.org"
Date: Fri, 15 Apr 2011 13:20:14 -0700
Subject: Re: successive mappers

I,

Take a look at the multiple output format classes. This one is a good example:

http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/mapred/lib/MultipleTextOutputFormat.html

You should be able to create a custom output format class that matches your needs. Although, if all you are doing is map processing, why are you outputting intermediate results instead of processing them all in a single mapper? It should be a lot faster if you don’t need the intermediate results.
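
A minimal sketch of the routing idea, in plain Java with no Hadoop dependency. It only mirrors the shape of the generateFileNameForKeyValue(key, value, name) hook that MultipleTextOutputFormat subclasses can override. It assumes the mapper has already evaluated condition C and recorded the outcome as a hypothetical "E:" or "D:" prefix on the key; the class name, method name, and prefixes are illustrative only. My understanding is that a returned name containing a directory component ends up under that subdirectory of the job output path, but check this against your Hadoop version.

```java
// Plain-Java sketch; RoutingSketch and the "E:"/"D:" key prefixes are hypothetical.
final class RoutingSketch {

    // Same parameter shape as generateFileNameForKeyValue(key, value, name):
    // returns a file name relative to the job output directory.
    static String fileNameFor(String key, String value, String name) {
        // Assumption: the mapper marked records that satisfy condition C with "E:".
        String dir = key.startsWith("E:") ? "E" : "D";
        return dir + "/" + name;
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor("E:a", "1", "part-00000"));
        System.out.println(fileNameFor("D:b", "8", "part-00000"));
    }
}
```

In a real job this logic would sit in the overridden hook of a MultipleTextOutputFormat subclass, and the next map job would take the D subdirectory as its input.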

--Bobby Evans

On 4/15/11 2:05 PM, "Injun Joe" <ll_oz_ll@yahoo.com.hk> wrote:

Hi,
I am coding a map-reduce program which involves several map-reduce steps. The work that my program does is only in the mapper, so I was thinking of having no reduce steps but successive mappers. The logic can be written like this for mappers at iteration 0 and 1:

1. Take input.
2. Map 0:
   Determine if a key-value pair satisfies condition C.
    - If it satisfies the condition, output the key-value pair to a file in directory E.
    - If it does not, transform the key-value pair and output it to directory D.
3. Map 1:
   - Change input directory to directory D
   - Perform same steps as map 0.
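
The steps above can be sketched in plain Java, with in-memory lists standing in for directories E and D. Everything here is an assumption made for illustration: condition C is "value is at most 1", the transform halves the value, keys are omitted for brevity, and no Hadoop classes are involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch; class, condition, and transform are hypothetical.
final class SuccessiveMapSketch {

    // Hypothetical condition C.
    static boolean satisfiesC(int value) {
        return value <= 1;
    }

    // Hypothetical transform applied to records that fail condition C.
    static int transform(int value) {
        return value / 2;
    }

    // One map pass: records satisfying C are appended to e ("directory E");
    // the rest are transformed and returned as "directory D".
    static List<Integer> mapPass(List<Integer> input, List<Integer> e) {
        List<Integer> d = new ArrayList<>();
        for (int v : input) {
            if (satisfiesC(v)) {
                e.add(v);
            } else {
                d.add(transform(v));
            }
        }
        return d;
    }

    // Feeds D back in as the next input until it is empty; returns the pass count.
    static int run(List<Integer> input, List<Integer> e) {
        int passes = 0;
        List<Integer> current = input;
        while (!current.isEmpty()) {
            current = mapPass(current, e);
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        List<Integer> e = new ArrayList<>();
        int passes = run(Arrays.asList(1, 4, 9), e);
        System.out.println("passes=" + passes + " E=" + e);
    }
}
```

Each call to mapPass corresponds to one map job in the scheme above; the loop in run plays the role of switching the input directory to D between jobs.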

So, the problem is that I have not been able to find a way to output key-value pairs to different directories. All I have been able to specify is the map output directory via TextOutputFormat.setOutputPath.

Any help would be appreciated.

Thanks a lot
I

