Subject: RE: Chaining M/R Jobs
Date: Mon, 26 Apr 2010 11:39:27 -0700
From: Xavier Stevens
Message-ID: <1CE1A36174C8CD41B8B261BBD3C66DC001769CBE@fegcnmsexmb07.ffe.foxeg.com>

I don't usually bother renaming the files. If you know you want all of the files, you can just iterate over the files in the previous job's output directory and add each one to the distributed cache. If the data is fairly small, you can also set the number of reducers to 1 on the previous step, so that it writes a single part file.

-Xavier

-----Original Message-----
From: Eric Sammer [mailto:esammer@cloudera.com]
Sent: Monday, April 26, 2010 11:33 AM
To: common-user@hadoop.apache.org
Subject: Re: Chaining M/R Jobs

The easiest way to do this is to write your job outputs to a known place and then use the FileSystem APIs to rename the part-* files to what you want them to be.

On Mon, Apr 26, 2010 at 2:22 PM, Tiago Veloso wrote:
> Hi,
>
> I'm trying to find a way to control the output file names. I need this because I have a situation where I need to run a job and then use its output in the DistributedCache.
>
> So far, the only way I've found is to rewrite the OutputFormat class, but that seems like a lot of work for such a simple task. Is there any way to do what I'm looking for?
>
> Tiago Veloso
> ti.veloso@gmail.com

--
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
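
For reference, both suggestions in this thread can be sketched with the Hadoop FileSystem API. This is a minimal sketch, not code from any of the posters: the class name ChainedJobOutputs, the method names, and the target file name passed to renameSinglePart are hypothetical, and it assumes the previous job wrote its results as part-* files directly under its output directory. DistributedCache.addCacheFile is the older static API and is deprecated in later Hadoop releases.

```java
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class; none of these names come from the thread.
public class ChainedJobOutputs {

    /**
     * Xavier's approach: add every part-* file in the previous job's
     * output directory to the distributed cache of the next job.
     * Returns the URIs that were added.
     */
    public static List<URI> addOutputsToCache(Path outputDir, Configuration nextJobConf)
            throws IOException {
        FileSystem fs = outputDir.getFileSystem(nextJobConf);
        List<URI> added = new ArrayList<URI>();
        // The glob skips marker files such as _SUCCESS and the _logs directory.
        FileStatus[] parts = fs.globStatus(new Path(outputDir, "part-*"));
        if (parts == null) {
            return added; // nothing matched
        }
        for (FileStatus part : parts) {
            URI uri = part.getPath().toUri();
            DistributedCache.addCacheFile(uri, nextJobConf);
            added.add(uri);
        }
        return added;
    }

    /**
     * Eric's approach: rename the part file to a known name. Assumes the
     * previous job ran with a single reducer, so exactly one part file exists.
     */
    public static boolean renameSinglePart(Path outputDir, String newName, Configuration conf)
            throws IOException {
        FileSystem fs = outputDir.getFileSystem(conf);
        FileStatus[] parts = fs.globStatus(new Path(outputDir, "part-*"));
        if (parts == null || parts.length != 1) {
            throw new IOException("Expected exactly one part file in " + outputDir);
        }
        return fs.rename(parts[0].getPath(), new Path(outputDir, newName));
    }
}
```

Either method would be called between the two jobs, after the first job has completed and before the second is submitted. Calling setNumReduceTasks(1) on the first job is what makes the single-file assumption in renameSinglePart hold.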