From: Shirley Cohen
To: core-user@hadoop.apache.org
Cc: Benjamin Gufler
Subject: Re: MultipleOutputFormat versus MultipleOutputs
Date: Fri, 29 Aug 2008 10:13:49 -0500
In-Reply-To: <48B6A227.1020800@in.tum.de>
Message-Id: <0EFDE18E-65C9-4CB4-809E-B0F5DBB889A5@cis.upenn.edu>

Thanks, Benjamin. Your example saved me a lot of time :))

Shirley

On Aug 28, 2008, at 8:03 AM, Benjamin Gufler wrote:

> Hi Shirley,
>
> On 2008-08-28 14:32, Shirley Cohen wrote:
>> Do you have an example that shows how to use MultipleOutputFormat?
>
> Using MultipleOutputFormat is actually pretty easy. Derive a class from
> it and, if you want to base the destination file name on the key and/or
> value, override the method "generateFileNameForKeyValue". I'm using it
> this way:
>
>     protected String generateFileNameForKeyValue(K key, V value,
>             String name) {
>         return name + "-" + key.toString();
>     }
>
> Take care not to generate too many different file names, however: all
> the files are kept open until the Reducer terminates, and operating
> systems usually impose a limit on the number of files you can have open.
>
> Also, if you haven't done so yet, please upgrade to the latest release,
> 0.18, if you want to use MultipleOutputFormat. Up to 0.17.2, there was
> some trouble with Reducers having more than one output file (see
> HADOOP-3639 for the details).
>
> Benjamin
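
To make the naming rule and the open-file warning concrete, here is a minimal stand-alone sketch in plain Java. It does not use any Hadoop classes: KeyedFileNames, route, and the sample records are hypothetical names made up for illustration, and "part-00000" is only an assumed value for the name argument. In a real job the method would be an instance method overridden in a subclass of MultipleOutputFormat, as Benjamin describes. The sketch only shows that each distinct key yields one distinct file name, which is why the number of distinct keys bounds the number of files held open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Stand-alone illustration only; this is NOT Hadoop's MultipleOutputFormat.
class KeyedFileNames {

    // Same naming rule as in the quoted message: base name + "-" + key.
    static <K, V> String generateFileNameForKeyValue(K key, V value, String name) {
        return name + "-" + key.toString();
    }

    // Groups the values of {key, value} pairs by the file name they map to.
    static Map<String, List<String>> route(List<String[]> records, String name) {
        Map<String, List<String>> files = new TreeMap<>();
        for (String[] kv : records) {
            String file = generateFileNameForKeyValue(kv[0], kv[1], name);
            files.computeIfAbsent(file, f -> new ArrayList<>()).add(kv[1]);
        }
        return files;
    }

    public static void main(String[] args) {
        // Hypothetical sample records: {key, value}.
        List<String[]> records = Arrays.asList(
                new String[] {"de", "Munich"},
                new String[] {"us", "Austin"},
                new String[] {"de", "Berlin"});
        Map<String, List<String>> files = route(records, "part-00000");
        // Two distinct keys, so two distinct files:
        // part-00000-de -> [Munich, Berlin]
        // part-00000-us -> [Austin]
        files.forEach((file, values) -> System.out.println(file + " -> " + values));
    }
}
```

If the key were something high-cardinality, such as a user ID, the map above would grow to one entry per user; in the real output format that would correspond to one open file per user, which is the situation the warning is about.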