From: "Runping Qi"
To: hadoop-dev@lucene.apache.org
Subject: RE: Different Key/Value classes for Map and Reduce?
Date: Fri, 31 Mar 2006 07:21:34 -0800

A simple fix is to add two more attributes to the JobConf class: mapOutputKeyClass and mapOutputValueClass. That would let the user have different key/value classes for the intermediate (map) outputs and the final (reduce) outputs.

I'll file a bug for this problem.

Runping

-----Original Message-----
From: Darek Zbik [mailto:d.zbik@softwaremind.pl]
Sent: Friday, March 31, 2006 4:28 AM
To: hadoop-dev@lucene.apache.org
Subject: Re: Different Key/Value classes for Map and Reduce?

Runping Qi wrote:

>When the reducers write the final results out, the output format is obtained
>from the job object. By default it is TextOutputFormat, and there are no conflicts.
>However, if one wants to use SequenceFileOutputFormat for the final results,
>then the key/value classes are also obtained from the job object, the same
>as for the map tasks' output. Now we have a problem. It is impossible for the
>map outputs and reducer outputs to use different key/value classes if one
>wants the reducers to generate outputs in SequenceFileOutputFormat.
>

I have this problem in a real situation. I solved it by creating my own output format, which is in fact a copy-paste of SequenceFileOutputFormat with small changes (it simply takes the output classes from another job property of my own). I think each Hadoop job should be able to specify the output key/value classes of the reduce task (e.g. {set,get}ReducerOutput{Key,Value}).

darek
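
For illustration, here is a minimal sketch of the fix proposed at the top of this message: separate map-output key/value class attributes on the job object. SketchJobConf, its property names, and the fallback to the final output classes when no map-output class is set are all assumptions made for this example; this is not the real JobConf class or its API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a job configuration; NOT the real Hadoop JobConf. */
class SketchJobConf {
    // Property names below are made up for this sketch.
    private final Map<String, Class<?>> classes = new HashMap<>();

    // Classes used by the reducers when writing the final output.
    public void setOutputKeyClass(Class<?> c)   { classes.put("output.key.class", c); }
    public void setOutputValueClass(Class<?> c) { classes.put("output.value.class", c); }
    public Class<?> getOutputKeyClass()   { return classes.get("output.key.class"); }
    public Class<?> getOutputValueClass() { return classes.get("output.value.class"); }

    // Proposed additions: classes used for the intermediate (map) output.
    public void setMapOutputKeyClass(Class<?> c)   { classes.put("map.output.key.class", c); }
    public void setMapOutputValueClass(Class<?> c) { classes.put("map.output.value.class", c); }

    // Assumption: if no map-output class was set, fall back to the final
    // output class, so jobs that use one pair of classes keep working.
    public Class<?> getMapOutputKeyClass() {
        Class<?> c = classes.get("map.output.key.class");
        return c != null ? c : getOutputKeyClass();
    }

    public Class<?> getMapOutputValueClass() {
        Class<?> c = classes.get("map.output.value.class");
        return c != null ? c : getOutputValueClass();
    }

    public static void main(String[] args) {
        SketchJobConf job = new SketchJobConf();
        job.setOutputKeyClass(String.class);   // final (reduce) output key
        job.setOutputValueClass(Long.class);   // final (reduce) output value

        // Nothing set for the map side yet: falls back to the final classes.
        System.out.println(job.getMapOutputKeyClass().getName());

        // Intermediate output now uses different classes than the final output.
        job.setMapOutputKeyClass(Integer.class);
        job.setMapOutputValueClass(Double.class);
        System.out.println(job.getMapOutputKeyClass().getName());
        System.out.println(job.getOutputKeyClass().getName());
    }
}
```

With attributes like these, an output format that reads its key/value classes from the job object could use the final output classes on the reduce side, while the map side reads the map-output classes instead.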