From: ey-chih chow
Subject: RE: how to specify MultipleOutputs, MultipleInputs in using Avro mapred API
Date: Wed, 18 Aug 2010 06:15:12 -0700

Hi,

Let me rephrase my question in case anybody is interested in answering it. In the new version, Avro 1.4.0, the class hierarchy of AvroMapper and AvroReducer has changed: they now subclass Configured, rather than subclassing MapReduceBase and implementing the Mapper and Reducer interfaces respectively. The configuration of Avro mapred jobs is also different from that of other mapred jobs. Furthermore, text log files have to be imported into Avro format before Avro mapred jobs can process them.

Suppose I have a mapred job that requires a reduce-side join of two inputs, one from HBase and the other from an imported log file in Avro format. How can I configure the two mappers to process the inputs from HBase and the log file respectively? Also, how can I configure an Avro reducer to generate multiple outputs? For multiple inputs and outputs, I found some example programs in Tom White's Hadoop book, but I don't know what changes I should make for the Avro case.

Ey-Chih

From: eychih@hotmail.com
To: user@avro.apache.org
Subject: how to specify MultipleOutputs, MultipleInputs in using Avro mapred API
Date: Mon, 16 Aug 2010 18:22:24 -0700

Hi,

I have a Map/Reduce job that requires multiple inputs and outputs. One of the inputs will be processed by a mapper and a reducer that are subclasses of AvroMapper and AvroReducer respectively, and the reducer has multiple outputs. I would appreciate it if anybody could let me know how to configure the job to do this.

Ey-Chih

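
To make the data flow in the question concrete, here is a minimal plain-Java sketch of a reduce-side join whose reducer writes to more than one output. It is only an illustration of the pattern, not an answer to the configuration question: it uses no Hadoop, HBase, or Avro classes, and every name in it (ReduceSideJoinSketch, Tagged, the "hbase" and "log" tags, the "joined" and "unmatched" outputs) is hypothetical. In a real job the per-input mappers and the named outputs would be wired up through the framework's job configuration, which this sketch does not attempt to show.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration of a reduce-side join; no Hadoop, HBase, or Avro classes are used. */
class ReduceSideJoinSketch {

    /** A mapper output value tagged with the input it came from. */
    static final class Tagged {
        final String source; // hypothetical tags: "hbase" or "log"
        final String value;

        Tagged(String source, String value) {
            this.source = source;
            this.value = value;
        }
    }

    /**
     * Stand-in for one mapper per input: tags every {key, value} record with its
     * source and groups it by join key, as the shuffle phase would.
     */
    static void map(String source, List<String[]> records, Map<String, List<Tagged>> shuffle) {
        for (String[] record : records) {
            shuffle.computeIfAbsent(record[0], k -> new ArrayList<>())
                   .add(new Tagged(source, record[1]));
        }
    }

    /** Stand-in for the reducer: joins the two sides per key and routes results to named outputs. */
    static Map<String, List<String>> reduce(Map<String, List<Tagged>> shuffle) {
        Map<String, List<String>> outputs = new LinkedHashMap<>();
        outputs.put("joined", new ArrayList<>());
        outputs.put("unmatched", new ArrayList<>());

        for (Map.Entry<String, List<Tagged>> group : shuffle.entrySet()) {
            String key = group.getKey();
            List<String> hbaseSide = new ArrayList<>();
            List<String> logSide = new ArrayList<>();
            // Separate the values for this key by the tag the mappers attached.
            for (Tagged t : group.getValue()) {
                (t.source.equals("hbase") ? hbaseSide : logSide).add(t.value);
            }
            if (hbaseSide.isEmpty() || logSide.isEmpty()) {
                // Only one input had this key: send its records to the second output.
                for (Tagged t : group.getValue()) {
                    outputs.get("unmatched").add(key + "\t" + t.value);
                }
            } else {
                // Both inputs had this key: emit every pairing to the first output.
                for (String h : hbaseSide) {
                    for (String l : logSide) {
                        outputs.get("joined").add(key + "\t" + h + "\t" + l);
                    }
                }
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        Map<String, List<Tagged>> shuffle = new TreeMap<>();
        map("hbase", Arrays.asList(new String[] {"u1", "alice"}, new String[] {"u2", "bob"}), shuffle);
        map("log", Arrays.asList(new String[] {"u1", "login"}, new String[] {"u3", "logout"}), shuffle);
        reduce(shuffle).forEach((name, lines) -> System.out.println(name + ": " + lines));
    }
}
```

The point of the tag is that the reducer sees values from both inputs under the same key and needs some way to tell them apart. Whatever record type the two mappers emit in the Avro case would have to carry equivalent information, for example as a field in a shared output schema; that is an assumption about one possible design, not something the Avro API prescribes.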