From: Markus Weimer
To: "user@avro.apache.org"
Date: Fri, 20 May 2011 10:43:40 -0700
Subject: Re: Multiple input schemas in MapReduce?

Hi,

just an update: The solution below does, indeed, work as expected.

Thanks!

Markus

On May 11, 2011, at 3:00 PM, Jacob R Rideout wrote:

> We do take the union schema approach, but create the unions
> programmatically in Java:
>
> Something like:
>
> ArrayList<Schema> schemas = new ArrayList<Schema>();
> schemas.add(schema1);
> schemas.add(schema2);
> Schema unionSchema = Schema.createUnion(schemas);
> AvroJob.setInputSchema(job, unionSchema);
>
>
> On Wed, May 11, 2011 at 12:44 PM, Markus Weimer wrote:
>> Hi,
>>
>> I'd like to write a mapreduce job that uses avro throughout, but the map phase would need to read files with two different schemas, similar to what the MultipleInputFormat does in stock hadoop. Is this a supported use case?
>>
>> A work-around would be to create a union schema that has both fields as optional and to convert all data into it, but that seems clumsy.
>>
>> Has anyone done this before?
>>
>> Thanks for any suggestion you can give,
>>
>> Markus