avro-user mailing list archives

From Johannes Schulte <johannes.schu...@gmail.com>
Subject Re: Avro with MultipleInputs in the new API
Date Sat, 25 Feb 2017 22:34:51 GMT
If you know all the schemas in the driver, you can use a union schema of all
the types you are going to read. Something like

AvroJob.setInputKeySchema(job, Schema.createUnion(ImmutableList.of(schema1,
schema2)))

will work. In the mapper you just pass the data through, and in the reducer
you can branch on the record via instanceof or via its schema name; which of
the two fits depends on the data model you are using (reflect, specific,
generic).
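A small sketch of the union-schema part and the reducer-side dispatch, using
the generic data model. The Click/Impression record schemas are made up for
illustration; in a real job the union would be passed to
AvroJob.setInputKeySchema(job, union) in the driver.

```java
import java.util.Arrays;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;

public class UnionSchemaExample {
    public static void main(String[] args) {
        // Two hypothetical record schemas standing in for the job's input types.
        Schema click = SchemaBuilder.record("Click").fields()
                .requiredString("url").endRecord();
        Schema impression = SchemaBuilder.record("Impression").fields()
                .requiredString("adId").endRecord();

        // Union of every type the job reads; this is the schema you would
        // hand to AvroJob.setInputKeySchema(job, union) in the driver.
        Schema union = Schema.createUnion(Arrays.asList(click, impression));
        System.out.println(union.getTypes().size()); // 2

        // Reducer-side dispatch: with generic records there is no per-type
        // class to use with instanceof, so branch on the schema name instead.
        GenericRecord rec = new GenericRecordBuilder(click)
                .set("url", "http://example.com").build();
        switch (rec.getSchema().getName()) {
            case "Click":
                System.out.println("click: " + rec.get("url"));
                break;
            case "Impression":
                System.out.println("impression: " + rec.get("adId"));
                break;
        }
    }
}
```

With the specific or reflect data models you would get real generated or
POJO classes back instead, and instanceof is the natural check.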


On Thu, Feb 23, 2017 at 5:36 PM, Zephod <zephod@tlen.pl> wrote:

> I'm trying to get a reducer that accepts input from multiple mappers which
> get data from different locations. This can be easily done in plain Hadoop
> with MultipleInputs, and also in the Avro old API (mapred) with
> AvroMultipleInputs, but there is nothing equivalent in the new API
> (mapreduce). How can I achieve the same functionality?
>
>
>
> --
> View this message in context: http://apache-avro.679487.n3.nabble.com/Avro-with-MultipleInputs-in-the-new-API-tp4036886.html
