avro-user mailing list archives

From Johannes Schulte <johannes.schu...@gmail.com>
Subject Re: Avro with MultipleInputs in the new API
Date Tue, 28 Feb 2017 12:57:32 GMT
I suggest you use Mapper<AvroKey<Object>,NullWritable> or even
Mapper<AvroKey<SpecificRecord>,NullWritable> as the mapper's input key and
value types.

You can then decide on the sorting key by checking the datum() of the Avro
key with instanceof or, in the latter case, with
SpecificRecord#getSchema()#getName().
The schema for the map output (AvroJob#setMapOutputKeySchema()) will be the
union again, and the key depends on your use case. By default, the shuffle
should sort the (key,[values]) groups by the key you provide.
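
To make the instanceof check concrete, here is a minimal sketch of the dispatch logic on its own, without the Hadoop and Avro dependencies. UserRecord, OrderRecord, SortKeyChooser and their fields are hypothetical stand-ins I made up for illustration; in a real job the two record classes would be Avro-generated SpecificRecord classes and the datum would come from AvroKey#datum() inside map().

```java
// Hypothetical stand-ins for two Avro-generated record classes. In a real job
// these would be generated from your schemas and implement SpecificRecord.
final class UserRecord {
    final long userId;
    UserRecord(long userId) { this.userId = userId; }
}

final class OrderRecord {
    final long buyerId;
    OrderRecord(long buyerId) { this.buyerId = buyerId; }
}

final class SortKeyChooser {
    // Picks the value to sort on, depending on which branch of the union the
    // datum belongs to. In a mapper, `datum` would be the result of
    // AvroKey#datum(); here it is a plain Object so the sketch runs standalone.
    static long sortKeyFor(Object datum) {
        if (datum instanceof UserRecord) {
            return ((UserRecord) datum).userId;
        } else if (datum instanceof OrderRecord) {
            return ((OrderRecord) datum).buyerId;
        }
        String type = (datum == null) ? "null" : datum.getClass().getName();
        throw new IllegalArgumentException("Unexpected datum type: " + type);
    }
}
```

The same branching should work in the reducer when iterating over the values, assuming the values are also typed as the union. Throwing on an unknown type is just one choice; you may prefer to count and skip such records.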



On Mon, Feb 27, 2017 at 6:48 AM, Zephod <zephod@tlen.pl> wrote:

> Dear Johannes,
> Thank you for your suggestion :) I know the types in advance (the data model
> is specific). Where could I read how the union type works? Are there any
> examples, e.g.:
> - what kind of types should I use for the type parameters of the mapper/reducer?
> - how will the values be sorted in the shuffle phase?
> - how do I check the type of the values I'm iterating over in the reducer?
>
> --
> View this message in context:
> http://apache-avro.679487.n3.nabble.com/Avro-with-MultipleInputs-in-the-new-API-tp4036886p4036890.html
> Sent from the Avro - Users mailing list archive at Nabble.com.
>
