avro-user mailing list archives

From James Campbell <ja...@breachintelligence.com>
Subject Reading from disjoint schemas in map
Date Tue, 13 May 2014 20:03:55 GMT
I'm trying to read data into a MapReduce job, where the data may have been written with one of
a few different schemas, none of which are evolutions of one another (though they are related).

I have seen several people suggest using a union schema: during job setup, one would
set the input key schema to be the union of the candidate schemas:
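// Collect the candidate schemas and combine them into a single union schema.
ArrayList<Schema> schemas = new ArrayList<Schema>();
schemas.add(schema1);
...
Schema unionSchema = Schema.createUnion(schemas);
// Declare the union as the schema of the job's input keys.
AvroJob.setInputKeySchema(job, unionSchema);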

However, I don't know how to then extract the correct type inside my mapper, a step that is
apparently trivial (sorry, I'm new to Avro).

I'd guess that the map function's signature becomes map(AvroKey<GenericRecord> key, NullWritable
value, ...), but how can I then get Avro to read the correctly typed data from the GenericRecord?
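
To make the question concrete, here is a minimal sketch of one possible approach. It assumes that each datum read through a union schema is a GenericRecord that carries the schema of its own branch, so the mapper could dispatch on the record's full name. The record types Login and Alert, their fields, and the describe helper are hypothetical stand-ins for the real schemas. The sketch uses only the core Avro classes, not the MapReduce API.

```java
import java.util.Arrays;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;

public class UnionDispatch {
    // Hypothetical stand-ins for schema1, schema2: two unrelated record types.
    static final Schema LOGIN = SchemaBuilder.record("Login").namespace("example")
            .fields().requiredString("user").endRecord();
    static final Schema ALERT = SchemaBuilder.record("Alert").namespace("example")
            .fields().requiredInt("severity").endRecord();

    // The union that would be passed to AvroJob.setInputKeySchema.
    static final Schema UNION = Schema.createUnion(Arrays.asList(LOGIN, ALERT));

    // What the body of map() might do with key.datum(): inspect the record's
    // own schema and branch on its full name.
    static String describe(GenericRecord datum) {
        String name = datum.getSchema().getFullName();
        if (name.equals(LOGIN.getFullName())) {
            return "login by " + datum.get("user");
        } else if (name.equals(ALERT.getFullName())) {
            return "alert severity " + datum.get("severity");
        }
        throw new IllegalArgumentException("unexpected record type: " + name);
    }

    public static void main(String[] args) {
        GenericRecord login = new GenericRecordBuilder(LOGIN).set("user", "james").build();
        GenericRecord alert = new GenericRecordBuilder(ALERT).set("severity", 3).build();

        System.out.println(describe(login));
        System.out.println(describe(alert));

        // GenericData can also report which branch of the union a datum matches.
        System.out.println(GenericData.get().resolveUnion(UNION, alert));
    }
}
```

Branching on getFullName() rather than on field presence keeps the dispatch unambiguous even when the related schemas share field names.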

Thanks!

James
