avro-user mailing list archives

From Vyacheslav Zholudev <vyacheslav.zholu...@gmail.com>
Subject Re: Union of Records Issue
Date Tue, 10 Jan 2012 21:45:52 GMT
Could this be related to https://issues.apache.org/jira/browse/AVRO-966?
Does the patch there help?

Vyacheslav

On Jan 10, 2012, at 10:21 PM, Uhlig, Hans wrote:

> I am creating a dynamic union of records as seen below but keep receiving an exception: org.apache.avro.UnresolvedUnionException: Not in union
> Any reason why it deems the same schemas that created the union invalid for collection? Avro throws this with each record it tries to collect. An example of this working would be appreciated.
>
> Also, is there such a thing as a null record? The records I am assembling fit into a Set rather than a Map, but I could find no elegant way to express that other than defining a record with a single field of type null.
> 
> Inside ToolRunner:
> Schema.Parser p = new Schema.Parser(); 
>         
> ArrayList<Schema> keySchemas = new ArrayList<Schema>(); 
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s1.avsc"))); 
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s2.avsc"))); 
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s3.avsc"))); 
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s4.avsc"))); 
> keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s5.avsc"))); 
>         
> Schema keySchema = Schema.createUnion(keySchemas); 
> Schema valSchema = p.parse(AvroConverter.class.getResourceAsStream("null.avsc")); 
> AvroJob.setMapOutputSchema(conf, Pair.getPairSchema(keySchema, valSchema)); 
> 
> Inside Mapper Setup: 
> private static HashMap<String, Schema> keySchemas = new HashMap<String, Schema>();
> private static Schema valSchema;
> Schema.Parser p = new Schema.Parser(); 
> keySchemas.put("s1", p.parse(Map.class.getResourceAsStream("s1.avsc"))); 
> keySchemas.put("s2", p.parse(Map.class.getResourceAsStream("s2.avsc"))); 
> keySchemas.put("s3", p.parse(Map.class.getResourceAsStream("s3.avsc"))); 
> keySchemas.put("s4", p.parse(Map.class.getResourceAsStream("s4.avsc"))); 
> keySchemas.put("s5", p.parse(Map.class.getResourceAsStream("s5.avsc"))); 
> valSchema = p.parse(Map.class.getResourceAsStream("null.avsc")); 
> 
> Inside Map function: 
> GenericData.Record r;
> if ("s1".equals(in.type)) {
>     r = new GenericData.Record(keySchemas.get("s1"));
> } else if ("s2".equals(in.type)) {
>     r = new GenericData.Record(keySchemas.get("s2"));
> }
> oc.collect(new AvroKey<GenericRecord>(r), new AvroValue<GenericRecord>(new GenericData.Record(valSchema)));
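
A side note on the map function, separate from the union exception: when branching on a type tag such as in.type, keep in mind that Java's == on strings compares object references, not contents, so a tag read from input at runtime may not match a literal even when the characters are identical. A minimal sketch of the difference follows; the class and method names are only illustrative and are not taken from your job.

```java
import java.util.Objects;

public class TypeTagComparison {

    // Reference comparison: true only if both variables point at the same String object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Content comparison: true if both strings hold the same characters (null-safe).
    static boolean sameContent(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        // new String(...) stands in for a tag built at runtime, e.g. parsed from an input record.
        String tag = new String("s1");

        System.out.println(sameReference(tag, "s1")); // prints "false"
        System.out.println(sameContent(tag, "s1"));   // prints "true"
    }
}
```

If a comparison like this silently fails, no branch assigns the record, so it is worth ruling out before looking further at the union itself.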

Best,
Vyacheslav



