avro-user mailing list archives

From Alexandre Normand <alexandre.norm...@gmail.com>
Subject Schema resolution failure when the writer's schema is a primitive type and the reader's schema is a union
Date Fri, 17 Aug 2012 00:59:47 GMT
Hey, 
I've run into a case where I have a field of type int but need to allow null values.
To do this, I now have a new schema that defines that field as a union of null
and int:
type: [ "null", "int" ]
According to my interpretation of the spec, Avro should resolve an int written with the old
schema against this union. For reference, the relevant rule reads
(from http://avro.apache.org/docs/current/spec.html#Schema+Resolution):
> if reader's is a union, but writer's is not
> The first schema in the reader's union that matches the writer's schema is recursively
> resolved against it. If none match, an error is signaled.
> 
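
To make my reading of that rule concrete, here is a minimal sketch in plain Java. It is not Avro code: the class UnionResolutionSketch and its methods are hypothetical names, it models schemas as primitive type names only, and it assumes that "matches" means either the same type or one of the numeric promotions the spec lists. An actual implementation may pick branches differently.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the Avro library.
class UnionResolutionSketch {

    // True if a value written as writerType could be read as readerType:
    // identical type names, or a numeric promotion (other cases omitted).
    static boolean matches(String writerType, String readerType) {
        if (writerType.equals(readerType)) {
            return true;
        }
        switch (writerType) {
            case "int":
                return Arrays.asList("long", "float", "double").contains(readerType);
            case "long":
                return Arrays.asList("float", "double").contains(readerType);
            case "float":
                return readerType.equals("double");
            default:
                return false;
        }
    }

    // Follows the quoted wording literally: return the index of the first
    // branch of the reader's union that matches, or signal an error.
    static int firstMatchingBranch(String writerType, List<String> readerUnion) {
        for (int i = 0; i < readerUnion.size(); i++) {
            if (matches(writerType, readerUnion.get(i))) {
                return i;
            }
        }
        throw new IllegalArgumentException(
                "No branch of " + readerUnion + " matches writer type " + writerType);
    }

    public static void main(String[] args) {
        List<String> readerUnion = Arrays.asList("null", "int");
        int branch = firstMatchingBranch("int", readerUnion);
        // Prints: writer int resolves to branch 1 (int)
        System.out.println("writer int resolves to branch " + branch
                + " (" + readerUnion.get(branch) + ")");
    }
}
```

Under that reading, a writer's int should land on the "int" branch of [ "null", "int" ] rather than produce an error.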


However, when trying to do this, I get this:

org.apache.avro.AvroTypeException: Attempt to process a int when a union was expected.

I've written a simple test that illustrates what I'm saying:

    @Test
    public void testReadingUnionFromValueWrittenAsPrimitive() throws Exception {
        // Writer's schema: "test" is a plain int.
        Schema writerSchema = new Schema.Parser().parse("{\n" +
                "    \"type\":\"record\",\n" +
                "    \"name\":\"NeighborComparisons\",\n" +
                "    \"fields\": [\n" +
                "      {\"name\": \"test\",\n" +
                "      \"type\": \"int\" }]} ");
        // Reader's schema: "test" is a union of null and int, defaulting to null.
        Schema readersSchema = new Schema.Parser().parse(" {\n" +
                "    \"type\":\"record\",\n" +
                "    \"name\":\"NeighborComparisons\",\n" +
                "    \"fields\": [ {\n" +
                "      \"name\": \"test\",\n" +
                "      \"type\": [\"null\", \"int\"],\n" +
                "      \"default\": null } ]  }");

        GenericData.Record record = new GenericData.Record(writerSchema);
        record.put("test", Integer.valueOf(10));

        // Write the record as JSON using the writer's schema.
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        JsonEncoder jsonEncoder = EncoderFactory.get().jsonEncoder(writerSchema, output);
        GenericDatumWriter<GenericData.Record> writer = new GenericDatumWriter<GenericData.Record>(writerSchema);
        writer.write(record, jsonEncoder);
        jsonEncoder.flush();
        output.flush();

        System.out.println(output.toString());

        // The decoder is built with the reader's schema; the datum reader is
        // given both schemas so that it can resolve between them.
        JsonDecoder jsonDecoder = DecoderFactory.get().jsonDecoder(readersSchema, output.toString());
        GenericDatumReader<GenericData.Record> reader =
                new GenericDatumReader<GenericData.Record>(writerSchema, readersSchema);
        GenericData.Record read = reader.read(null, jsonDecoder);

        assertEquals(10, read.get("test"));
    }

Am I misunderstanding how Avro should handle this case of schema resolution, or is the problem
in the implementation?

Cheers!


-- 
Alex

