avro-user mailing list archives

From Scott Carey <scottca...@apache.org>
Subject Re: Schema resolution failure when the writer's schema is a primitive type and the reader's schema is a union
Date Fri, 31 Aug 2012 06:01:05 GMT
My understanding of the spec is that promotion to a union should work as
long as the writer's type is a member of the union.

What happens if the order of the union in the reader's schema is reversed?
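For instance, a reader schema with the branch order flipped (a hypothetical variant of the field from the original message; note that with "int" first, a null default would no longer be valid) would look like:

```json
{ "name": "test",
  "type": ["int", "null"] }
```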

This may be a bug.
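One way to narrow it down is to try the same int-to-union resolution over the binary encoding instead of the JSON encoding. Below is a minimal sketch (not from the original thread; it assumes the GenericDatumReader/BinaryDecoder APIs available in Avro's Java library) that should yield 10 if resolution behaves as the spec describes:

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class IntToUnionResolution {
    public static void main(String[] args) throws Exception {
        // Writer schema: a bare int. Reader schema: ["null", "int"].
        Schema writerSchema = Schema.create(Schema.Type.INT);
        Schema readerSchema = Schema.createUnion(Arrays.asList(
                Schema.create(Schema.Type.NULL),
                Schema.create(Schema.Type.INT)));

        // Encode the value 10 against the writer's (plain int) schema.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<Integer>(writerSchema).write(10, encoder);
        encoder.flush();

        // Decode with a resolving reader: writer int -> reader union.
        BinaryDecoder decoder =
                DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
        GenericDatumReader<Integer> reader =
                new GenericDatumReader<Integer>(writerSchema, readerSchema);
        Integer value = reader.read(null, decoder);
        System.out.println(value);
    }
}
```

If this succeeds while the JSON round trip fails, the problem would be specific to the JSON codec rather than to schema resolution itself.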

-Scott

On 8/16/12 5:59 PM, "Alexandre Normand" <alexandre.normand@gmail.com>
wrote:


>Hey, 
>I've been running into this case where I have a field of type int but I
>need to allow for null values. To do this, I now have a new schema that
>defines that field as a union of
>null and int such as:
>type: [ "null", "int" ]
>According to my interpretation of the spec, Avro should resolve this
>correctly. For reference, this reads like this (from
>http://avro.apache.org/docs/current/spec.html#Schema+Resolution):
>
>if reader's is a union, but writer's is not
>The first schema in the reader's union that matches the writer's schema
>is recursively resolved against it. If none match, an error is signaled.
>
>
>However, when trying to do this, I get this:
>org.apache.avro.AvroTypeException: Attempt to process a int when a union
>was expected.
>
>I've written a simple test that illustrates what I'm saying:
>    @Test
>    public void testReadingUnionFromValueWrittenAsPrimitive() throws
>Exception {
>        Schema writerSchema = new Schema.Parser().parse("{\n" +
>                "    \"type\":\"record\",\n" +
>                "    \"name\":\"NeighborComparisons\",\n" +
>                "    \"fields\": [\n" +
>                "      {\"name\": \"test\",\n" +
>                "      \"type\": \"int\" }]} ");
>        Schema readersSchema = new Schema.Parser().parse(" {\n" +
>                "    \"type\":\"record\",\n" +
>                "    \"name\":\"NeighborComparisons\",\n" +
>                "    \"fields\": [ {\n" +
>                "      \"name\": \"test\",\n" +
>                "      \"type\": [\"null\", \"int\"],\n" +
>                "      \"default\": null } ]  }");
>        GenericData.Record record = new GenericData.Record(writerSchema);
>        record.put("test", Integer.valueOf(10));
>
>        ByteArrayOutputStream output = new ByteArrayOutputStream();
>        JsonEncoder jsonEncoder =
>EncoderFactory.get().jsonEncoder(writerSchema, output);
>        GenericDatumWriter<GenericData.Record> writer = new
>GenericDatumWriter<GenericData.Record>(writerSchema);
>        writer.write(record, jsonEncoder);
>        jsonEncoder.flush();
>        output.flush();
>
>        System.out.println(output.toString());
>
>        JsonDecoder jsonDecoder =
>DecoderFactory.get().jsonDecoder(readersSchema, output.toString());
>        GenericDatumReader<GenericData.Record> reader =
>                new GenericDatumReader<GenericData.Record>(writerSchema,
>readersSchema);
>        GenericData.Record read = reader.read(null, jsonDecoder);
>        
>        assertEquals(10, read.get("test"));
>    }
>
>Am I misunderstanding how Avro should handle such a case of schema
>resolution, or is the problem in the implementation?
>
>Cheers!
>
>-- 
>Alex


