avro-user mailing list archives

From Scott Carey <scottca...@apache.org>
Subject Re: Picking up default value for a union?
Date Thu, 11 Apr 2013 04:21:05 GMT
Minor addition, the default value should be

    "default": null

not

    "default": "null"

-- the latter is a string, the former is null.
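
To make the difference concrete, here is a minimal sketch in plain Java of the behaviour Martin describes below: fields the writer never wrote are filled in from the reader's defaults. It is a toy model, not Avro's implementation. The class DefaultResolutionSketch, the ReaderField type and the resolve method are hypothetical names invented for this illustration, and a Java null stands in for Avro's null.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Avro classes.
public class DefaultResolutionSketch {

    // Hypothetical stand-in for one field of a reader's schema.
    static final class ReaderField {
        final String name;
        final boolean hasDefault;
        final Object defaultValue; // Java null models Avro null; "null" is a 4-character string

        ReaderField(String name, boolean hasDefault, Object defaultValue) {
            this.name = name;
            this.hasDefault = hasDefault;
            this.defaultValue = defaultValue;
        }
    }

    // Builds the record the reader sees: written values are copied, and
    // reader fields the writer never wrote fall back to their default.
    static Map<String, Object> resolve(Map<String, Object> written,
                                       List<ReaderField> readerFields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (ReaderField field : readerFields) {
            if (written.containsKey(field.name)) {
                result.put(field.name, written.get(field.name));
            } else if (field.hasDefault) {
                result.put(field.name, field.defaultValue);
            } else {
                throw new IllegalStateException(
                    "No value and no default for field " + field.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> written = new LinkedHashMap<>(); // the datum {} has no fields

        List<ReaderField> nullDefault = new ArrayList<>();
        nullDefault.add(new ReaderField("a", true, null));     // models "default": null
        List<ReaderField> stringDefault = new ArrayList<>();
        stringDefault.add(new ReaderField("a", true, "null")); // models "default": "null"

        System.out.println(resolve(written, nullDefault).get("a") == null);             // prints true
        System.out.println(resolve(written, stringDefault).get("a") instanceof String); // prints true
    }
}
```

In this model the first reader ends up with a genuinely absent value for "a", while the second ends up with the four-character string "null", which is the distinction above.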


On 4/9/13 8:42 PM, "Martin Kleppmann" <martin@rapportive.com> wrote:

>With Avro, it is generally assumed that your reader is working with
>the exact same schema as the data was written with. If you want to
>change your schema, e.g. add a field to a record, you still need the
>exact same schema as was used for writing (the "writer's schema"), but
>you can also give the decoder a second schema (the "reader's schema"),
>and Avro will map data from the writer's schema into the reader's
>schema for you ("schema evolution").
>
>This requirement of having the exact same schema as the writer makes
>more sense with Avro's binary encoding, because it allows Avro to omit
>the field names, which makes the encoding very compact. The
>requirement makes less sense if you're using the JSON encoding, where
>field names are inevitably part of the JSON. I think this behaviour is
>expected, but I agree that it's a bit surprising, so perhaps it's
>worth discussing whether we should change it.
>
>To answer your question, your input data {} looks like it was written
>with a writer schema of {"name":"hey", "type":"record", "fields":[]}
>so try using that as your writer schema. Then if you specify
>{"name":"hey", "type":"record",
>"fields":[{"name":"a","type":["null","string"],"default":"null"}]} as
>your reader schema, you should find that the resolving decoder fills
>in the field "a" with the default null.
>
>On 9 April 2013 02:44, Jonathan Coveney <jcoveney@gmail.com> wrote:
>> Stepping through the code, it looks like the code only uses defaults for
>> writing, not for reading, i.e. at read time it assumes that the defaults
>> are already filled in. It seems like if the reader evolved the schema to
>> add new fields, it would be desirable for the defaults to get filled in if
>> present? But stepping through, on reading the defaults are completely
>> ignored.
>> 2013/4/9 Jonathan Coveney <jcoveney@gmail.com>
>>> Please note: {"name":"hey", "type":"record",
>>> "fields":[{"name":"a","type":["null","string"],"default":"null"}]} also
>>> doesn't work
>>> 2013/4/9 Jonathan Coveney <jcoveney@gmail.com>
>>>> I have the following schema: {"name":"hey", "type":"record",
>>>> "fields":[{"name":"a","type":["null","string"],"default":null}]}
>>>> I am trying to deserialize the following against this schema using
>>>> the GenericDatumReader: {}
>>>> I get the following error:
>>>> Caused by: org.apache.avro.AvroTypeException: Expected start-union.
>>>>     at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
>>>>     at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
>>>>     at
>>>>     at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>>>>     at
>>>>     at
>>>>     at
>>>>     at
>>>>     at
>>>>     at com.spotify.hadoop.JsonTester.main(JsonTester.java:40)
>>>> I'm not seeing any immediate issues online around this...is this
>>>> expected? I'm reading it in as such:
>>>>     Schema avroSchema = new Schema.Parser().parse(schemaLine);
>>>>     GenericDatumReader<Object> reader =
>>>>         new GenericDatumReader<Object>(avroSchema);
>>>>     Object datum = reader.read(null,
>>>>         DecoderFactory.get().jsonDecoder(avroSchema, dataLine));
>>>> I'm going to see what's up and why it isn't picking up the default,
>>>> imagined you guys might know what's up?
>>>> Thanks,
>>>> Jon
