avro-user mailing list archives

From Chris Laws <clawsi...@gmail.com>
Subject Re: Schema evolution and projection
Date Fri, 01 Mar 2013 22:26:15 GMT
Martin,

Yes, I had declared the reader schema used for my evolution test to have a
default value and to be a union with null. Apologies for not including that
information in my earlier post.

It makes sense for my applications to receive a default value rather than a
null so in my extension to the example I have made the new field a union
with null but set a default of an integer value.

I thought that I should be able to use the same example code Douglas
Creager provided that demonstrates schema projection - because, if I
understand correctly, it is performing the necessary resolution whether for
projection or evolution.

So if I stick with the resolved-writer.c example and declare a new schema
that has an extra field:

#define READER_SCHEMA_C \
    "{" \
    "  \"type\": \"record\"," \
    "  \"name\": \"test\"," \
    "  \"fields\": [" \
    "    { \"name\": \"a\", \"type\": \"int\" }," \
    "    { \"name\": \"b\", \"type\": \"int\" }," \
    "    { \"name\": \"c\", \"type\": [\"null\", \"int\"], \"default\": 42 }" \
    "  ]" \
    "}"

and then use it in the resolved-writer.c code:

    printf("Reading evolved data with schema resolution, showing new field \"c\"...\n");
    read_with_schema_resolution(FILENAME, READER_SCHEMA_C, "c");

I get:

Reading evolved data with schema resolution, showing new field "c"...
Error: Reader field c doesn't appear in writer

I was under the impression that I should have received the default value of
42 for field 'c' for each item in the data file.
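
The behaviour I was expecting can be sketched in plain C. This is only a
simplified model of the resolution rule for a single field, not avro-c code:
field_t and resolve_field are hypothetical names made up for illustration,
and the sketch handles int values only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified model of a reader field; not an avro-c type. */
typedef struct {
    const char *name;
    int has_default;    /* 1 if the reader schema declares a default */
    int default_value;  /* only meaningful when has_default is 1 */
} field_t;

/* Resolve one reader field against the writer's field names and values.
 * Returns 0 and stores the value in *out when the writer has the field,
 * or when it is missing but the reader declares a default.
 * Returns -1 when the field is missing and there is no default. */
int resolve_field(const field_t *reader_field,
                  const char *const *writer_names, const int *writer_values,
                  size_t writer_count, int *out)
{
    for (size_t i = 0; i < writer_count; i++) {
        if (strcmp(writer_names[i], reader_field->name) == 0) {
            *out = writer_values[i];  /* field written: use stored value */
            return 0;
        }
    }
    if (reader_field->has_default) {
        *out = reader_field->default_value;  /* field absent: use default */
        return 0;
    }
    fprintf(stderr, "Reader field %s doesn't appear in writer\n",
            reader_field->name);
    return -1;
}
```

With writer fields "a" and "b" only, resolving a reader field "c" that
declares a default of 42 should store 42 in *out, and the same call without
a default should fail.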

BTW, I had come across your blog post in my Avro research. I found it very
useful.

Regards,
Chris
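
A note on the schema quoted above: the Avro specification says that the
default for a union field has to match the first branch of the union, so a
default of 42 calls for "int" to be listed before "null". A variant along
those lines is sketched below. READER_SCHEMA_C_INT_FIRST is a name invented
here, and nothing in this thread establishes that the ordering is what
triggers the error, so this change alone may not make it go away.

```c
/* Variant of READER_SCHEMA_C with "int" listed first, so that the
 * default 42 matches the first branch of the union. */
#define READER_SCHEMA_C_INT_FIRST \
    "{" \
    "  \"type\": \"record\"," \
    "  \"name\": \"test\"," \
    "  \"fields\": [" \
    "    { \"name\": \"a\", \"type\": \"int\" }," \
    "    { \"name\": \"b\", \"type\": \"int\" }," \
    "    { \"name\": \"c\", \"type\": [\"int\", \"null\"], \"default\": 42 }" \
    "  ]" \
    "}"
```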


On Sat, Mar 2, 2013 at 12:23 AM, Martin Kleppmann <martin@rapportive.com> wrote:

> Chris,
>
> If you want a field in your reader schema that is not present in your
> writer schema, you have to set a default value — otherwise the reader
> has no way of knowing how to fill in that Field_3! If no particular
> default value makes sense, a standard technique is to make the field
> type a union with null, and to make null the default value
> (effectively making the field optional).
>
> For example:
>
> const char  EXTENDED_SCHEMA[] =
> "{\"type\":\"record\",\
>   \"name\":\"SimpleScehma\",\
>   \"fields\":[\
>      {\"name\": \"Field_1\", \"type\": \"int\"},\
>      {\"name\": \"Field_2\", \"type\": \"int\"},\
>      {\"name\": \"Field_3\", \"type\": [\"null\", \"int\"], \"default\": null}]}";
>
> To build your intuitive understanding of how schema evolution works,
> you might find this post useful:
>
> http://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html
>
> Best,
> Martin
>
> On 1 March 2013 01:50, Chris Laws <clawsicus@gmail.com> wrote:
> > Doug,
> >
> > I have updated my test code in line with your excellent example and I now
> > have the projection aspect working well.
> >
> > Now... I'm stuck on a schema evolution test. Basically if I use your
> > example as the foundation and I create a new schema based on the
> > WRITER_SCHEMA in which I add a new field to the end (to model schema
> > evolution) I receive an error when trying to create the writer_iface.
> >
> > writer_iface = avro_resolved_writer_new(writer_schema, reader_schema);
> >
> > "Reader field Field_3 doesn't appear in writer"
> >
> > Any chance you could extend your example to show the ability of Avro to
> > read data from a data file using an evolved schema (say in a simple
> > situation where a new field is added to the schema)?
> >
> > Regards,
> > Chris
> >
> >
> >
> > On Fri, Mar 1, 2013 at 9:08 AM, Douglas Creager <douglas@creagertino.net> wrote:
> >>
> >> > Thanks for the informative reply. I look forward to the example code,
> >> > that is exactly what I'm after.
> >> >
> >> > I'm really struggling with my schema evolution testing. I thought I'd
> >> > post a question about schema projection because it seemed simpler but I
> >> > guess it also rests on creating a resolver. I have not found a clear and
> >> > simple example of how to do it using avro-c. I've trawled the test code
> >> > for examples but as I mention I can't find a clear and simple example.
> >>
> >> Alrighty, here you go:
> >>
> >> http://dcreager.github.com/avro-examples/resolved-writer.html
> >>
> >> And a git repo with the source code:
> >>
> >> https://github.com/dcreager/avro-examples/tree/master/resolved-writer
> >>
> >> I hope this helps — please let me know if you have any other questions.
> >>
> >> cheers
> >> –doug
> >>
> >
>
