From Chris Laws <clawsi...@gmail.com>
Subject Schema evolution and projection
Date Thu, 28 Feb 2013 13:21:42 GMT

I am struggling to familiarise myself with schema evolution and schema
projection using the avro-c implementation.

There doesn't seem to be much information available on how to perform these
tasks. The examples on the C API page confusingly mix the old datum API
with the new value API.

I have built what I think is a really simple example of testing schema
projection but it does not work the way I think it should work - more than
likely my understanding is wrong.

When I ask for one particular field of a record to be retrieved (by
specifying the field name), I instead get every field whose type matches
the request.

The file projection_01.c (attached and at
https://gist.github.com/claws/5056626) defines a really simple record with
two int fields, Field_1 and Field_2.

If I avrocat the container file I see:
{"Field_1": 1, "Field_2": 1}
{"Field_1": 2, "Field_2": 2}
{"Field_1": 3, "Field_2": 3}
{"Field_1": 4, "Field_2": 4}
{"Field_1": 5, "Field_2": 5}

The projection schema being used is a record only containing Field_2 of
type int. I only expected that field to be returned by the reader yet I
receive every int type field, confusingly labelled as "Field_2".

When I run projection_01.c I see:
{"Field_2": 1}
{"Field_2": 1}
{"Field_2": 2}
{"Field_2": 2}
{"Field_2": 3}
{"Field_2": 3}
{"Field_2": 4}
{"Field_2": 4}
{"Field_2": 5}
{"Field_2": 5}

Is this how schema projection is supposed to work? Does it just return
items of the same type irrespective of the field name specified?

I think I am missing something about how this is supposed to work. Perhaps
my example record is too simple.

So, I then created a slightly more complex schema that contained a
sub-record and the projection seems to work how I think it should work.
This can be seen in the output from projection_02.c (attached and at
https://gist.github.com/claws/5056643) which returns:
{"Field_2": {"SubField_1": 1, "SubField_2": 42}}
{"Field_2": {"SubField_1": 24, "SubField_2": 3}}
{"Field_2": {"SubField_1": 2, "SubField_2": 42}}
{"Field_2": {"SubField_1": 24, "SubField_2": 3}}
{"Field_2": {"SubField_1": 3, "SubField_2": 42}}
{"Field_2": {"SubField_1": 24, "SubField_2": 3}}
{"Field_2": {"SubField_1": 4, "SubField_2": 42}}
{"Field_2": {"SubField_1": 24, "SubField_2": 3}}
{"Field_2": {"SubField_1": 5, "SubField_2": 42}}
{"Field_2": {"SubField_1": 24, "SubField_2": 3}}

From this trial and error it appears that the projection will return me
values that match the projection schema's types - but does not take into
account any 'name' fields. Would that be an accurate assessment?

Can anyone provide some more information on schema projection?
Is there a good example anywhere?


