avro-user mailing list archives

From Elliot West <tea...@gmail.com>
Subject Re: Evolving schemas and namespaces
Date Wed, 07 Dec 2016 13:27:13 GMT
Hi Anders,

Thanks for taking the time to explain the use case. I can see why it would
be useful to use a version identifier in the generation of Java classes
from Avro, specifically as an extra coordinate in the Java namespace for
differentiating between versions. However, achieving this by manipulating
the Avro namespace feels like the wrong way to go about it. It is not
common practice to version Java code by creating new package names on each
release. I'd suggest that the determination of the target package for
generation of Java classes is a responsibility of the code generation
implementation, not something that should be defined in the more general
schema specification. Perhaps the code generation implementation can be
modified to take account of some additional metadata contained in the
schema instead of attempting to overload the purpose of the namespace.

{
  "type" : "record",
  "name" : "Foo",
  "namespace" : "nl.example.evoleschema",
  "version" : "v1",
  ...

Note that the version metadata does not need to form part of the Avro
schema spec, as I believe the spec already permits (and ignores) such
additional attributes.
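
As a rough illustration of that suggestion, here is a minimal Java sketch of
how a code generator might combine the namespace with a "version" attribute
to pick the target package. It does not use the Avro API: the class
TargetPackageSketch, the method targetPackage and the plain Map standing in
for the parsed schema attributes are all hypothetical, and the "version"
attribute is only the convention proposed above, not part of the Avro spec.

```java
import java.util.HashMap;
import java.util.Map;

public class TargetPackageSketch {

    // Hypothetical helper: derive the Java package for generated classes
    // from the schema's "namespace" plus an optional "version" attribute.
    static String targetPackage(Map<String, String> schemaAttributes) {
        String namespace = schemaAttributes.getOrDefault("namespace", "");
        String version = schemaAttributes.get("version");
        if (version == null || version.isEmpty()) {
            // No version metadata: fall back to the namespace as-is.
            return namespace;
        }
        return namespace.isEmpty() ? version : namespace + "." + version;
    }

    public static void main(String[] args) {
        // Stand-in for the attributes of the example schema above.
        Map<String, String> foo = new HashMap<>();
        foo.put("name", "Foo");
        foo.put("namespace", "nl.example.evoleschema");
        foo.put("version", "v1");

        System.out.println(targetPackage(foo) + "." + foo.get("name"));
    }
}
```

With the attributes above this prints nl.example.evoleschema.v1.Foo, while
the Avro namespace itself stays the same across versions.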

Thanks,

Elliot.

On 7 December 2016 at 10:00, Anders Sundelin <anders.sundelin@ericsson.com>
wrote:

> Hi Elliot,
>
> Schema compatibility (as defined by Avro) is not commutative, i.e. one
> schema can be read-compatible with another (writer) schema, while the
> reverse is not necessarily true.
>
> This is actually a good thing, because it means that schemas can evolve,
> taking into consideration that some clients can read certain schema
> versions, while others can read possibly more versions.
>
> The obvious use case is for a server, performing work on behalf of several
> clients.
>
> Each client uses schema negotiation, so when the server reads a message
> with schema S-1.7, it can process it (possibly with its own schema S-1.8),
> but the response should be sent back also using the S-1.7 schema (since
> that was what the client had negotiated).
>
> Clients using schema S-1.8 should of course get their response in the
> S-1.8 schema and so on.
>
> In short, this kind of use case requires the server (in its interface
> part) to be aware of which schemas the clients speak, even if it
> internally uses the "superset schema", with which all schemas can be read
> (i.e. the "superset schema" is read-compatible with all the different
> versions, typically because it is the latest of these schemas).
>
> BR
>
> /Anders
>
>
> On 2016-12-02 17:01, Elliot West wrote:
>
> Hi Anders,
>
> If the one schema is a compatible evolution of the other, what is the need
> for multiple Java types? Schema compatibility implies that data written by
> one schema can be safely marshalled to/from a Java class generated from
> another compatible version, no? Obviously the specific behaviours are dictated by
> the selected compatibility level, the version of Avro data being
> encoded/decoded, and the version of the schema from which that class was
> generated. If the compatibility rules are followed I would not expect to
> need multiple Java representations.
>
> That said, I haven't tested my expectations concretely :-)
>
> Elliot.
>
> On 2 December 2016 at 15:43, Anders Sundelin <anders.sundelin@ericsson.com>
> wrote:
>
>> Hi Niels and Elliot,
>>
>> Thinking from the Java perspective now, the nice thing about namespaces
>> (and, to a lesser degree, the name itself) is that they are mapped to
>> packages (and classnames).
>>
>> If the v1 Avro spec was used as a base for generating Java classes, then
>> the corresponding Java class would be "com.example.some.v1.MyType",
>>
>> and, correspondingly, the v2 Avro schema would generate into
>> "com.example.some.v2.MyType"
>>
>> In other words, clients (or servers) could happily use both classes while
>> talking to different other peers (supporting, for instance, protocol
>> negotiation).
>>
>> This is the most important reason for why I think the rule you mention is
>> kind of strange.
>>
>> What do you others think?
>>
>> BR
>>
>> /Anders
>>
>>
>>
>> On 2016-12-02 16:33, Elliot West wrote:
>>
>> There is perhaps a little ambiguity in the spec:
>>
>> From https://avro.apache.org/docs/current/spec.html#names
>>   Record, enums and fixed are named types. Each has a fullname that is
>> composed of two parts; a name and a namespace. *Equality of names is
>> defined on the fullname.*
>>
>> From https://avro.apache.org/docs/current/spec.html#Schema+Resolution:
>>   It is an error if the two schemas do not match.
>>   To match, one of the following must hold:
>>   ...
>>   *both schemas are records with the same name*
>>   ...
>>
>> I suspect that in this case 'name' means 'fullname' and therefore by
>> choosing a different namespace you've declared to Avro that they should be
>> considered different types.
>>
>> If you are trying to annotate different schemas with a version
>> identifier, perhaps a 'doc' property might be more appropriate?
>>
>> On 2 December 2016 at 15:11, Niels Basjes <Niels@basjes.nl> wrote:
>>
>>> Hi,
>>>
>>> When I run the code below the output indicates that these two are incompatible
>>> in terms of schema evolution.
>>>
>>> The ONLY difference is the namespace (v1 and v2).
>>>
>>> If I remove the namespace line they are reported as 'compatible'.
>>>
>>> My question is: why are these two considered to be incompatible?
>>>
>>> @Test
>>> public void evolveTest() throws IOException {
>>>   Schema schemaV1 = new Schema.Parser().parse("{\n" +
>>>     "  \"type\" : \"record\",\n" +
>>>     "  \"name\" : \"Foo\",\n" +
>>>     "  \"namespace\" : \"nl.example.evoleschema.v1\",\n" +
>>>     "  \"fields\" : [ {\n" +
>>>     "    \"name\" : \"count\",\n" +
>>>     "    \"type\" : {\n" +
>>>     "      \"type\" : \"enum\",\n" +
>>>     "      \"name\" : \"Bar\",\n" +
>>>     "      \"symbols\" : [ \"ONE\", \"TWO\", \"THREE\" ]\n" +
>>>     "    }\n" +
>>>     "  } ]\n" +
>>>     "}");
>>>
>>>   Schema schemaV2 = new Schema.Parser().parse("{\n" +
>>>     "  \"type\" : \"record\",\n" +
>>>     "  \"name\" : \"Foo\",\n" +
>>>     "  \"namespace\" : \"nl.example.evoleschema.v2\",\n" +
>>>     "  \"fields\" : [ {\n" +
>>>     "    \"name\" : \"count\",\n" +
>>>     "    \"type\" : {\n" +
>>>     "      \"type\" : \"enum\",\n" +
>>>     "      \"name\" : \"Bar\",\n" +
>>>     "      \"symbols\" : [ \"ONE\", \"TWO\", \"THREE\" ]\n" +
>>>     "    }\n" +
>>>     "  } ]\n" +
>>>     "}");
>>>
>>>   LOG.info("{}", SchemaCompatibility.checkReaderWriterCompatibility(schemaV1, schemaV2).getType());
>>>   LOG.info("{}", SchemaCompatibility.checkReaderWriterCompatibility(schemaV2, schemaV1).getType());
>>> }
>>>
>>> --
>>> Best regards / Met vriendelijke groeten, Niels Basjes
>>>
>>
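
For reference, the reading suggested earlier in the thread (that 'name' in
the schema resolution rules means the fullname) can be sketched in plain
Java. This is only an illustration of that interpretation under stated
assumptions, not Avro's actual implementation; the class FullNameSketch and
its methods are hypothetical.

```java
public class FullNameSketch {

    // Build a fullname along the lines of the names section of the spec:
    // a name that already contains a dot is taken as the fullname,
    // otherwise the namespace (if any) is prepended.
    static String fullName(String namespace, String name) {
        if (name.contains(".") || namespace == null || namespace.isEmpty()) {
            return name;
        }
        return namespace + "." + name;
    }

    // Assumed matching rule: two records match only if their fullnames are equal.
    static boolean recordsMatch(String readerNamespace, String readerName,
                                String writerNamespace, String writerName) {
        return fullName(readerNamespace, readerName)
                .equals(fullName(writerNamespace, writerName));
    }

    public static void main(String[] args) {
        // Same name, different namespaces (the v1/v2 case from the test).
        System.out.println(recordsMatch(
                "nl.example.evoleschema.v1", "Foo",
                "nl.example.evoleschema.v2", "Foo"));

        // Same name, no namespace on either side.
        System.out.println(recordsMatch(null, "Foo", null, "Foo"));
    }
}
```

Under this interpretation the first check prints false and the second prints
true, which is consistent with the behaviour reported in the original
question.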
