avro-dev mailing list archives

From Doug Cutting <cutt...@gmail.com>
Subject Re: Avro union compatibility mode enhancement proposal
Date Mon, 18 Apr 2016 21:13:20 GMT
I was responding to the "For unions, we will add an optional catch-all
attribute" part.  The syntax of union schemas is unfortunately hard to
modify compatibly.

Here's a way around that.  Recall that every kind of schema, except
union, supports arbitrary properties, and unions cannot be directly
nested.  Given this, we could have a property on one of the branch
schemas that declares it as the branch to be used when the writer's
branch is not found in the reader.  For example, if one wanted
unmatched branches to be read as null, the reader could use a union like
[{"type":"null","isDefaultBranch":true}, ... ], or one could add that
property to a record schema that acts as a base class.  Note however
that, when using a record as the default branch, one could not then
use that same record as a non-default branch in another union.  To
ameliorate that, we might permit multiple branches in a union to be
marked as default, with the convention that the first such branch is
used.
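
To make the fallback rule above concrete, here is a minimal sketch in
Python that works on plain parsed JSON rather than any Avro library.
It assumes the property is spelled "isDefaultBranch" as in the example
above, and it matches branches by name only; aliases, namespaces, type
promotion, and structural matching are left out.  The helper names
(branch_name, resolve_branch) are hypothetical and exist only for
illustration.

```python
import json

# Property name taken from the message above; it is a proposal, not part of the Avro spec.
DEFAULT_PROP = "isDefaultBranch"
NAMED_TYPES = ("record", "enum", "fixed")


def branch_name(schema):
    """Name used for matching: the declared name for named types, else the type name."""
    if isinstance(schema, str):
        return schema
    if schema["type"] in NAMED_TYPES:
        return schema["name"]
    return schema["type"]


def resolve_branch(writer_branch, reader_union):
    """Return the index of the reader branch to use for a writer branch."""
    wanted = branch_name(writer_branch)
    # First pass: an ordinary match by name.
    for index, branch in enumerate(reader_union):
        if branch_name(branch) == wanted:
            return index
    # Second pass: fall back to the first branch flagged as default.
    for index, branch in enumerate(reader_union):
        if isinstance(branch, dict) and branch.get(DEFAULT_PROP) is True:
            return index
    raise ValueError(f"no reader branch matches {wanted!r} and no default branch is set")


reader_union = json.loads('[{"type": "null", "isDefaultBranch": true}, "string"]')
unknown = {"type": "record", "name": "NewThing", "fields": []}
known_index = resolve_branch("string", reader_union)      # matched by name
fallback_index = resolve_branch(unknown, reader_union)    # falls back to the default branch
```

Because the second pass returns the first flagged branch it finds, a
union with several default branches follows the "first such is used"
convention without extra bookkeeping.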

Doug


On Mon, Apr 18, 2016 at 1:18 PM, Ryan Blue <rblue@netflix.com.invalid> wrote:
> Sorry, I thought you were talking about the more recent topic, the enum
> changes. I see what you're saying now about unions.
>
> My initial proposal was not to change how unions are represented in the
> schema, but to update how resolution happens. Right now, we will match a
> read schema to a branch by structure for cases where the record has been
> renamed. This would apply similar logic, allowing a more generic record to
> match.
>
> I wasn't thinking there would be a schema change, though we could certainly
> make that happen to standardize this behavior.
>
> rb
>
> On Mon, Apr 18, 2016 at 1:11 PM, Ryan Blue <rblue@netflix.com> wrote:
>
>> Doug, I don't think I understand. Why would this change a union
>> representation?
>>
>> This wouldn't change the schema format, other than to add an attribute to
>> enum types that is ignored by older readers. New readers will use that
>> attribute to determine which symbol to use when the written symbol is
>> unknown.
>>
>> rb
>>
>> On Mon, Apr 18, 2016 at 12:59 PM, Doug Cutting <cutting@gmail.com> wrote:
>>
>>> Perhaps then it's sufficient to only write the new schema format when
>>> the new attribute is specified, so existing apps will continue to
>>> represent unions as JSON arrays?  If so, this should probably be
>>> written into the spec.
>>>
>>> On Mon, Apr 18, 2016 at 12:52 PM, Ryan Blue <rblue@netflix.com.invalid>
>>> wrote:
>>> > Isn't the problem that these changes aren't compatible right now anyway?
>>> > If I need to add an entry to an enum right now, older readers fail when
>>> > trying to handle that data. This creates a way to avoid that failure in
>>> > new versions.
>>> >
>>> > On Mon, Apr 18, 2016 at 12:48 PM, Doug Cutting <cutting@gmail.com> wrote:
>>> >
>>> >> On Sun, Apr 17, 2016 at 2:00 PM, Matthieu Monsch <monsch@alum.mit.edu>
>>> >> wrote:
>>> >> > + For unions, we will add an optional catch-all attribute to mark a
>>> >> > branch as resolution target when no names or aliases match (and come
>>> >> > up with the corresponding syntax).
>>> >>
>>> >> Can this be compatible?  If you add a new union syntax (e.g.,
>>> >> {"type":"union", "branches":[...], "default":...}) then existing
>>> >> implementations will not be able to read new data that uses this
>>> >> feature.
>>> >>
>>> >> Doug
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Ryan Blue
>>> > Software Engineer
>>> > Netflix
>>>
>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
