avro-dev mailing list archives

From Matthieu Monsch <mon...@alum.mit.edu>
Subject Re: Avro union compatibility mode enhancement proposal
Date Fri, 22 Apr 2016 15:42:04 GMT
The second solution sounds like a great alternative.

Branch aliases are more straightforward than an implicit order-sensitive policy. They also
give users a bit more flexibility: since defaults are specified on the branches’ types,
different branches can have different defaults inside the same union. There are probably
a few edge cases (e.g. allowing multiple such aliases would be useful) but they should be
simple to address.

What would be a good attribute name for this? `baseTypes`?
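
To make the idea concrete, here is a minimal sketch of how a reader might resolve a writer's union branch through such an annotation. It assumes the attribute is called `baseTypes` and holds a list of record names; neither is part of the Avro specification today, and the `resolve_branch` helper and the `Shape`/`Circle`/`Square` schemas are made up for illustration.

```python
# Hypothetical attribute name (from the question above); not part of the Avro spec.
BASE_TYPES_ATTR = "baseTypes"


def resolve_branch(writer_branch, reader_union):
    """Return the name of the reader branch that should decode writer_branch.

    writer_branch: dict for a named writer schema, optionally carrying
                   the hypothetical "baseTypes" list.
    reader_union:  list of branch schemas from the reader's union.
    Returns None when neither the branch nor any of its bases is known.
    """
    reader_names = {b["name"] for b in reader_union if isinstance(b, dict)}

    # An exact name match always wins, as it does today.
    if writer_branch["name"] in reader_names:
        return writer_branch["name"]

    # Otherwise fall back to the first listed base type the reader knows,
    # which is the case that currently fails.
    for base in writer_branch.get(BASE_TYPES_ATTR, []):
        if base in reader_names:
            return base
    return None


# Illustrative schemas: the writer knows Circle, the reader only Shape and Square.
writer_circle = {
    "type": "record",
    "name": "Circle",
    "baseTypes": ["Shape"],
    "fields": [{"name": "radius", "type": "double"}],
}
reader_union = [
    {"type": "record", "name": "Shape", "fields": []},
    {"type": "record", "name": "Square", "fields": [{"name": "side", "type": "double"}]},
]

resolved = resolve_branch(writer_circle, reader_union)
```

Allowing a list rather than a single name covers the "multiple such aliases" edge case: the first base the reader recognizes is used.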

-Matthieu



> On Apr 21, 2016, at 10:52 AM, Doug Cutting <cutting@gmail.com> wrote:
> 
> On Wed, Apr 20, 2016 at 9:09 PM, Ryan Blue <rblue@netflix.com.invalid> wrote:
>> Making the default a property of an
>> inner schema makes me think that we will have to deal with multiple schemas
>> with such a label at some point.
> 
> On Thu, Apr 21, 2016 at 6:54 AM, Matthieu Monsch <monsch@alum.mit.edu> wrote:
>> Delegating default selection to the branches themselves is a great idea but it
>> will be tricky to handle reference branches smoothly. A more minor point: it also
>> doesn’t feel intuitive for the union not to “own” its default attribute.
> 
> If I understand your concerns correctly, I attempted to address this above:
> 
> "Note however that, when using a record as the default branch, one
> could not then use that same record as a non-default branch in another
> union.  To ameliorate that, we might permit multiple default branches in
> a union to be specified as default with the convention that the first
> such is used."
> 
> Does that make sense?
> 
> This isn't ideal syntax, but it's not terrible, and it doesn't change
> schema syntax incompatibly, which seems important, especially when it's
> unlikely that all implementations would implement such a syntax change
> in a synchronized manner.
> 
> Alternately, one might annotate each derived record with the name of
> its base record, then one wouldn't need to alter union definitions.
> This would work like an alias.  If a record doesn't exist in the
> reader's schema, then an alias to the missing record would be added in
> the reader's schema to the base record it names in the writer's
> schema.  Aliases work by rewriting the writer's schema at read-time,
> updating names, including those in unions.  Might that work?  It seems
> like perhaps a more elegant approach.  It has compatible syntax and
> only alters behavior of a case that fails today.
> 
> Doug

