avro-dev mailing list archives

From Philip Zeyliger <phi...@cloudera.com>
Subject Re: A case for adding revision field to Avro schema
Date Wed, 22 Sep 2010 06:25:52 GMT
On Tue, Sep 21, 2010 at 8:00 PM, Thiruvalluvan M. G. <thiru_mg@yahoo.com> wrote:

> Thanks Philip for your crisp description of what happens with Thrift and
> PB. I had assumed that the community knows the difference between those
> systems and Avro. Your description should help those who don't know and be
> a refresher for those who know.
> > What happens if the application doesn't recognize the branch number?  If
> > you're a client of a get_person(id) call, and you were written when
> > Person_v1 was the only one in existence, Avro, today, would do just fine
> > at projecting Person_v2 down into Person_v1 for you.  That's because your
> > reader schema would be v1, and you'd read some data written with v2, and
> > those are compatible.  If you have a "version id", then it's hard to do
> > compatibility of old readers reading new data.
> My proposal was:
> "3. Schemas match as per the current matching rules, even if the revisions
> do not match."
> That is, since Person_v2 and Person_v1 have the same name "Person" and
> different revisions v2 and v1, they would match according to the current
> rules.

I'm beginning to understand your proposal a little bit better.  What happens
when the revisions aren't linear?  (Or do we require them to be?)

For example:

Writer's Schema union:
Person_a: (name)
Person_b: (name, age)

Reader's Schema union:
Person_c: (age)
Person_d: (age, school [default=""])

When "Person_b, Philip, 28" is written, what would a subsequent reader see?
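
To make the ambiguity concrete, here is a rough Python sketch. It does not use
the Avro library; it only models today's record-resolution rules (keep the
reader's fields, drop writer-only fields, fill reader-only fields from
defaults) and applies them to each reader branch in turn. The helper name
`project`, the dict-based schema model, and the idea of trying every branch
are assumptions made for illustration, not part of the proposal.

```python
# Hypothetical model: a schema is a dict of field name -> default value,
# with NO_DEFAULT marking fields that have no default.
NO_DEFAULT = object()


def project(writer_fields, reader_fields, record):
    """Resolve a written record against one reader schema.

    Returns the projected record, or None when the reader needs a field
    that the writer lacks and that has no default.
    """
    out = {}
    for name, default in reader_fields.items():
        if name in writer_fields:
            out[name] = record[name]      # field present in both schemas
        elif default is not NO_DEFAULT:
            out[name] = default           # reader-only field with a default
        else:
            return None                   # reader-only field, no default
    return out


writer_union = {
    "Person_a": {"name": NO_DEFAULT},
    "Person_b": {"name": NO_DEFAULT, "age": NO_DEFAULT},
}
reader_union = {
    "Person_c": {"age": NO_DEFAULT},
    "Person_d": {"age": NO_DEFAULT, "school": ""},
}

# "Person_b, Philip, 28" as written.
written = {"name": "Philip", "age": 28}

# Try every reader branch against the writer's Person_b.
candidates = {
    revision: project(writer_union["Person_b"], fields, written)
    for revision, fields in reader_union.items()
}
print(candidates)
```

In this model both Person_c and Person_d accept the record, so some extra rule
(first match, highest revision, or something else) would have to pick one.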

I'm worried that the semantics of reader and writer schemas are already
complicated enough; adding in sets of schemas makes it even trickier.

-- Philip
