avro-dev mailing list archives

From Christophe Taton <christophe.ta...@gmail.com>
Subject Re: Effort towards Avro 2.0?
Date Wed, 04 Dec 2013 18:24:06 GMT
On Tue, Dec 3, 2013 at 7:49 AM, Doug Cutting <cutting@apache.org> wrote:

> On Mon, Dec 2, 2013 at 1:42 PM, Christophe Taton
> <christophe.taton@gmail.com> wrote:
> > - New extension data type, similar to ProtocolBuffer extensions
> (incompatible change).
>
> Extensions might be implemented as something like:
>
>   {"type":"record", "name":"extension", "fields":[
>     {"name":"fingerprint", "type": {"type":"fixed", "size":16}},
>     {"name":"payload", "type":"bytes"}
>     ]
>   }
>
> One could then use this with:
>
>   {"type":"record", "name":"Foo", "fields":[
>     {"name":"bar", "type":"extension"}
>     ]
>   }
>
> The implementation could then find the schema for the extension at
> runtime given its fingerprint.  The reader could have a table mapping
> fingerprints to schemas.
>
> In particular, the specific compiler, when it sees a schema like:
>
>
>   {"type":"record", "name":"Bar", "isExtension":true, "fields":[
>     {"name":"x", "type":"long"}
>     ]
>   }
>
> Might emit code to add entries to the extension mapping table used by
> SpecificDatumReader, e.g.:
>
>   static {
>     SpecificData.addExtension(getSchema());
>   }
>
> Might something like this work?
>

Yes, this is very much the idea.
In a prototype I made a few months ago, I found it useful to let the user
specify the fingerprint schema: in some scenarios, an extension could be
prefixed by a string that contains the JSON schema; in others, I may want
to use fingerprints to identify the schema of the extension; in yet others,
I may want to use an external mapping maintained by another system (e.g.
the schema repository being worked on in AVRO-1124).
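
To make the three identification schemes above concrete, here is a minimal
sketch in plain Java. Everything in it is hypothetical: the names
ExtensionResolvers, SchemaResolver, InlineJsonResolver, FingerprintResolver
and ExternalResolver are invented for illustration and are not part of Avro,
schemas are handled as raw JSON strings rather than parsed Schema objects,
and the fingerprint is assumed to be an MD5 digest of that JSON text (16
bytes, matching the fixed(16) field in the quoted record). A real
implementation would presumably fingerprint a normalized form of the schema.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch; none of these types exist in Avro. */
public class ExtensionResolvers {

  /** Maps the identifier that prefixes an extension payload to the schema's JSON text. */
  interface SchemaResolver {
    String resolve(byte[] id);
  }

  /** Scheme 1: the identifier is the JSON schema itself, as a UTF-8 string. */
  static final class InlineJsonResolver implements SchemaResolver {
    @Override
    public String resolve(byte[] id) {
      return new String(id, StandardCharsets.UTF_8);
    }
  }

  /** Scheme 2: the identifier is a 16-byte fingerprint looked up in a local table. */
  static final class FingerprintResolver implements SchemaResolver {
    private final Map<String, String> table = new HashMap<>();

    /** Registers a schema under its fingerprint (assumed here: MD5 of the JSON text). */
    void register(String schemaJson) {
      table.put(hex(md5(schemaJson)), schemaJson);
    }

    @Override
    public String resolve(byte[] id) {
      String schemaJson = table.get(hex(id));
      if (schemaJson == null) {
        throw new IllegalArgumentException("Unknown extension fingerprint: " + hex(id));
      }
      return schemaJson;
    }
  }

  /** Scheme 3: the identifier is a key into a mapping maintained by another system. */
  static final class ExternalResolver implements SchemaResolver {
    private final Function<String, String> lookup;

    /** The lookup function stands in for a call to an external schema repository. */
    ExternalResolver(Function<String, String> lookup) {
      this.lookup = lookup;
    }

    @Override
    public String resolve(byte[] id) {
      return lookup.apply(new String(id, StandardCharsets.UTF_8));
    }
  }

  /** Computes a 16-byte MD5 digest of the given text. */
  static byte[] md5(String text) {
    try {
      return MessageDigest.getInstance("MD5").digest(text.getBytes(StandardCharsets.UTF_8));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Hex-encodes bytes so that they can be used as a map key. */
  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String bar = "{\"type\":\"record\",\"name\":\"Bar\",\"fields\":[{\"name\":\"x\",\"type\":\"long\"}]}";
    FingerprintResolver resolver = new FingerprintResolver();
    resolver.register(bar);
    // A reader would take the fingerprint from the extension record and look it up.
    System.out.println(resolver.resolve(md5(bar)));
  }
}
```

With an interface like this, the reader only depends on SchemaResolver, and
the choice between inline schemas, fingerprints and an external repository
becomes a configuration decision rather than part of the wire format.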

C.