avro-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Record extensions?
Date Tue, 12 Jun 2012 18:13:49 GMT
On Tue, Jun 12, 2012 at 10:38 AM, Christophe Taton <taton@wibidata.com> wrote:
> I need my server to handle records with fields that can be "freely" extended
> by users, without requiring a recompile and restart of the server.
> The server itself does not need to know how to handle the content of this
> extensible field.
>
> One way to achieve this is to have a bytes field whose content is managed
> externally, but this is very ineffective in many ways.
> Is there another way to do this with Avro?

You could use a very generic schema, like:

{"type":"record", "name":"Value", "fields": [
  {"name":"value", "type": ["int","float","boolean", ...
    {"type":"map", "values":"Value"}]}
]}

This is roughly equivalent to a binary encoding of JSON.  But using a
map forces the serialization of a field name with every field value.
Not only does that make payloads bigger, it also makes them slower to
construct and parse.
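
The overhead can be seen with a minimal sketch that hand-rolls a small
subset of Avro's binary encoding in Python. It is an illustration, not
the Avro library: the helper names and the sample row are hypothetical,
and the map is simplified to string keys with long values instead of
the full union in the schema above.

```python
def encode_long(n: int) -> bytes:
    """Avro int/long: zigzag, then base-128 varint, low-order bits first."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Avro string: byte length as a long, then the UTF-8 bytes."""
    raw = s.encode("utf-8")
    return encode_long(len(raw)) + raw


def encode_map_of_longs(m: dict) -> bytes:
    """Avro map: a block of (count, key/value pairs), then a zero count."""
    if not m:
        return encode_long(0)
    body = b"".join(encode_string(k) + encode_long(v) for k, v in m.items())
    return encode_long(len(m)) + body + encode_long(0)


def encode_record_of_longs(values: list) -> bytes:
    """Avro record: field values in schema order, with no field names."""
    return b"".join(encode_long(v) for v in values)


# Hypothetical sample data: the same three values, encoded both ways.
row = {"clicks": 12, "impressions": 340, "conversions": 3}
as_map = encode_map_of_longs(row)                        # keys travel with values
as_record = encode_record_of_longs(list(row.values()))   # values only
print(len(as_map), len(as_record))
```

The map form carries every key as a length-prefixed string next to its
value, while the record form relies on the reader already knowing the
field order from the schema.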

Another approach is to include the Avro schema for a value in the record, e.g.:

{"type":"record", "name":"Extensions", "fields": [
  {"name":"schema", "type": "string"},
  {"name":"values", "type": {"type":"array", "items":"bytes"}}
]}

This can make things more compact when there are a lot of values.  For
example, this might be used in a search application where each query
lists the fields it is interested in retrieving and each response
contains a list of records that match the query and contain just the
requested fields.  The field names are not included in each match, but
instead once for the entire set of matches, making this faster and more
compact.
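
A minimal sketch of that layout in Python follows. It is an
illustration under stated assumptions, not the Avro library: the
encoder handles only "long" and "string" fields, and the "Match"
schema, the helper names and the sample rows are hypothetical.

```python
import json


def encode_long(n: int) -> bytes:
    """Avro int/long: zigzag, then base-128 varint, low-order bits first."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf: bytes, pos: int):
    """Inverse of encode_long; returns (value, next position)."""
    z, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_record(schema: dict, row: dict) -> bytes:
    """Encode field values in schema order; no field names are written."""
    out = bytearray()
    for field in schema["fields"]:
        value = row[field["name"]]
        if field["type"] == "long":
            out += encode_long(value)
        elif field["type"] == "string":
            raw = value.encode("utf-8")
            out += encode_long(len(raw)) + raw
        else:
            raise ValueError("this sketch handles only long and string")
    return bytes(out)


def decode_record(schema: dict, data: bytes) -> dict:
    """Decode one value using the schema that travelled with the batch."""
    row, pos = {}, 0
    for field in schema["fields"]:
        if field["type"] == "long":
            row[field["name"]], pos = decode_long(data, pos)
        elif field["type"] == "string":
            size, pos = decode_long(data, pos)
            row[field["name"]] = data[pos:pos + size].decode("utf-8")
            pos += size
        else:
            raise ValueError("this sketch handles only long and string")
    return row


# Hypothetical user-defined schema, unknown to the server at compile time.
match_schema = {"type": "record", "name": "Match", "fields": [
    {"name": "title", "type": "string"},
    {"name": "score", "type": "long"}]}
matches = [{"title": "Avro", "score": 9}, {"title": "Thrift", "score": 7}]

# The "Extensions" record: the schema appears once, then one bytes item per match.
extensions = {"schema": json.dumps(match_schema),
              "values": [encode_record(match_schema, m) for m in matches]}

# A reader that understands the extension parses the schema, then each value.
received_schema = json.loads(extensions["schema"])
decoded = [decode_record(received_schema, v) for v in extensions["values"]]
```

A server that does not care about the extension can pass the
"schema" string and the "values" array through untouched.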

Finally, if you have a stateful connection then you can send a
schema in the first request then just send bytes encoding instances of
that schema in subsequent requests over that connection.  This again
avoids sending field names with each field value.
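
The stateful variant can be sketched the same way. This is a
hand-rolled illustration, not Avro's own RPC layer: the Receiver
class, the "Counters" schema and the message framing are hypothetical,
and only "long" fields are supported.

```python
import json


def encode_long(n: int) -> bytes:
    """Avro int/long: zigzag, then base-128 varint, low-order bits first."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf: bytes, pos: int):
    """Inverse of encode_long; returns (value, next position)."""
    z, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


class Receiver:
    """Keeps per-connection state: the schema sent in the first message."""

    def __init__(self):
        self.field_names = None

    def on_message(self, payload: bytes):
        if self.field_names is None:
            # First message on the connection: the schema as JSON text.
            schema = json.loads(payload.decode("utf-8"))
            self.field_names = [f["name"] for f in schema["fields"]]
            return None
        # Later messages: bare field values in schema order.
        row, pos = {}, 0
        for name in self.field_names:
            row[name], pos = decode_long(payload, pos)
        return row


# Hypothetical schema with long fields only.
schema = {"type": "record", "name": "Counters", "fields": [
    {"name": "clicks", "type": "long"},
    {"name": "views", "type": "long"}]}

receiver = Receiver()
receiver.on_message(json.dumps(schema).encode("utf-8"))   # sent once
first = receiver.on_message(encode_long(12) + encode_long(340))
second = receiver.on_message(encode_long(-1) + encode_long(0))
```

After the first message, each request is only a few bytes of values,
since the field names live in the receiver's connection state.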

Doug
