avro-user mailing list archives

From "Shirahatti, Nikhil" <snik...@telenav.com>
Subject Re: Nesting avro with avro or proto binary representations
Date Thu, 16 Feb 2012 21:08:52 GMT
Thanks Doug and Scott. I think this answers my question of whether it can be done. Now, is
there a template or pattern for how to do it? I see two strategies, as discussed below:

 1.  Array of items -> items come from the body structure
 2.  bytes: a serialization of the body based on a type of serialization

Nikhil

From: Scott Carey <scottcarey@apache.org>
Reply-To: "user@avro.apache.org" <user@avro.apache.org>
Date: Thu, 16 Feb 2012 12:06:51 -0800
To: "user@avro.apache.org" <user@avro.apache.org>
Subject: Re: Nesting avro with avro or proto binary representations



On 2/15/12 8:23 PM, "Shirahatti, Nikhil" <snikhil@telenav.com> wrote:

Hello Avro Users,

My question is whether we can use an avro schema as a wrapper for another avro/protobuf binary
representation.

Example:

{
    "namespace": "com.AvroExample",
    "name": "wrapper",
    "type": "record",
    "fields": [
        {"name": "timestamp", "type": "long"},
        {"name": "header", "type": "string"},
        {"name": "body", "type": "bytes"}
    ]
}

If you wish to save space, you could use an enum for the header, provided it only indicates
what type the binary is.

Or, you could go even further and use a record for each type, and the wrapper would be a timestamp
and a union of the binary types.



Then the body can be filled in with the binary representation (avro/protobuf/json). Can the
Avro schema below be wrapped inside the wrapper schema above? If so, are there any pointers
on how to do it?

{
    "namespace": "com.AvroExample",
    "name": "server",
    "type": "record",
    "fields": [
        {"name": "status", "type": "string"},
        {"name": "user", "type": "string"}
    ]
}

Since you want to have the internal binary be wrapped and optionally be json, avro, or protobuf,
you will probably have code that looks something like the below pseudo-code:

// Create a datum reader with your chosen API (specific, generic, or reflect
// if Java), to be cached and used to read the wrapper.
DatumReader wrapperReader = <create a datum reader>;
Wrapper wrapper = wrapperReader.read(<from the input>);
// Extract the type from the wrapper and figure out whether it is avro,
// protobuf, etc. This could be based on a string or enum.
InnerReader inner = getReaderFor(wrapper.getHeader());
// Pass the body to the inner reader.
inner.read(wrapper.getBody());
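
As a rough illustration of the getReaderFor step in the pseudo-code above, here is a minimal Java sketch that uses only the standard library. BodyType, InnerReader, and the registered readers are hypothetical names, not part of Avro, and the readers are placeholders that only tag the payload; real ones would call whichever Avro, protobuf, or JSON library you use.

```java
import java.nio.charset.StandardCharsets;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: choose an inner reader from the wrapper's header value.
final class BodyDispatchSketch {

    // Stand-in for the header enum suggested in the thread (assumed values).
    enum BodyType { AVRO, PROTOBUF, JSON }

    // One implementation per body format.
    interface InnerReader {
        Object read(byte[] body);
    }

    private static final Map<BodyType, InnerReader> READERS =
            new EnumMap<>(BodyType.class);

    static {
        // Placeholder readers: they do not decode anything, they only make
        // the dispatch visible. Swap in real decoders here.
        READERS.put(BodyType.AVRO, body -> "avro:" + body.length + " bytes");
        READERS.put(BodyType.PROTOBUF, body -> "protobuf:" + body.length + " bytes");
        READERS.put(BodyType.JSON, body -> new String(body, StandardCharsets.UTF_8));
    }

    // Looks up the reader registered for the given header value.
    static InnerReader getReaderFor(BodyType header) {
        InnerReader reader = READERS.get(header);
        if (reader == null) {
            throw new IllegalArgumentException("No reader registered for " + header);
        }
        return reader;
    }

    public static void main(String[] args) {
        byte[] body = "{\"status\":\"ok\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println(getReaderFor(BodyType.JSON).read(body));
    }
}
```

Keeping the readers in a map keyed by the enum means adding a new body format is one more registration rather than another branch in the read path.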

The write would be similar.
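
To make the write side concrete, the sketch below hand-encodes the inner "server" record and then places those bytes in the "body" field of the "wrapper" record, following the Avro binary encoding rules (zig-zag variable-length longs, length-prefixed strings and bytes, record fields written in schema order). It is standard-library Java only so the nesting is easy to see; the class and method names are hypothetical, and in real code you would most likely let Avro's own datum writer and binary encoder do this work.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: hand-rolled Avro binary encoding, standard library only.
final class NestedWriteSketch {

    // Avro long: zig-zag encode, then emit 7 bits per byte, low bits first.
    static void writeLong(ByteArrayOutputStream out, long value) {
        long zigzag = (value << 1) ^ (value >> 63);
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80));
            zigzag >>>= 7;
        }
        out.write((int) zigzag);
    }

    // Avro bytes: the length as a long, followed by the raw bytes.
    static void writeBytes(ByteArrayOutputStream out, byte[] data) {
        writeLong(out, data.length);
        out.write(data, 0, data.length);
    }

    // Avro string: same layout as bytes, with UTF-8 content.
    static void writeString(ByteArrayOutputStream out, String s) {
        writeBytes(out, s.getBytes(StandardCharsets.UTF_8));
    }

    // Inner "server" record: status, then user, with no extra framing.
    static byte[] encodeServer(String status, String user) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, status);
        writeString(out, user);
        return out.toByteArray();
    }

    // Outer "wrapper" record: timestamp, header, then the inner bytes as body.
    static byte[] encodeWrapper(long timestamp, String header, byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, timestamp);
        writeString(out, header);
        writeBytes(out, body);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] inner = encodeServer("ok", "bob");
        byte[] wrapped = encodeWrapper(System.currentTimeMillis(), "avro", inner);
        System.out.println(inner.length + " inner bytes, " + wrapped.length + " wrapped bytes");
    }
}
```

The wrapper never needs to know the inner schema: it only sees an opaque, length-prefixed byte string, which is why the same wrapper can carry protobuf or JSON bodies as well.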

If you used the enum approach, then on the read and write avro would take care of determining
what type the body is, but you would still need to have separate implementations for reading
the body.




Thanks,

Nikhil
