avro-user mailing list archives

From Martin Kleppmann <mkleppm...@linkedin.com>
Subject Re: Catch 22 when obtaining Fields and Objects
Date Mon, 31 Mar 2014 15:53:44 GMT
Hi Lewis,

Are you trying to avoid transferring unnecessary fields over the network? In that case you'd
have to break the schema up into its individual fields, and serialize each individually. However,
it's not clear to me whether this would be much of a performance advantage (it would probably
depend on the data store's API).
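
A minimal sketch of what per-field serialization could look like, assuming the Avro Java generic API. The `Person` schema, the class name, and the in-memory `Map` standing in for the data store's columns are all hypothetical; a real store would fetch only the requested columns over the network.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class PerFieldSerialization {
    // Hypothetical schema, for illustration only.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Person\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"age\",\"type\":\"int\"},"
        + "{\"name\":\"bio\",\"type\":\"string\"}]}");

    /** Encodes each field of the record on its own, keyed by field name. */
    static Map<String, byte[]> toColumns(GenericRecord record) throws IOException {
        Map<String, byte[]> columns = new HashMap<>();
        for (Schema.Field field : record.getSchema().getFields()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
            // The writer uses the field's own schema, not the record schema.
            new GenericDatumWriter<Object>(field.schema())
                .write(record.get(field.pos()), encoder);
            encoder.flush();
            columns.put(field.name(), out.toByteArray());
        }
        return columns;
    }

    /** Decodes a single field from its own bytes using that field's schema. */
    static Object readColumn(String fieldName, byte[] bytes) throws IOException {
        Schema fieldSchema = SCHEMA.getField(fieldName).schema();
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        return new GenericDatumReader<Object>(fieldSchema).read(null, decoder);
    }

    public static void main(String[] args) throws IOException {
        GenericRecord person = new GenericData.Record(SCHEMA);
        person.put("name", "Ada");
        person.put("age", 36);
        person.put("bio", "A long biography that the caller does not need.");

        Map<String, byte[]> columns = toColumns(person);
        // Only the requested columns are decoded; "bio" is never touched.
        System.out.println(readColumn("name", columns.get("name")));
        System.out.println(readColumn("age", columns.get("age")));
    }
}
```

The cost of this layout is one encode and one stored value per field, so whether it pays off depends on how cheaply the store can fetch individual columns.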

Or are you ok with transferring the entire record over the network, but just want to avoid
parsing fields that you don't need? In that case you can use a reader's schema that includes
only the fields that you need, and the Avro parser will skip over all fields that are not
mentioned in the reader's schema.
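
A rough sketch of the reader's-schema approach with the Avro Java generic API. The `Person` schemas and the class name are made up for illustration; the reader's schema keeps the same record name as the writer's schema but lists only the fields that are needed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class ReaderSchemaProjection {
    // Hypothetical writer's schema: what was used when the record was stored.
    // Each schema gets its own parser, since one parser cannot define
    // the name "Person" twice.
    static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Person\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"age\",\"type\":\"int\"},"
        + "{\"name\":\"bio\",\"type\":\"string\"}]}");

    // Hypothetical reader's schema: same record name, only the wanted fields.
    static final Schema READER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Person\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"age\",\"type\":\"int\"}]}");

    /** Serializes the whole record with the writer's schema. */
    static byte[] encode(GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(WRITER).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    /** Decodes with both schemas; fields absent from READER are skipped. */
    static GenericRecord decode(byte[] bytes) throws IOException {
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        return new GenericDatumReader<GenericRecord>(WRITER, READER)
            .read(null, decoder);
    }

    public static void main(String[] args) throws IOException {
        GenericRecord full = new GenericData.Record(WRITER);
        full.put("name", "Ada");
        full.put("age", 36);
        full.put("bio", "A long biography that the caller does not need.");

        GenericRecord projected = decode(encode(full));
        System.out.println(projected.get("name"));
        System.out.println(projected.get("age"));
        // The projected record's schema has no "bio" field at all.
        System.out.println(projected.getSchema().getField("bio") == null);
    }
}
```

Note that the decoder still needs the writer's schema to know how to skip the unwanted fields, so the full record bytes are still transferred; only the parsing work is reduced.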


On 30 Mar 2014, at 19:57, Lewis John Mcgibbney <lewis.mcgibbney@gmail.com> wrote:
Hi Folks,
Right now over in Gora [0] we serialize data into a byte[] before persisting an object into
a back-end datastore.
We use Avro for our serialization.
The question I would like to pose is as follows.

In Gora we can do a get on objects as follows:

public T get(K key, String[] fields)

If no field arguments are provided then we query ALL fields.

If, however, we query for, say, two string fields, "name" and "age", we still need to obtain the
Fields for the entire object (as they are stored as a byte[]) and then sort things out on our end.

Is there a better way we could/should be doing this?

For example, in our gora-dynamodb store, we simply put objects in their native types and we
let DynamoDB deal with the best way to serde the data. We are looking to simulate this across
all supported data stores, so some discussion from this list would be excellent in helping
us make a more informed decision.
Thanks in advance.
[0] http://gora.apache.org/

