avro-user mailing list archives

From Tatu Saloranta <tsalora...@gmail.com>
Subject Re: question about completely untagged data...
Date Mon, 29 Nov 2010 18:04:38 GMT
On Sun, Nov 28, 2010 at 6:39 PM, David Jeske <davidj@gmail.com> wrote:
> I have a storage project and am considering adding Thrift or Avro to it for
> record packing, and I have a couple of questions.
> Other than type-ids and field-ids, Avro and Thrift's designs seem
> isomorphic. Is the binary format not including field-type-info something
> that's set in stone, or something that's open for feedback?
> Going the Thrift route for me will mean injecting a bit of the Avro
> philosophy into Thrift, namely, adding a Thrift IDL parser to the language I
> need, so I can save Thrift IDLs and then dynamically read them. However,
> doing this as a one-off for my language is different from having a supported
> mechanism for all client languages -- like in Avro.

If you really want to keep a bit more descriptive information, you
could also just consider formats that do include property names, like
JSON (with compression).
Depending on exactly what you plan to store, it might be a competitive
choice all around.
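
As a rough illustration of the JSON-plus-compression idea, here is a minimal
sketch using only the Python standard library. The record layout and field
names are made up for the example; actual sizes will depend entirely on your
data.

```python
import gzip
import json

# Hypothetical records; the field names are illustrative only.
records = [
    {"id": i, "name": f"user{i}", "active": i % 2 == 0}
    for i in range(1000)
]

# Every record repeats its property names, so the raw JSON is verbose...
raw = json.dumps(records, separators=(",", ":")).encode("utf-8")

# ...but those repeated names are exactly what a general-purpose
# compressor handles well.
packed = gzip.compress(raw)

# Reading back needs no schema: the names travel with the data.
restored = json.loads(gzip.decompress(packed).decode("utf-8"))

print(len(raw), len(packed))
```

The point is only that the data stays self-describing on disk; whether the
compressed size is competitive with a schema-based format is something to
measure on real records.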

I don't think either Avro or Thrift is aimed at storing data so much
as at transferring it, since having to persist schemas alongside the
data complicates things significantly (the same is true of protobuf,
just even more so). And Avro specifically seems like the best fit for
sequences of homogeneous data entries (DB rows, log entries, etc.).
This may or may not be similar to your use case.
But maybe there are other reasons why your choice is limited to just
these two formats?
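
To make the "homogeneous entries" point concrete, here is a small sketch of
the general idea: write the schema once, then untagged rows after it. This is
NOT Avro's actual binary encoding or API; the header layout, the schema
dictionary, and the helper names are all assumptions made up for illustration,
using only the Python standard library.

```python
import io
import json
import struct

# Hypothetical schema: field names plus a struct format string
# ("<id" = little-endian int32 followed by float64).
schema = {"fields": ["id", "temperature"], "format": "<id"}
rows = [(1, 20.5), (2, 21.0), (3, 19.75)]


def write_container(rows, schema):
    """Write the schema once as a header, then untagged fixed-size rows."""
    buf = io.BytesIO()
    header = json.dumps(schema).encode("utf-8")
    buf.write(struct.pack("<I", len(header)))  # 4-byte header length
    buf.write(header)
    for row in rows:
        buf.write(struct.pack(schema["format"], *row))
    return buf.getvalue()


def read_container(data):
    """Recover the schema from the header and use it to decode every row."""
    (header_len,) = struct.unpack_from("<I", data, 0)
    stored = json.loads(data[4:4 + header_len].decode("utf-8"))
    fmt = stored["format"]
    row_size = struct.calcsize(fmt)
    offset = 4 + header_len
    decoded = []
    while offset < len(data):
        values = struct.unpack_from(fmt, data, offset)
        decoded.append(dict(zip(stored["fields"], values)))
        offset += row_size
    return decoded


blob = write_container(rows, schema)
decoded = read_container(blob)
print(decoded)
```

The cost of the header is amortized over many rows, which is why this style
pays off for long runs of identical records and much less so for a store of
individually written, differently shaped records.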

-+ Tatu +-
