avro-user mailing list archives

From Daniel Schierbeck <daniel.schierb...@gmail.com>
Subject Re: Using Avro for encoding messages
Date Thu, 09 Jul 2015 13:21:01 GMT
The Confluent tools seem to be very oriented towards a Java-heavy
infrastructure, and I'd rather not have to re-implement all their somewhat
complex tooling in Ruby and Go. I'd much prefer a simplified model that can
more easily be implemented.
As an aside, Confluent *could* support such a standard by using a custom
"fingerprint type" that's just their id number.
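
A minimal sketch of what such a framing could look like, assuming a one-byte fingerprint type followed by a fixed-size fingerprint and then the Avro binary body. The tag values, the function names, and the choice of MD5 as the second type are hypothetical, chosen only for illustration; nothing here is an agreed standard. Integers are written big-endian (network byte order), which is one way to settle the endianness question raised in the quoted message.

```python
import struct

# Hypothetical fingerprint-type tags; the values are illustrative only.
FP_REGISTRY_ID = 0  # 32-bit schema registry id
FP_MD5 = 1          # 128-bit MD5 fingerprint of the schema

# Size in bytes of the fingerprint that follows each type tag.
_FP_SIZES = {FP_REGISTRY_ID: 4, FP_MD5: 16}


def registry_id_fingerprint(schema_id: int) -> bytes:
    """Encode a registry id as a 4-byte big-endian unsigned integer."""
    return struct.pack(">I", schema_id)


def frame(fp_type: int, fingerprint: bytes, avro_body: bytes) -> bytes:
    """Build <type byte><fingerprint><avro binary>."""
    if fp_type not in _FP_SIZES:
        raise ValueError("unknown fingerprint type")
    if len(fingerprint) != _FP_SIZES[fp_type]:
        raise ValueError("fingerprint length does not match its type")
    return bytes([fp_type]) + fingerprint + avro_body


def unframe(message: bytes):
    """Split a framed message into (fp_type, fingerprint, avro_body)."""
    if not message:
        raise ValueError("empty message")
    fp_type = message[0]
    size = _FP_SIZES.get(fp_type)
    if size is None:
        raise ValueError("unknown fingerprint type")
    if len(message) < 1 + size:
        raise ValueError("truncated message")
    return fp_type, message[1:1 + size], message[1 + size:]
```

A reader only has to inspect the first byte to know how many bytes of fingerprint follow, so a Ruby or Go implementation needs nothing beyond basic byte slicing.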

On Thu, Jul 9, 2015 at 2:21 PM Svante Karlsson <svante.karlsson@csi.se> wrote:

> >> What causes the schema normalization to be incomplete?
> A bad implementation: I use the C++ Avro library, which is incomplete and
> not very actively maintained.
> >> And is that a problem? As long as the reader can get the schema, it
> >> shouldn't matter that there are duplicates – as long as the differences
> >> between the duplicates do not affect decoding.
> Not really a problem; we tend to use machine-generated schemas, and they
> are always identical.
> If I remember correctly, there are holes in the simplification of types.
> Namespaces should be collapsed,
> {"type" : "string"} -> "string" etc
> The current implementation can't reliably decide whether two types are
> identical. If you correct the problem later, a registered schema would
> actually change its hash, since it can now be simplified. Whether this is
> a problem depends on your application.
> We currently encode this as you suggest: <schema_type (byte)><schema_id
> (32/128bits)><avro (binary)>
> The binary fields should probably also have a defined endianness.
> I agree that a de facto way of encoding this would be nice. Currently I
> would say that the Confluent / LinkedIn way is the norm.
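
To illustrate the simplification issue discussed in the quoted message, here is a partial sketch of the two rules it names: collapsing `{"type": "string"}` to `"string"` and folding namespaces into full names. It loosely follows the "Parsing Canonical Form" steps in the Avro specification but is deliberately incomplete (for example, it does not normalize string escapes or integer literals), and the function names `simplify` and `fingerprint` are hypothetical. It assumes the `type` attribute of a schema object is a string in the common cases.

```python
import hashlib
import json

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
# Attributes kept, in output order; everything else (doc, aliases, ...) is dropped.
KEPT = ["name", "type", "fields", "symbols", "items", "values", "size"]


def simplify(schema, namespace=None):
    """Return a simplified copy of a parsed (JSON-decoded) Avro schema."""
    if isinstance(schema, str):
        # A primitive name, or a reference to a named type.
        if schema in PRIMITIVES or "." in schema or not namespace:
            return schema
        return namespace + "." + schema
    if isinstance(schema, list):  # union
        return [simplify(branch, namespace) for branch in schema]

    kind = schema["type"]
    if isinstance(kind, str) and kind in PRIMITIVES:
        return kind  # {"type": "string"} -> "string"

    out = dict(schema)
    if kind in ("record", "enum", "fixed"):
        name = schema["name"]
        if "." in name:
            # A dotted name already carries its namespace.
            namespace = name.rsplit(".", 1)[0]
        else:
            namespace = schema.get("namespace", namespace)
            if namespace:
                name = namespace + "." + name
        out["name"] = name
    if kind == "record":
        out["fields"] = [
            {"name": f["name"], "type": simplify(f["type"], namespace)}
            for f in schema["fields"]
        ]
    elif kind == "array":
        out["items"] = simplify(schema["items"], namespace)
    elif kind == "map":
        out["values"] = simplify(schema["values"], namespace)
    return {key: out[key] for key in KEPT if key in out}


def fingerprint(schema) -> str:
    """MD5 hex digest of the compact JSON form of the simplified schema."""
    canonical = json.dumps(simplify(schema), separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```

Two schemas that differ only in how they spell a primitive type or a namespace then hash to the same value. Conversely, if an implementation starts applying a rule it previously skipped, the fingerprint of an already registered schema can change, which is the concern raised in the quoted message.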
