avro-user mailing list archives

From Peter Amstutz <peter.amst...@curoverse.com>
Subject Re: handling fields with "any" structure
Date Fri, 28 Aug 2015 14:15:57 GMT
Thanks!  I'm not familiar with logical types (sounds like this is a
new, unreleased feature?) so I'll have to look into it.

- Peter

On Fri, Aug 28, 2015 at 10:09 AM, Farkas, Zoltan
<Zoltan.Farkas@pimco.com> wrote:
> Hi Peter,
> I have recently implemented this with the logicalType concept introduced recently in
> (I have my own fork (https://github.com/zolyfarkas/avro) that I use until I find some
> time to merge Ryan's implementation, but I rely on other improvements in it, such as IDL
> forward declarations, improved JSON encoding...)
> Here is how I implemented the any type:
>     /** an unknown serialized java object */
>     @logicalType("unknown")
>     record Unknown {
>         /** maven schema ID (optional for future extension, with different ID types) */
>         union {null, MavenSchemaId} mavenSchemaId = null;
>         /** the avro serialized object */
>         union {null, string, bytes} serObj;
>     }
> The maven schema ID contains enough info to retrieve the schema with which the object
> (the serObj field) is serialized.
> In my case I store all schemas in a maven repo, and my MavenSchemaId looks like:
>     /** A maven artifact ID */
>     record MavenArtifactId {
>         /** The maven group id */
>         string groupId;
>         /** The maven artifactId */
>         string artifactId;
>         /** The schema version */
>         string version;
>     }
>     /** A maven schema ID */
>     record MavenSchemaId {
>         /** The maven artifact */
>         MavenArtifactId artifactId;
>         /** The record name (namespace + name) */
>         string recordName;
>     }
> But a schemaID can really be anything (a number, a string...), as long as you have a
> system/service to resolve it. You can even put the schema in the Unknown record if that works
> for you...
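The envelope idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the fork: `SCHEMA_REGISTRY`, `register_schema`, `resolve_schema`, `wrap` and `unwrap` are hypothetical names, the registry is an in-memory dict standing in for the Maven repository, and JSON stands in for Avro encoding of the payload.

```python
import json
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class MavenArtifactId:
    group_id: str
    artifact_id: str
    version: str


@dataclass(frozen=True)
class MavenSchemaId:
    artifact_id: MavenArtifactId
    record_name: str  # namespace + name


@dataclass
class Unknown:
    """Envelope mirroring the Unknown record: a payload plus an optional schema ID."""
    ser_obj: Union[None, str, bytes]
    maven_schema_id: Optional[MavenSchemaId] = None


# Hypothetical in-memory stand-in for the repository/service that resolves IDs.
SCHEMA_REGISTRY = {}


def register_schema(schema_id, schema):
    SCHEMA_REGISTRY[schema_id] = schema


def resolve_schema(schema_id):
    try:
        return SCHEMA_REGISTRY[schema_id]
    except KeyError:
        raise LookupError(f"no schema registered for {schema_id.record_name}") from None


def wrap(value, schema_id):
    # Assumption: JSON text is used here in place of Avro binary/JSON encoding.
    return Unknown(ser_obj=json.dumps(value), maven_schema_id=schema_id)


def unwrap(envelope):
    """Return (schema, decoded value); schema is None when no ID is attached."""
    if envelope.ser_obj is None:
        return None, None
    schema = None
    if envelope.maven_schema_id is not None:
        schema = resolve_schema(envelope.maven_schema_id)
    return schema, json.loads(envelope.ser_obj)
```

Because the schema ID is an opaque key, swapping the Maven coordinates for a number or a string only changes the key type of the registry; the wrap/unwrap logic stays the same.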
> So every time I need an "Any" (Unknown) field I use it like:
> import idl "common.avdl";
> record MyRecord {
> ...
> Unknown any;
> ...
> }
> The generated DTOs set and get an Object (just like unions); when you deserialize you
> will get either a SpecificRecord (if you have a generated DTO...) or a GenericRecord...
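The "specific if available, otherwise generic" behaviour on deserialization might look roughly like the following. It is a sketch under assumptions: `Point`, `GENERATED_CLASSES` and `decode` are hypothetical, a dataclass stands in for a generated DTO, a plain dict stands in for a GenericRecord, and JSON again stands in for Avro decoding.

```python
import json
from dataclasses import dataclass


@dataclass
class Point:
    """Hypothetical generated DTO for the record org.example.Point."""
    x: int
    y: int


# Hypothetical lookup from full record name to a generated class.
GENERATED_CLASSES = {"org.example.Point": Point}


def decode(record_name, payload):
    """Decode payload into a generated class if one is known, else a generic dict."""
    fields = json.loads(payload)
    cls = GENERATED_CLASSES.get(record_name)
    if cls is None:
        return fields  # generic fallback, analogous to a GenericRecord
    return cls(**fields)  # analogous to a SpecificRecord
```

Callers that receive the result as a plain object then need to check its type before using it, just as with a union-typed field.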
> Let me know if you have any questions...
> (I would be interested to know if you encounter any issues implementing this with the official
> Avro logical type implementation...)
> cheers
> --Z
> -----Original Message-----
> From: Peter Amstutz [mailto:peter.amstutz@curoverse.com]
> Sent: Friday, August 28, 2015 6:26 AM
> To: user@avro.apache.org
> Subject: handling fields with "any" structure
> Hello everyone,
> I am using Avro to load and validate JSON documents.  Mostly this works very well and
> it is straightforward to express the structure of my document using Avro schema. However,
> I have a few fields which can have "any" content.  It is impossible to declare all possible
> structures in advance, and I can't use a union type of primitives because the fields may also
> contain complex types (nested lists/maps) and Avro doesn't allow named unions.
> So far as I have been able to determine, this is impossible with standard Avro schema,
> so I am curious if anyone else has dealt with this problem and can suggest any workarounds.
> Currently my best (least bad) idea is to preprocess the JSON to pull out the "any"
> fields and store them on the side before handing the document to Avro for loading.  This
> is awkward so I would love to hear if anyone has any other ideas.
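The preprocessing workaround mentioned above could be sketched as follows. The function names and the dot-separated path convention are assumptions made for illustration; the stripped document is what would then be handed to Avro for validation, and the side store is merged back in afterwards.

```python
import copy


def split_any_fields(document, any_paths):
    """Return (stripped_document, side_store) without modifying the input.

    any_paths uses a hypothetical dot-separated convention, e.g. "meta.extra".
    Paths that do not exist in the document are skipped.
    """
    stripped = copy.deepcopy(document)
    side = {}
    for path in any_paths:
        keys = path.split(".")
        parent = stripped
        for key in keys[:-1]:
            parent = parent.get(key) if isinstance(parent, dict) else None
            if parent is None:
                break
        else:
            # Reached the parent of the target key without a missing step.
            if isinstance(parent, dict) and keys[-1] in parent:
                side[path] = parent.pop(keys[-1])
    return stripped, side


def restore_any_fields(stripped, side):
    """Put the free-form values back at their original paths."""
    restored = copy.deepcopy(stripped)
    for path, value in side.items():
        keys = path.split(".")
        parent = restored
        for key in keys[:-1]:
            parent = parent.setdefault(key, {})
        parent[keys[-1]] = value
    return restored
```

One consequence of this approach is that the schema has to omit the "any" fields (or mark them optional), since they are absent from the document Avro sees.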
> Thanks,
> Peter
