avro-user mailing list archives

From Echo Li <echo...@gmail.com>
Subject Re: How to deserialize avro file with union/many schemas?
Date Thu, 24 Jul 2014 02:50:45 GMT
Thanks, Sachin.

My schema is more like:
[ {schema-one with type="record"}, {schema-two with type="record"}, ... ]

It is followed by datums, each pertaining to one of the schemas, and each
schema maps to one class.
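
For illustration, a top-level union of record schemas like the one described above could be parsed and inspected roughly as follows. This is only a sketch: the record names SchemaOne and SchemaTwo, their fields, and the class name UnionSchemaSketch are made up, since the actual schemas are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;

public class UnionSchemaSketch {

    // Hypothetical union of two record schemas (names and fields are assumptions).
    static final String UNION_JSON =
        "["
        + "{\"type\":\"record\",\"name\":\"SchemaOne\","
        + "\"fields\":[{\"name\":\"id\",\"type\":\"long\"}]},"
        + "{\"type\":\"record\",\"name\":\"SchemaTwo\","
        + "\"fields\":[{\"name\":\"label\",\"type\":\"string\"}]}"
        + "]";

    /** Parses a schema and returns the full name of each union branch, in order. */
    static List<String> branchNames(String schemaJson) {
        Schema master = new Schema.Parser().parse(schemaJson);
        if (master.getType() != Schema.Type.UNION) {
            throw new IllegalArgumentException("Expected a union, got " + master.getType());
        }
        List<String> names = new ArrayList<String>();
        for (Schema branch : master.getTypes()) {
            names.add(branch.getFullName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Lists the record types that datums in such a file may conform to.
        for (String name : branchNames(UNION_JSON)) {
            System.out.println(name);
        }
    }
}
```

A JSON array at the top level is how the Avro specification writes a union, so the parser returns a schema of Type.UNION whose getTypes() lists one entry per record.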




On Wed, Jul 23, 2014 at 3:42 PM, Sachin Goyal <sgoyal@walmartlabs.com>
wrote:

>
> To see a union schema, do the following:
> System.out.println(ReflectData.AllowNull.get().getSchema(YourClass.class));
>
> And then do the following:
> System.out.println(ReflectData.get().getSchema(YourClass.class));
>
> Diff the two outputs.
> The first one turns every field into a UNION with null.
>
> Hope that helps.
> Sachin
>
>
> From: Echo Li <echolql@gmail.com>
> Reply-To: "user@avro.apache.org" <user@avro.apache.org>
> Date: Wednesday, July 23, 2014 at 3:09 PM
> To: "user@avro.apache.org" <user@avro.apache.org>
> Subject: Re: How to deserialize avro file with union/many schemas?
>
> Hi Mike,
>
> I read through most of the docs on the Avro site but don't see anything about
> a "union schema". Mike, would you mind giving me an example of how the
> union schema is defined? Also, which package/method can retrieve the master
> schema from an Avro file? Should "getSchema()" work? And how do I read
> each Avro datum without knowing its corresponding schema?
>
> I would very much appreciate your help!
>
>
> On Tue, Jul 22, 2014 at 10:25 PM, Michael Pigott
> <mpigott.subscriptions@gmail.com> wrote:
>
> It's just a regular Union :-)
> http://avro.apache.org/docs/1.7.6/spec.html#Unions
>
> Regards,
> Mike
>
> On Jul 23, 2014 1:22 AM, "Echo" <echolql@gmail.com> wrote:
> Thanks Mike, that makes sense. Is there any doc I can read about union
> schemas?
>
> On Jul 22, 2014, at 2:32 PM, Michael Pigott
> <mpigott.subscriptions@gmail.com> wrote:
>
> Echo,
>     Just to make sure I understand you correctly - do you have a file with
> multiple Avro datums in it, each one following a separate schema?  And are
> all of these schemas unioned together in a file-level "master schema?"  (As
> far as I know, Avro file readers and writers only support one schema per
> file, so this is the only way your question makes sense to me.)
>     If that's the case, then you can get the file's "master schema" and
> determine what all of the different types are:
>
> // Assumes masterSchema is of Type.UNION
> List<Schema> allTypes = masterSchema.getTypes();
>
> Then when you read each Avro datum in the file, you can check which of the
> schemas it conforms to, and write a new file with just that sub-schema and
> the one datum in it.
>
> Does that make sense?
> Mike
>
>
> On Tue, Jul 22, 2014 at 3:22 PM, Lewis John Mcgibbney
> <lewis.mcgibbney@gmail.com> wrote:
> For the benefit of others on this list, can you please provide an example
> of your schema?
> Thanks
> Lewis
>
>
> On Tue, Jul 22, 2014 at 12:06 PM, Echo Li <echolql@gmail.com> wrote:
> Hello,
>
> I'm new here and hope I can get help from you guys. Basically I have an Avro
> file with union/many schemas and mixed records. I need to split it into
> many Avro files, one schema per file. Everything I've been reading is
> about serializing and deserializing an Avro file with one schema, which is
> pretty straightforward, but in my case I have no clue. Any ideas?
>
>
>
> --
> Lewis
>
>
>
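
The approach Mike describes in the thread (read each datum, find which branch of the union it conforms to, and write it to a file that uses only that sub-schema) might be sketched with the generic API as below. This is an assumption-laden sketch, not a confirmed solution from the thread: the class name SplitByBranch, the method split, and the output naming scheme (one file per record full name) are invented here, and it assumes every branch of the union is a record, as Echo describes.

```java
import java.io.File;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class SplitByBranch {

    /**
     * Copies each datum of `input` into outDir/<record full name>.avro and
     * returns how many datums were written per record name.
     */
    public static Map<String, Integer> split(File input, File outDir) throws IOException {
        Map<String, DataFileWriter<GenericRecord>> writers =
            new LinkedHashMap<String, DataFileWriter<GenericRecord>>();
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();

        // The reader picks up the writer's (union) schema from the file header.
        DataFileReader<GenericRecord> reader =
            new DataFileReader<GenericRecord>(input, new GenericDatumReader<GenericRecord>());
        try {
            for (GenericRecord datum : reader) {
                // A decoded record carries the schema of the union branch it was read with.
                Schema branch = datum.getSchema();
                String name = branch.getFullName();

                DataFileWriter<GenericRecord> writer = writers.get(name);
                if (writer == null) {
                    // First datum of this type: open a file that uses only this sub-schema.
                    writer = new DataFileWriter<GenericRecord>(
                        new GenericDatumWriter<GenericRecord>(branch));
                    writer.create(branch, new File(outDir, name + ".avro"));
                    writers.put(name, writer);
                }
                writer.append(datum);

                Integer seen = counts.get(name);
                counts.put(name, seen == null ? 1 : seen + 1);
            }
        } finally {
            reader.close();
            for (DataFileWriter<GenericRecord> w : writers.values()) {
                w.close();
            }
        }
        return counts;
    }
}
```

Keeping one open writer per record type means the input is read only once. If each schema maps to a generated class, a specific reader and writer could presumably replace the generic ones per output file, but that is not shown here.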
