avro-user mailing list archives

From Michael Pigott <mpigott.subscripti...@gmail.com>
Subject Re: How to deserialize avro file with union/many schemas?
Date Wed, 23 Jul 2014 05:25:46 GMT
It's just a regular Union :-)
http://avro.apache.org/docs/1.7.6/spec.html#Unions

Regards,
Mike
On Jul 23, 2014 1:22 AM, "Echo" <echolql@gmail.com> wrote:

> Thanks Mike, that makes sense. Is there any documentation I can read about
> union schemas?
>
> On Jul 22, 2014, at 2:32 PM, Michael Pigott <
> mpigott.subscriptions@gmail.com> wrote:
>
> Echo,
>     Just to make sure I understand you correctly - do you have a file with
> multiple Avro datums in it, each one following a separate schema?  And are
> all of these schemas unioned together in a file-level "master schema?"  (As
> far as I know, Avro file readers and writers only support one schema per
> file, so this is the only way your question makes sense to me.)
>     If that's the case, then you can get the file's "master schema" and
> determine what all of the different types are:
>
> // Assumes masterSchema is of Type.UNION
> List<Schema> allTypes = masterSchema.getTypes();
>
> Then when you read each Avro datum in the file, you can check which of the
> schemas it conforms to, and write a new file with just that sub-schema and
> the one datum in it.
>
> Does that make sense?
> Mike
>
>
> On Tue, Jul 22, 2014 at 3:22 PM, Lewis John Mcgibbney <
> lewis.mcgibbney@gmail.com> wrote:
>
>> For the benefit of others on this list, can you please provide an
>> example of your schema?
>> Thanks
>> Lewis
>>
>>
>> On Tue, Jul 22, 2014 at 12:06 PM, Echo Li <echolql@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I'm new here and hope I can get some help from you. Basically, I have an
>>> Avro file with a union of many schemas and mixed records. I need to split
>>> it into many Avro files, one schema per file. Everything I've been reading
>>> is about serializing and deserializing Avro files with a single schema,
>>> which is pretty straightforward, but in my case I have no clue. Any ideas?
>>>
>>
>>
>>
>> --
>> *Lewis*
>>
>
>
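
The approach Mike describes in the quoted message (read the file's union "master schema", then route each datum to a file that carries only its own sub-schema) might look roughly like the sketch below. It is not code from the thread: the class name UnionSplitter, the split method, and the "<full name>.avro" output naming are hypothetical, and it assumes every branch of interest is a record, that the output directory already exists, and that Java 8 or later is available. This variant writes one output file per record type rather than one file per datum.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class UnionSplitter {

    /**
     * Splits an Avro data file whose top-level schema is a union into one
     * file per record branch. Returns the number of records written per
     * schema full name. Hypothetical helper, not part of the Avro API.
     */
    public static Map<String, Integer> split(File input, File outputDir) throws IOException {
        Map<String, DataFileWriter<GenericRecord>> writers = new HashMap<>();
        Map<String, Integer> counts = new HashMap<>();
        try (DataFileReader<Object> reader =
                 new DataFileReader<Object>(input, new GenericDatumReader<Object>())) {
            Schema masterSchema = reader.getSchema();
            if (masterSchema.getType() != Schema.Type.UNION) {
                throw new IllegalArgumentException(
                    "Expected a union schema, got " + masterSchema.getType());
            }
            // Open one writer per record branch of the union.
            for (Schema branch : masterSchema.getTypes()) {
                if (branch.getType() != Schema.Type.RECORD) {
                    continue; // assumption: only record branches are split out
                }
                DataFileWriter<GenericRecord> writer =
                    new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(branch));
                writer.create(branch, new File(outputDir, branch.getFullName() + ".avro"));
                writers.put(branch.getFullName(), writer);
            }
            // Each decoded record carries the schema of the branch it was read with.
            for (Object datum : reader) {
                if (!(datum instanceof GenericRecord)) {
                    continue; // skip non-record datums such as null
                }
                GenericRecord record = (GenericRecord) datum;
                String name = record.getSchema().getFullName();
                writers.get(name).append(record);
                counts.merge(name, 1, Integer::sum);
            }
        } finally {
            for (DataFileWriter<GenericRecord> writer : writers.values()) {
                writer.close();
            }
        }
        return counts;
    }
}
```

Calling UnionSplitter.split(new File("mixed.avro"), new File("out")) would then leave one file per record type in "out"; both paths here are placeholders.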
