avro-user mailing list archives

From Andrew Kenworthy <adwkenwor...@yahoo.com>
Subject Re: How does Avro mark (string) field delimition?
Date Mon, 23 Jan 2012 15:35:22 GMT
I don't have a specific use case that is problematic, but was trying to understand how it
all works internally. Following your comment about indexes, I looked in GenericDatumWriter
and, sure enough, the union is "tagged" so we know which branch of the union was written:

case UNION:
        int index = data.resolveUnion(schema, datum);
        write(schema.getTypes().get(index), datum, out);

That's the bit I was missing! Thanks for the input.
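
To make the tagging concrete, here is a minimal, self-contained sketch of the idea. It is not Avro's actual code: the class UnionTagSketch and its methods are hypothetical names, and it assumes every field has the schema ["null", "string"]. It follows the binary encoding as I understand it from the Avro specification: a union is written as a zigzag varint branch index followed by the value, a null value contributes zero bytes, and a string is a length followed by UTF-8 bytes. In the real writer, the tag appears to be emitted through the encoder's writeIndex call.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the Avro library.
class UnionTagSketch {

    // Zigzag + variable-length encoding, as the Avro spec describes for int/long.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Inverse of writeLong; pos[0] is the read cursor and is advanced in place.
    static long readLong(byte[] buf, int[] pos) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos[0]++] & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    // One field of schema ["null", "string"]: branch index first, then the value.
    static void writeNullableString(ByteArrayOutputStream out, String s) {
        if (s == null) {
            writeLong(out, 0); // branch 0 = null; the null itself adds no bytes
        } else {
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            writeLong(out, 1); // branch 1 = string
            writeLong(out, utf8.length);
            out.write(utf8, 0, utf8.length);
        }
    }

    static String readNullableString(byte[] buf, int[] pos) {
        long branch = readLong(buf, pos);
        if (branch == 0) {
            return null; // "reading" null consumes nothing beyond the tag
        }
        int len = (int) readLong(buf, pos);
        String s = new String(buf, pos[0], len, StandardCharsets.UTF_8);
        pos[0] += len;
        return s;
    }

    static byte[] encode(String... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String f : fields) {
            writeNullableString(out, f);
        }
        return out.toByteArray();
    }

    static List<String> decode(byte[] buf, int fieldCount) {
        int[] pos = {0};
        List<String> fields = new ArrayList<>();
        for (int i = 0; i < fieldCount; i++) {
            fields.add(readNullableString(buf, pos));
        }
        return fields;
    }

    public static void main(String[] args) {
        byte[] bytes = encode(null, null, "hi");
        System.out.println(Arrays.toString(bytes)); // [0, 0, 2, 4, 104, 105]
        System.out.println(decode(bytes, 3));       // [null, null, hi]
    }
}
```

Each of the two adjacent nulls still occupies one byte on the wire (its branch index), which is why the reader never loses track of field boundaries even though writeNull itself writes nothing.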


>From: Harsh J <harsh@cloudera.com>
>To: user@avro.apache.org; Andrew Kenworthy <adwkenworthy@yahoo.com> 
>Sent: Monday, January 23, 2012 4:04 PM
>Subject: Re: How does Avro mark (string) field delimition?
>The read part is empty as well, when the decoder is asked to read a
>'null' type. For null-carrying unions, I believe an index is written
>out, so if the index evaluates to null, the same logic works yet again.
>Does not matter if there are two nulls adjacent to one another,
>therefore. How do you imagine this ends up being a problem? What
>trouble are you running into?
>On Mon, Jan 23, 2012 at 8:08 PM, Andrew Kenworthy
><adwkenworthy@yahoo.com> wrote:
>> I have looked at the Avro 1.6.0 code and am not sure how Avro distinguishes
>> between field boundaries when reading null values.
>> The BinaryEncoder class (which is where I land when debugging my code) has
>> an empty method for writeNull: how does the parser then distinguish between
>> adjacent nullable fields when reading that data?
>> Thanks in advance,
>> Andrew
>Harsh J
>Customer Ops. Engineer, Cloudera