avro-dev mailing list archives

From Scott Carey <scottca...@apache.org>
Subject Re: Question related to org.apache.avro.io.DirectBinaryDecoder
Date Mon, 25 Jan 2016 20:08:44 GMT
Looking at the ticket (sorry, can't update in JIRA at the moment):

The value 101 in the raw data is the integer -51 (Avro ints are zigzag
encoded, so an odd raw value decodes to a negative number).
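To make the 101 -> -51 mapping concrete, here is a minimal sketch of the variable-length zigzag decoding that the Avro specification describes for ints. The class and method names (ZigZagDemo, decodeInt) are made up for illustration; this is not the code in DirectBinaryDecoder, just the same arithmetic.

```java
final class ZigZagDemo {
    /**
     * Decodes one variable-length, zigzag-encoded 32-bit int from the start
     * of the given bytes. Hypothetical helper, not an Avro API.
     */
    static int decodeInt(byte[] bytes) {
        int raw = 0;
        int pos = 0;
        for (int shift = 0; shift <= 28; shift += 7) {
            if (pos >= bytes.length) {
                throw new IllegalStateException("unexpected end of data");
            }
            int b = bytes[pos++] & 0xFF;
            raw |= (b & 0x7F) << shift;       // low 7 bits carry the payload
            if ((b & 0x80) == 0) {            // high bit clear: last byte
                return (raw >>> 1) ^ -(raw & 1); // undo the zigzag mapping
            }
        }
        throw new IllegalStateException("int encoding longer than 5 bytes");
    }

    public static void main(String[] args) {
        // A single byte with value 101 decodes to -51.
        System.out.println(decodeInt(new byte[] {101}));
    }
}
```

So a decoder returning -51 for the byte 101 is doing what the encoding says; the question is why a negative number sits where a length or index is expected.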

Therefore the cause is either:

* Corrupt data
* Improper schema used to read (was written with a different schema than
the reader is configured to use for its 'writer' schema)

You might want to 'look around' the bytes near that 101 and see if the
data looks like it is from a different schema than expected.

Given that some other tools can read it, it is likely the latter case:
the other tools are reading it with a different schema.

Note, a reader requires _two_ schemas:  The schema that the reader wants
to interpret the data as, and the schema that was used when the data was
written.  If the latter is wrong, this sort of thing can happen as Avro
tries to read a variable length item from data that is in the wrong
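
As a rough illustration of that failure mode, the sketch below writes a made-up record (an int followed by a string) and then reads the first field as if it were a string. The record layout and all names (SchemaMismatchDemo, writeRecord, lengthSeenByMismatchedReader) are assumptions for the example, not anything taken from AVRO-1786, and it uses plain Java rather than the Avro classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class SchemaMismatchDemo {
    /** Writes an int as a zigzag-encoded varint. */
    static void writeInt(ByteArrayOutputStream out, int n) {
        int z = (n << 1) ^ (n >> 31);         // zigzag: small magnitudes stay small
        while ((z & ~0x7F) != 0) {
            out.write((z & 0x7F) | 0x80);     // more bytes follow
            z >>>= 7;
        }
        out.write(z);
    }

    /** Reads a zigzag-encoded varint. */
    static int readInt(ByteArrayInputStream in) {
        int raw = 0;
        for (int shift = 0; shift <= 28; shift += 7) {
            int b = in.read();
            if (b < 0) throw new IllegalStateException("unexpected end of data");
            raw |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return (raw >>> 1) ^ -(raw & 1);
        }
        throw new IllegalStateException("int encoding longer than 5 bytes");
    }

    /** Hypothetical writer layout: an int field, then a length-prefixed string. */
    static byte[] writeRecord(int id, String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeInt(out, id);
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        writeInt(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }

    /** A reader that wrongly assumes the record starts with a string. */
    static int lengthSeenByMismatchedReader(byte[] data) {
        return readInt(new ByteArrayInputStream(data)); // the int field, misread as a length
    }

    public static void main(String[] args) {
        byte[] data = writeRecord(-51, "hi");
        // The first byte is 101, and the mismatched reader sees a "length" of -51.
        System.out.println(data[0] + " -> " + lengthSeenByMismatchedReader(data));
    }
}
```

Any code that then uses that "length" to index into a buffer would fail in much the way the ticket describes, even though both the data and the decoder are fine.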

You could also see if BinaryDecoder behaves any differently from
DirectBinaryDecoder.  The issue is most likely above those, in the code
that uses these (the resolver and/or DatumReader).


On 1/23/16, 10:59 AM, "Yong Zhang" <java8964@hotmail.com> wrote:

>Hi, Avro Developers:
>Is anyone familiar the code logic related to
>I am asking this question in relation to AVRO-1786, where I believe I am
>facing a bug related to this class.
>A valid Avro record is sent from the Mapper to the Reducer, but the
>Reducer cannot read it due to an IndexOutOfBoundException, because the
>readInt() method of this class returns "-51".
>I can even dump the local variables of the method in this exception case,
>and have described them in the comments area of the Jira ticket.
>I don't understand the internal logic of this class, or how the
>readInt() method is implemented. But an input stream reading 101 bytes
>and causing this method to return a negative number, which then causes an
>IndexOutofBoundException in the following method, looks like a bug to me.
>Can anyone who understands this class's logic confirm whether this is a
>bug or not? If it is a bug, what is the best way to fix it?
>I can consistently reproduce this bug on our production cluster, which
>means I can verify whether any code fix works for this case.
>Let me know if you have any questions related to this JIRA.
