avro-user mailing list archives

From Felix Xu <ygnhz...@gmail.com>
Subject Re: A strange problem when I am trying to read avro record with a subset of the schema.
Date Fri, 01 Apr 2011 07:03:07 GMT
Yes...I submitted that.

2011/4/1 Scott Carey <scott@richrelevance.com>

> FYI for those not on the avro-dev mailing list, there is a related JIRA
> now:
> https://issues.apache.org/jira/browse/AVRO-793
>
> On 3/30/11 7:45 PM, "Felix Xu" <ygnhzeus@gmail.com> wrote:
>
> Okay, is there a JIRA issue related to this problem?
> My avro version is 1.5.0.
>
> 2011/3/31 Scott Carey <scott@richrelevance.com>
>
>> There was a bug at some point in schema resolution where dropping the last
>> field of a record caused a problem.  It's possible that either:
>>
>> You are using a version where this isn't fixed.
>> Or
>> The fix did not work for array types
>>
>> On 3/30/11 7:17 PM, "Felix Xu" <ygnhzeus@gmail.com> wrote:
>>
>> Wow, it's amazing.
>> I did #2 and it worked.
>> What's the problem? How can I fix it?
>>
>> 2011/3/31 Scott Carey <scott@richrelevance.com>
>>
>>> 1: What version of Avro is this?
>>> 2: If you change the schema you write with by reversing the order
>>> of the fields of "sdf" (array, then string), are the results the same?
>>>
>>> This looks like a bug. Please file a JIRA ticket, and if you have a
>>> reproducible test case or code snippet, attach it to the ticket.
>>>
>>> Thanks!
>>>
>>> -Scott
>>>
>>> On 3/30/11 8:49 AM, "Felix Xu" <ygnhzeus@gmail.com> wrote:
>>>
>>> Hi, all. I ran into a strange problem while trying to read an Avro file
>>> with a subset of its schema (I do not need all the details).
>>> 1. I write data using this schema:
>>> {
>>>     "name": "relation",
>>>     "type": "record",
>>>     "fields": [
>>>         {
>>>             "name": "timestamp",
>>>             "type": "long"
>>>         },
>>>         {
>>>             "name": "type",
>>>             "type": {
>>>                 "type": "map",
>>>                 "values": {
>>>                     "type": "array",
>>>                     "items": {
>>>                         "type": "record",
>>>                         "name": "sdf",
>>>                         "fields": [
>>>                             {
>>>                                 "name": "device",
>>>                                 "type": "string"
>>>                             },
>>>                             {
>>>                                 "name": "children",
>>>                                 "type": {
>>>                                     "type": "array",
>>>                                     "items": "string"
>>>                                 }
>>>                             }
>>>                         ]
>>>                     }
>>>                 }
>>>             }
>>>         }
>>>     ]
>>> }
>>>
>>> 2. Here is a JSON object that conforms to that schema:
>>> {
>>>     "timestamp": 1234567890,
>>>     "type": {
>>>         "WMA": [
>>>             {
>>>                 "device": "WMA1",
>>>                 "children": ["WMB1", "WMB2"]
>>>             },
>>>             {
>>>                 "device": "WMA2",
>>>                 "children": ["WMB1", "WMB2"]
>>>             }
>>>         ]
>>>     }
>>> }
>>>
>>> 3. I write that record successfully, and reading works if I use this
>>> schema:
>>> {
>>>     "name": "relation",
>>>     "type": "record",
>>>     "fields": [
>>>         {
>>>             "name": "timestamp",
>>>             "type": "long"
>>>         },
>>>         {
>>>             "name": "type",
>>>             "type": {
>>>                 "type": "map",
>>>                 "values": {
>>>                     "type": "array",
>>>                     "items": {
>>>                         "type": "record",
>>>                         "name": "sdf",
>>>                         "fields": [
>>>                             {
>>>                                 "name": "children",
>>>                                 "type": {
>>>                                     "type": "array",
>>>                                     "items": "string"
>>>                                 }
>>>                             }
>>>                         ]
>>>                     }
>>>                 }
>>>             }
>>>         }
>>>     ]
>>> }
>>>
>>> The result is:
>>> {
>>>     "timestamp": 1234567890,
>>>     "type": {
>>>         "WMA": [
>>>             {
>>>                 "children": ["WMB1", "WMB2"]
>>>             },
>>>             {
>>>                 "children": ["WMB1", "WMB2"]
>>>             }
>>>         ]
>>>     }
>>> }
>>>
>>> 4. But when I want to ignore the "children" part instead of "device", I
>>> use this schema for reading:
>>> {
>>>     "name": "relation",
>>>     "type": "record",
>>>     "fields": [
>>>         {
>>>             "name": "timestamp",
>>>             "type": "long"
>>>         },
>>>         {
>>>             "name": "type",
>>>             "type": {
>>>                 "type": "map",
>>>                 "values": {
>>>                     "type": "array",
>>>                     "items": {
>>>                         "type": "record",
>>>                         "name": "sdf",
>>>                         "fields": [
>>>                             {
>>>                                 "name": "device",
>>>                                 "type": "string"
>>>                             }
>>>                         ]
>>>                     }
>>>                 }
>>>             }
>>>         }
>>>     ]
>>> }
>>>
>>> Unfortunately, I get an exception:
>>>
>>> java.lang.ArrayIndexOutOfBoundsException: -8
>>> cause:java.lang.ArrayIndexOutOfBoundsException
>>> at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:122)
>>> at org.apache.avro.io.BinaryDecoder.skipString(BinaryDecoder.java:262)
>>> at org.apache.avro.io.ValidatingDecoder.skipString(ValidatingDecoder.java:113)
>>> at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:60)
>>> at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
>>> at org.apache.avro.io.parsing.SkipParser.skipRepeater(SkipParser.java:83)
>>> at org.apache.avro.io.ValidatingDecoder.skipArray(ValidatingDecoder.java:195)
>>> at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:70)
>>> at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
>>> at org.apache.avro.io.parsing.SkipParser.skipSymbol(SkipParser.java:93)
>>> at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:226)
>>> at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>>> at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
>>> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:162)
>>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>>> at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:196)
>>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:140)
>>> at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:233)
>>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
>>> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:167)
>>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>>> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:236)
>>> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:223)
>>> at AvroUtilTest.read(AvroUtilTest.java:77)
>>> at AvroUtilTest.main(AvroUtilTest.java:61)
>>>
>>>
>>
>
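
For readers following the thread, here is a rough, standard-library-only sketch of what a reader has to do when its schema drops a field of "sdf". It is not Avro's decoder and says nothing about the actual cause of the bug. It only hand-encodes the "WMA" items from the example using the Avro binary rules (zig-zag varint longs, length-prefixed strings, arrays as counted blocks ending in a zero count) and shows that a skipped field must be consumed byte for byte before the next value is read. The class name SkipSketch and all of its method names are hypothetical, chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Avro code: covers long, string, and array<string> only.
class SkipSketch {
    // Avro stores int/long as zig-zag encoded, variable-length integers.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // A string is its UTF-8 byte length followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, b.length);
        out.write(b, 0, b.length);
    }

    // An array is a series of counted blocks; a count of 0 ends the array.
    static void writeStringArray(ByteArrayOutputStream out, List<String> items) {
        if (!items.isEmpty()) {
            writeLong(out, items.size());
            for (String s : items) writeString(out, s);
        }
        writeLong(out, 0);
    }

    static long readLong(ByteBuffer in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    static String readString(ByteBuffer in) {
        byte[] b = new byte[(int) readLong(in)];
        in.get(b);
        return new String(b, StandardCharsets.UTF_8);
    }

    static void skipString(ByteBuffer in) {
        int len = (int) readLong(in);
        in.position(in.position() + len);
    }

    // Returns the item count of the next block. A negative count means the
    // count is followed by the block's size in bytes, which this sketch ignores.
    static long nextBlock(ByteBuffer in) {
        long count = readLong(in);
        if (count < 0) {
            readLong(in);
            count = -count;
        }
        return count;
    }

    static void skipStringArray(ByteBuffer in) {
        for (long n = nextBlock(in); n != 0; n = nextBlock(in))
            for (long i = 0; i < n; i++) skipString(in);
    }

    static List<String> readStringArray(ByteBuffer in) {
        List<String> items = new ArrayList<>();
        for (long n = nextBlock(in); n != 0; n = nextBlock(in))
            for (long i = 0; i < n; i++) items.add(readString(in));
        return items;
    }

    // Encodes the "WMA" array from the thread with writer field order device, children.
    static byte[] encodeItems() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, 2);
        for (String device : Arrays.asList("WMA1", "WMA2")) {
            writeString(out, device);
            writeStringArray(out, Arrays.asList("WMB1", "WMB2"));
        }
        writeLong(out, 0);
        return out.toByteArray();
    }

    // Reader that keeps "device" and drops "children" (step 4 in the thread).
    static List<String> readDevices(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        List<String> devices = new ArrayList<>();
        for (long n = nextBlock(in); n != 0; n = nextBlock(in))
            for (long i = 0; i < n; i++) {
                devices.add(readString(in));
                skipStringArray(in); // must consume every byte of "children"
            }
        return devices;
    }

    // Reader that drops "device" and keeps "children" (step 3 in the thread).
    static List<List<String>> readChildren(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        List<List<String>> children = new ArrayList<>();
        for (long n = nextBlock(in); n != 0; n = nextBlock(in))
            for (long i = 0; i < n; i++) {
                skipString(in);
                children.add(readStringArray(in));
            }
        return children;
    }

    public static void main(String[] args) {
        byte[] data = encodeItems();
        System.out.println(readDevices(data));  // [WMA1, WMA2]
        System.out.println(readChildren(data)); // [[WMB1, WMB2], [WMB1, WMB2]]
    }
}
```

If a skip routine consumes too few or too many bytes, the next length prefix is read from the wrong offset. An exception thrown from readInt inside skipString, as in the trace above, would be consistent with that kind of misalignment, but the thread itself does not confirm the root cause; see AVRO-793 for the actual report.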
