avro-dev mailing list archives

From "Thiruvalluvan M. G. (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AVRO-793) A strange problem when I am trying to read avro record with a subset of the schema.
Date Wed, 27 Apr 2011 11:08:03 GMT

     [ https://issues.apache.org/jira/browse/AVRO-793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvalluvan M. G. updated AVRO-793:
-------------------------------------

    Attachment: AVRO-793-test.patch
                AVRO-793.patch

Very subtle bug. If an array that needs to be skipped happens to be the last field of a
record, and that record is contained in an outer array, the inner array does not get
skipped properly.

The test patch has the test that catches the bug and the main patch has the solution.
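
To make "skipping an array" concrete, here is a minimal sketch in plain Java (no Avro
dependency) of how an array of strings is laid out in Avro's binary encoding and how a
reader steps over it. The class and method names (SkipArraySketch, writeSdf, and so on)
are made up for illustration. This is not the code in AVRO-793.patch and it does not
reproduce the parser bug itself; it only shows where a decoder must end up after skipping
"children" so that the next record of the outer array is read correctly.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; not Avro's actual decoder classes. */
final class SkipArraySketch {
    private final byte[] buf;
    private int pos;

    public SkipArraySketch(byte[] buf) { this.buf = buf; }

    public int position() { return pos; }

    /** Reads a zigzag, variable-length long as defined by Avro's binary encoding. */
    public long readLong() {
        long raw = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos++] & 0xff;
            raw |= (long) (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (raw >>> 1) ^ -(raw & 1);
    }

    /** Reads a string: a long byte length followed by UTF-8 bytes. */
    public String readString() {
        int len = (int) readLong();
        String s = new String(buf, pos, len, StandardCharsets.UTF_8);
        pos += len;
        return s;
    }

    /** Skips an array of strings block by block, including the zero-count terminator. */
    public void skipStringArray() {
        for (long count = readLong(); count != 0; count = readLong()) {
            if (count < 0) {
                // A negative count is followed by the block size in bytes.
                pos += (int) readLong();
            } else {
                for (long i = 0; i < count; i++) {
                    pos += (int) readLong(); // skip one string: length, then bytes
                }
            }
        }
    }

    public static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63); // zigzag encode
        while ((z & ~0x7fL) != 0) {
            out.write((int) ((z & 0x7f) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    public static void writeString(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, b.length);
        out.write(b, 0, b.length);
    }

    /** Encodes one "sdf" record: the device string, then the children array. */
    public static void writeSdf(ByteArrayOutputStream out, String device, String... children) {
        writeString(out, device);
        if (children.length > 0) {
            writeLong(out, children.length);
            for (String c : children) writeString(out, c);
        }
        writeLong(out, 0); // end-of-array marker
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeSdf(out, "WMA1", "WMB1", "WMB2");
        writeSdf(out, "WMA2", "WMB1", "WMB2");
        SkipArraySketch in = new SkipArraySketch(out.toByteArray());
        for (int i = 0; i < 2; i++) {
            System.out.println(in.readString()); // prints WMA1, then WMA2
            in.skipStringArray();                // "children" is the last field
        }
        System.out.println(in.position() + " of " + out.size() + " bytes"); // 34 of 34 bytes
    }
}
```

If a skip leaves the decoder anywhere other than just past the zero-count terminator, the
bytes that follow get interpreted as a string length, which would be consistent with a
failure inside skipString like the one in the stack trace quoted below.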

> A strange problem when I am trying to read avro record with a subset of the schema.
> -----------------------------------------------------------------------------------
>
>                 Key: AVRO-793
>                 URL: https://issues.apache.org/jira/browse/AVRO-793
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.5.0
>         Environment: Avro 1.5, Windows XP/Ubuntu 10.0.4
>            Reporter: Yingzhong Xu
>            Assignee: Thiruvalluvan M. G.
>            Priority: Critical
>              Labels: Avro, Reading, Schema, Write
>             Fix For: 1.5.1
>
>         Attachments: AVRO-793-test.patch, AVRO-793.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Hi, all. When I try to read an Avro file with a subset of its schema (because I do
> not need all the details), I run into a strange problem.
> 1. I write data using this schema:
> {
>     "name": "relation",
>     "type": "record",
>     "fields": [
>         {
>             "name": "timestamp",
>             "type": "long"
>         },
>         {
>             "name": "type",
>             "type": {
>                 "type": "map",
>                 "values":{
>                     "type" : "array",
>                     "items": {
>                         "type": "record",
>                         "name": "sdf",
>                         "fields": [
>                             {
>                                 "name": "device",
>                                 "type": "string"
>                             },
>                             {
>                                 "name": "children",
>                                 "type": {
>                                     "type": "array",
>                                     "items": "string"
>                                 }
>                             }
>                         ]
>                     }
>                 }
>             }
>         }
>     ]
> }
> 2. Here is a JSONObject for that schema.
> {
>     "timestamp": 1234567890,
>     "type": {
>         "WMA": [
>             {
>                 "device": "WMA1",
>                 "children": ["WMB1", "WMB2"]
>             },
>             {
>                 "device": "WMA2",
>                 "children": ["WMB1", "WMB2"]
>             }
>         ]
>     }
> }
> 3. I write that record successfully, and it reads fine if I use this schema for reading:
> {
>     "name": "relation",
>     "type": "record",
>     "fields": [
>         {
>             "name": "timestamp",
>             "type": "long"
>         },
>         {
>             "name": "type",
>             "type": {
>                 "type": "map",
>                 "values":{
>                     "type" : "array",
>                     "items": {
>                         "type": "record",
>                         "name": "sdf",
>                         "fields": [
>                             {
>                                 "name": "children",
>                                 "type": {
>                                     "type": "array",
>                                     "items": "string"
>                                 }
>                             }
>                         ]
>                     }
>                 }
>             }
>         }
>     ]
> }
> The result is:
> {
>     "timestamp": 1234567890,
>     "type": {
>         "WMA": [
>             {
>                 "children": ["WMB1", "WMB2"]
>             },
>             {
>                 "children": ["WMB1", "WMB2"]
>             }
>         ]
>     }
> }
> 4. But if I want to ignore the "children" part instead of "device", I use this schema
> for reading:
> {
>     "name": "relation",
>     "type": "record",
>     "fields": [
>         {
>             "name": "timestamp",
>             "type": "long"
>         },
>         {
>             "name": "type",
>             "type": {
>                 "type": "map",
>                 "values":{
>                     "type" : "array",
>                     "items": {
>                         "type": "record",
>                         "name": "sdf",
>                         "fields": [
>                             {
>                                 "name": "device",
>                                 "type": "string"
>                             }
>                         ]
>                     }
>                 }
>             }
>         }
>     ]
> }
> Unfortunately, I get an exception:
> java.lang.ArrayIndexOutOfBoundsException: -8
> cause:java.lang.ArrayIndexOutOfBoundsException
> at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:122)
> at org.apache.avro.io.BinaryDecoder.skipString(BinaryDecoder.java:262)
> at org.apache.avro.io.ValidatingDecoder.skipString(ValidatingDecoder.java:113)
> at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:60)
> at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
> at org.apache.avro.io.parsing.SkipParser.skipRepeater(SkipParser.java:83)
> at org.apache.avro.io.ValidatingDecoder.skipArray(ValidatingDecoder.java:195)
> at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:70)
> at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
> at org.apache.avro.io.parsing.SkipParser.skipSymbol(SkipParser.java:93)
> at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:226)
> at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
> at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:162)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:196)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:140)
> at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:233)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:167)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:236)
> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:223)
> at AvroUtilTest.read(AvroUtilTest.java:77)
> at AvroUtilTest.main(AvroUtilTest.java:61)
> As Scott Carey suggested, I tried the following and it worked. How can this bug be fixed?
> Scott Carey:
> 2: If you change the schema you write with by reversing the order of the fields
> of "sdf" (array, then string), are the results the same?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
