avro-user mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: A strange problem when I am trying to read avro record with a subset of the schema.
Date Thu, 31 Mar 2011 16:37:07 GMT
FYI for those not on the avro-dev mailing list, there is a related JIRA now:
https://issues.apache.org/jira/browse/AVRO-793

On 3/30/11 7:45 PM, "Felix Xu" <ygnhzeus@gmail.com> wrote:

Okay, is there a JIRA issue related to this problem?
My avro version is 1.5.0.

2011/3/31 Scott Carey <scott@richrelevance.com>
There was a bug at some point in schema resolution where dropping the last field of a record
caused a problem.  It's possible that either:

You are using a version where this isn't fixed.
Or
The fix did not work for array types
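
As a rough illustration of why field order matters here: under Avro's schema resolution rules, record fields are matched by name and decoded in the writer's order, and writer fields that the reader schema lacks are skipped. The sketch below is not Avro's resolver code; the class `ProjectionPlan` and the helper `plan` are hypothetical names that only list which writer fields would be read and which skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProjectionPlan {
    // Hypothetical helper: walk the writer's fields in writer order and mark
    // each one READ (the reader schema has it) or SKIP (the reader lacks it).
    static List<String> plan(List<String> writerFields, List<String> readerFields) {
        List<String> actions = new ArrayList<>();
        for (String name : writerFields) {
            actions.add((readerFields.contains(name) ? "READ " : "SKIP ") + name);
        }
        return actions;
    }

    public static void main(String[] args) {
        List<String> reader = Arrays.asList("device");
        // Writer order from the original report: string, then array.
        System.out.println(plan(Arrays.asList("device", "children"), reader));
        // prints [READ device, SKIP children]
        // Writer order after the suggested reversal: array, then string.
        System.out.println(plan(Arrays.asList("children", "device"), reader));
        // prints [SKIP children, READ device]
    }
}
```

With the original writer order the skipped field is the last one in the record, which is the "dropping the last field" case; with the reversed order the skip comes first.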

On 3/30/11 7:17 PM, "Felix Xu" <ygnhzeus@gmail.com> wrote:

Wow, that's amazing.
I tried #2 and it worked.
What is the problem? How can I fix it?

2011/3/31 Scott Carey <scott@richrelevance.com>
1: What version of Avro is this?
2: If you change the schema you write with by reversing the order of the fields of
"sdf" (array, then string), are the results the same?

This looks like a bug. Please file a JIRA ticket, and if you have a test case or code
snippet that reproduces it, attach that to the ticket.

Thanks!

-Scott

On 3/30/11 8:49 AM, "Felix Xu" <ygnhzeus@gmail.com> wrote:

Hi all. I ran into a strange problem when trying to read an Avro file with a subset of its
schema (I do not need all the details).
1. I write data using this schema:
{
    "name": "relation",
    "type": "record",
    "fields": [
        {
            "name": "timestamp",
            "type": "long"
        },
        {
            "name": "type",
            "type": {
                "type": "map",
                "values":{
                    "type" : "array",
                    "items": {
                        "type": "record",
                        "name": "sdf",
                        "fields": [
                            {
                                "name": "device",
                                "type": "string"
                            },
                            {
                                "name": "children",
                                "type": {
                                    "type": "array",
                                    "items": "string"
                                }
                            }
                        ]
                    }
                }
            }
        }
    ]
}

2. Here is a JSON object that conforms to that schema:
{
    "timestamp": 1234567890,
    "type": {
        "WMA": [
            {
                "device": "WMA1",
                "children": ["WMB1", "WMB2"]
            },
            {
                "device": "WMA2",
                "children": ["WMB1", "WMB2"]
            }
        ]
    }
}
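
For reference, here is a minimal sketch of how one "sdf" element from that object is laid out in Avro's binary encoding, following the Avro specification: longs are zig-zag varints, a string is a length followed by its UTF-8 bytes, and an array is a block count followed by the items and a 0 terminator. It uses only the Java standard library rather than Avro's own encoder, and the class and method names (`SdfWireLayout`, `encodeSdf`, and so on) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class SdfWireLayout {
    // Zig-zag then variable-length encoding of a long, 7 bits per byte.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // high bit set: more bytes follow
            z >>>= 7;
        }
        out.write((int) z);
    }

    // A string is its UTF-8 byte length (as a long) followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // An array is written here as a single block (count, then items),
    // followed by a 0 count that marks the end of the array.
    static void writeStringArray(ByteArrayOutputStream out, String... items) {
        if (items.length > 0) {
            writeLong(out, items.length);
            for (String s : items) {
                writeString(out, s);
            }
        }
        writeLong(out, 0);
    }

    // A record is just its fields in schema order: device, then children.
    static byte[] encodeSdf(String device, String... children) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, device);
        writeStringArray(out, children);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : encodeSdf("WMA1", "WMB1", "WMB2")) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim());
        // prints 08 57 4d 41 31 04 08 57 4d 42 31 08 57 4d 42 32 00
    }
}
```

The record carries no field names or markers, so a reader that wants only "device" still has to step over the bytes of "children" to reach the next array element.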

3. I write that record successfully, and it reads fine if I use this schema for reading:
{
    "name": "relation",
    "type": "record",
    "fields": [
        {
            "name": "timestamp",
            "type": "long"
        },
        {
            "name": "type",
            "type": {
                "type": "map",
                "values":{
                    "type" : "array",
                    "items": {
                        "type": "record",
                        "name": "sdf",
                        "fields": [
                            {
                                "name": "children",
                                "type": {
                                    "type": "array",
                                    "items": "string"
                                }
                            }
                        ]
                    }
                }
            }
        }
    ]
}

The result is:
{
    "timestamp": 1234567890,
    "type": {
        "WMA": [
            {
                "children": ["WMB1", "WMB2"]
            },
            {
                "children": ["WMB1", "WMB2"]
            }
        ]
    }
}

4. But if I want to ignore the "children" part instead of "device", I use this schema for
reading:
{
    "name": "relation",
    "type": "record",
    "fields": [
        {
            "name": "timestamp",
            "type": "long"
        },
        {
            "name": "type",
            "type": {
                "type": "map",
                "values":{
                    "type" : "array",
                    "items": {
                        "type": "record",
                        "name": "sdf",
                        "fields": [
                            {
                                "name": "device",
                                "type": "string"
                            }
                        ]
                    }
                }
            }
        }
    ]
}

Unfortunately, I get an exception:

java.lang.ArrayIndexOutOfBoundsException: -8
cause:java.lang.ArrayIndexOutOfBoundsException
at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:122)
at org.apache.avro.io.BinaryDecoder.skipString(BinaryDecoder.java:262)
at org.apache.avro.io.ValidatingDecoder.skipString(ValidatingDecoder.java:113)
at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:60)
at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
at org.apache.avro.io.parsing.SkipParser.skipRepeater(SkipParser.java:83)
at org.apache.avro.io.ValidatingDecoder.skipArray(ValidatingDecoder.java:195)
at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:70)
at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
at org.apache.avro.io.parsing.SkipParser.skipSymbol(SkipParser.java:93)
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:226)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:162)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:196)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:140)
at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:233)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:167)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:236)
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:223)
at AvroUtilTest.read(AvroUtilTest.java:77)
at AvroUtilTest.main(AvroUtilTest.java:61)
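
The skipArray and skipString frames in the trace correspond to stepping over the unwanted "children" array. The sketch below shows what such a skip is meant to do according to the Avro specification's binary encoding; it is a stdlib-only illustration, not the code of `BinaryDecoder` or `SkipParser`, and it says nothing about where the 1.5.0 implementation goes wrong. The names `SkipSketch`, `skipString`, and `skipStringArray` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SkipSketch {
    // Decode a zig-zag varint long (7 payload bits per byte, low bits first).
    static long readLong(ByteBuffer in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    static String readString(ByteBuffer in) {
        byte[] utf8 = new byte[(int) readLong(in)];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Skipping a string: read its length, then jump over that many bytes.
    static void skipString(ByteBuffer in) {
        int len = (int) readLong(in);
        in.position(in.position() + len);
    }

    // Skipping an array of strings: loop over blocks until a 0 count.
    // A negative count means the block's byte size follows, so the whole
    // block can be skipped without decoding its items.
    static void skipStringArray(ByteBuffer in) {
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {
                long blockBytes = readLong(in);
                in.position(in.position() + (int) blockBytes);
            } else {
                for (long i = 0; i < count; i++) {
                    skipString(in);
                }
            }
        }
    }

    public static void main(String[] args) {
        // One "sdf" element from the message: device "WMA1", children ["WMB1", "WMB2"].
        byte[] record = {0x08, 'W', 'M', 'A', '1',
                         0x04, 0x08, 'W', 'M', 'B', '1', 0x08, 'W', 'M', 'B', '2', 0x00};
        ByteBuffer in = ByteBuffer.wrap(record);
        String device = readString(in); // field the reader schema keeps
        skipStringArray(in);            // field the reader schema drops
        System.out.println("device=" + device + ", bytes left=" + in.remaining());
        // prints device=WMA1, bytes left=0
    }
}
```

If a skip ever stops at the wrong offset, the next length is decoded from bytes that are not a length at all, so the stream position must be exact after every skipped field.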



