From: Scott Carey
To: "user@avro.apache.org"
Date: Wed, 30 Mar 2011 11:29:06 -0700
Subject: Re: A strange problem when I am trying to read avro record with a subset of the schema.

1: What version of Avro is this?
2: If you change the schema you write with by reversing the order of the fields of "sdf" (array, then string), are the results the same?

This looks like a bug. Please file a JIRA ticket, and if you have a reproducible test case or code snippet, attach it to the ticket.

Thanks!

-Scott

On 3/30/11 8:49 AM, "Felix Xu" <ygnhzeus@gmail.com> wrote:

Hi, all. I am trying to read an Avro file with a subset of its schema (because I do not need all the details), and I ran into a strange problem.

1. I write data using this schema:

    {
        "name": "relation",
        "type": "record",
        "fields": [
            {
                "name": "timestamp",
                "type": "long"
            },
            {
                "name": "type",
                "type": {
                    "type": "map",
                    "values": {
                        "type": "array",
                        "items": {
                            "type": "record",
                            "name": "sdf",
                            "fields": [
                                {
                                    "name": "device",
                                    "type": "string"
                                },
                                {
                                    "name": "children",
                                    "type": {
                                        "type": "array",
                                        "items": "string"
                                    }
                                }
                            ]
                        }
                    }
                }
            }
        ]
    }

2. Here is a JSONObject for that schema.

    {
        "timestamp": 1234567890,
        "type": {
            "WMA": [
                {
                    "device": "WMA1",
                    "children": ["WMB1", "WMB2"]
                },
                {
                    "device": "WMA2",
                    "children": ["WMB1", "WMB2"]
                }
            ]
        }
    }

3. I write that record successfully, and reading works if I use this schema:

    {
        "name": "relation",
        "type": "record",
        "fields": [
            {
                "name": "timestamp",
                "type": "long"
            },
            {
                "name": "type",
                "type": {
                    "type": "map",
                    "values": {
                        "type": "array",
                        "items": {
                            "type": "record",
                            "name": "sdf",
                            "fields": [
                                {
                                    "name": "children",
                                    "type": {
                                        "type": "array",
                                        "items": "string"
                                    }
                                }
                            ]
                        }
                    }
                }
            }
        ]
    }

The result is:

    {
        "timestamp": 1234567890,
        "type": {
            "WMA": [
                {
                    "children": ["WMB1", "WMB2"]
                },
                {
                    "children": ["WMB1", "WMB2"]
                }
            ]
        }
    }

4. But if I want to ignore the "children" part instead of "device", I use this schema for reading:

    {
        "name": "relation",
        "type": "record",
        "fields": [
            {
                "name": "timestamp",
                "type": "long"
            },
            {
                "name": "type",
                "type": {
                    "type": "map",
                    "values": {
                        "type": "array",
                        "items": {
                            "type": "record",
                            "name": "sdf",
                            "fields": [
                                {
                                    "name": "device",
                                    "type": "string"
                                }
                            ]
                        }
                    }
                }
            }
        ]
    }

Unfortunately, I get an exception:

java.lang.ArrayIndexOutOfBoundsException: -8
cause:java.lang.ArrayIndexOutOfBoundsException
    at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:122)
    at org.apache.avro.io.BinaryDecoder.skipString(BinaryDecoder.java:262)
    at org.apache.avro.io.ValidatingDecoder.skipString(ValidatingDecoder.java:113)
    at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:60)
    at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
    at org.apache.avro.io.parsing.SkipParser.skipRepeater(SkipParser.java:83)
    at org.apache.avro.io.ValidatingDecoder.skipArray(ValidatingDecoder.java:195)
    at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:70)
    at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
    at org.apache.avro.io.parsing.SkipParser.skipSymbol(SkipParser.java:93)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:226)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:162)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
    at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:196)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:140)
    at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:233)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:167)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:236)
    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:223)
    at AvroUtilTest.read(AvroUtilTest.java:77)
    at AvroUtilTest.main(AvroUtilTest.java:61)
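
For readers following the thread: the trace above fails inside skipString while skipArray is stepping over the unwanted "children" field. The sketch below shows what that skip has to do at the byte level. It is a hand-rolled illustration that assumes the binary encoding described in the Avro specification (zigzag varint longs, length-prefixed UTF-8 strings, arrays written as counted blocks ending in a zero count). It does not use the Avro library, it does not reproduce or diagnose the reported exception, and the class and method names (AvroSkipSketch, writeSdf, readDevices) are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: hand-rolled Avro-style binary encoding, no Avro library involved.
final class AvroSkipSketch {

    // Avro long/int: zigzag first, then base-128 varint with the low-order group first.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Avro string: byte length as a long, then the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // One "sdf" record in writer order: device (string), then children (array of strings).
    static void writeSdf(ByteArrayOutputStream out, String device, List<String> children) {
        writeString(out, device);
        if (!children.isEmpty()) {
            writeLong(out, children.size()); // a single block holding every item
            for (String child : children) {
                writeString(out, child);
            }
        }
        writeLong(out, 0); // a zero count ends the array
    }

    private final byte[] buf;
    private int pos;

    AvroSkipSketch(byte[] buf) {
        this.buf = buf;
    }

    long readLong() {
        long raw = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos++] & 0xFF;
            raw |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (raw >>> 1) ^ -(raw & 1); // undo zigzag
    }

    // Bounds-checked move forward, so a misaligned read fails with a clear message.
    private void advance(long n) {
        if (n < 0 || n > buf.length - pos) {
            throw new IllegalStateException("bad length " + n + " at offset " + pos);
        }
        pos += (int) n;
    }

    String readString() {
        long len = readLong();
        int start = pos;
        advance(len);
        return new String(buf, start, (int) len, StandardCharsets.UTF_8);
    }

    void skipString() {
        advance(readLong());
    }

    void skipStringArray() {
        for (long count = readLong(); count != 0; count = readLong()) {
            if (count < 0) {
                advance(readLong()); // negative count: a byte size follows, jump the whole block
            } else {
                for (long i = 0; i < count; i++) {
                    skipString();
                }
            }
        }
    }

    // Reads consecutive "sdf" records, keeping "device" and skipping "children".
    static List<String> readDevices(byte[] data, int recordCount) {
        AvroSkipSketch in = new AvroSkipSketch(data);
        List<String> devices = new ArrayList<>();
        for (int i = 0; i < recordCount; i++) {
            devices.add(in.readString());
            in.skipStringArray();
        }
        return devices;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeSdf(out, "WMA1", Arrays.asList("WMB1", "WMB2"));
        writeSdf(out, "WMA2", Arrays.asList("WMB1", "WMB2"));
        System.out.println(readDevices(out.toByteArray(), 2)); // prints [WMA1, WMA2]
    }
}
```

The point of the sketch is that the skip must consume exactly the bytes of the unwanted field: the second "device" is only readable because the first "children" array was stepped over in full, including its terminating zero count. The bounds check in advance is a choice made for this sketch so that a decoder that has lost alignment reports a bad length instead of indexing outside the buffer.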