avro-user mailing list archives

From Alex Holmes <grep.a...@gmail.com>
Subject Re: Avro versioning and SpecificDatum's
Date Wed, 21 Sep 2011 23:55:26 GMT
Thanks, that fixed my issue.
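
For anyone finding this thread later, here is a toy model of why the no-argument reader misbehaved. It is only a sketch under my own assumptions, not Avro's implementation: the class ResolutionSketch, its Field type, and both methods are hypothetical names made up for illustration. The idea is that without a reader schema the values are placed into the v2 class by position, whereas with Record.class the fields are matched by name or alias and missing ones take their default.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of positional decoding vs. schema resolution. Not Avro code. */
public class ResolutionSketch {

    /** Hypothetical, simplified reader-side field: name, aliases, default. */
    static final class Field {
        final String name;
        final List<String> aliases;
        final Object defaultValue;

        Field(String name, List<String> aliases, Object defaultValue) {
            this.name = name;
            this.aliases = aliases;
            this.defaultValue = defaultValue;
        }
    }

    /** No resolution: the i-th written value lands in the i-th reader field. */
    static Map<String, Object> readByPosition(List<Object> written, List<Field> readerFields) {
        Map<String, Object> record = new LinkedHashMap<>();
        for (int i = 0; i < written.size(); i++) {
            if (i >= readerFields.size()) {
                // Analogous to the "Bad index" failure in the stack trace.
                throw new IllegalStateException("Bad index: " + i);
            }
            record.put(readerFields.get(i).name, written.get(i));
        }
        return record;
    }

    /** Resolution: match reader fields to writer fields by name or alias. */
    static Map<String, Object> readResolved(List<String> writerNames, List<Object> written,
                                            List<Field> readerFields) {
        Map<String, Object> record = new LinkedHashMap<>();
        for (Field field : readerFields) {
            int index = writerNames.indexOf(field.name);
            for (String alias : field.aliases) {
                if (index < 0) {
                    index = writerNames.indexOf(alias);
                }
            }
            // Writer fields the reader does not declare (here "id") are dropped;
            // reader fields the writer never wrote take their default.
            record.put(field.name, index >= 0 ? written.get(index) : field.defaultValue);
        }
        return record;
    }

    public static void main(String[] args) {
        List<String> v1Names = Arrays.asList("name", "id");
        List<Object> v1Values = Arrays.<Object>asList("r1", 1);
        List<Field> v2Fields = Arrays.asList(
                new Field("name_rename", Arrays.asList("name"), null),
                new Field("new_field", Collections.<String>emptyList(), 0));

        System.out.println(readByPosition(v1Values, v2Fields)); // {name_rename=r1, new_field=1}
        System.out.println(readResolved(v1Names, v1Values, v2Fields)); // {name_rename=r1, new_field=0}
    }
}
```

The positional result matches the odd output I reported earlier (new_field picking up the old id), and with a one-field v2 the same positional read runs out of slots, which is roughly what "Bad index" was telling me.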

On Tue, Sep 20, 2011 at 2:51 PM, Scott Carey <scottcarey@apache.org> wrote:
> As Doug mentioned in the ticket, the problem is likely:
>
> new SpecificDatumReader<Record>()
>
>
> This should be
>
> new SpecificDatumReader<Record>(Record.class)
>
>
> This sets the reader to resolve to the schema found in Record.class.
>
>
>
> On 9/20/11 3:44 AM, "Alex Holmes" <grep.alex@gmail.com> wrote:
>
>>Created the following ticket:
>>
>>https://issues.apache.org/jira/browse/AVRO-891
>>
>>Thanks,
>>Alex
>>
>>On Tue, Sep 20, 2011 at 6:26 AM, Alex Holmes <grep.alex@gmail.com> wrote:
>>> Thanks, I'll add a bug.
>>>
>>> As an FYI, even without the alias (retaining the original field name),
>>> just removing the "id" field yields the exception.
>>>
>>> On Tue, Sep 20, 2011 at 2:22 AM, Scott Carey <scottcarey@apache.org> wrote:
>>>> That looks like a bug.  What happens if there is no aliasing/renaming
>>>> involved?  Aliasing is a newer feature than field addition, removal,
>>>> and promotion.
>>>>
>>>> This should be easy to reproduce, can you file a JIRA ticket?  We
>>>> should discuss this further there.
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> On 9/19/11 6:14 PM, "Alex Holmes" <grep.alex@gmail.com> wrote:
>>>>
>>>>>OK, I was able to reproduce the exception.
>>>>>
>>>>>v1:
>>>>>{"name": "Record", "type": "record",
>>>>>  "fields": [
>>>>>    {"name": "name", "type": "string"},
>>>>>    {"name": "id", "type": "int"}
>>>>>  ]
>>>>>}
>>>>>
>>>>>v2:
>>>>>{"name": "Record", "type": "record",
>>>>>  "fields": [
>>>>>    {"name": "name_rename", "type": "string", "aliases": ["name"]}
>>>>>  ]
>>>>>}
>>>>>
>>>>>Step 1.  Write Avro file using v1 generated class
>>>>>Step 2.  Read Avro file using v2 generated class
>>>>>
>>>>>Exception in thread "main" org.apache.avro.AvroRuntimeException: Bad index
>>>>>       at Record.put(Unknown Source)
>>>>>       at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
>>>>>       at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>>>>>       at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>>>>>       at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>>>>>       at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>>>>>       at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>>>>>       at Read.readFromAvro(Unknown Source)
>>>>>       at Read.main(Unknown Source)
>>>>>
>>>>>The code to write/read the Avro file is unchanged from the version below.
>>>>>
>>>>>On Mon, Sep 19, 2011 at 9:08 PM, Alex Holmes <grep.alex@gmail.com> wrote:
>>>>>> I'm trying to put together a simple test case to reproduce the
>>>>>> exception.  While I was creating the test case, I hit this behavior
>>>>>> which doesn't seem right, but maybe it's my misunderstanding of how
>>>>>> forward/backward compatibility should work:
>>>>>>
>>>>>> Schema v1:
>>>>>>
>>>>>> {"name": "Record", "type": "record",
>>>>>>  "fields": [
>>>>>>    {"name": "name", "type": "string"},
>>>>>>    {"name": "id", "type": "int"}
>>>>>>  ]
>>>>>> }
>>>>>>
>>>>>> Schema v2:
>>>>>>
>>>>>> {"name": "Record", "type": "record",
>>>>>>  "fields": [
>>>>>>    {"name": "name_rename", "type": "string", "aliases": ["name"]},
>>>>>>    {"name": "new_field", "type": "int", "default": 0}
>>>>>>  ]
>>>>>> }
>>>>>>
>>>>>> In the 2nd version I:
>>>>>>
>>>>>> - removed field "id"
>>>>>> - renamed field "name" to "name_rename"
>>>>>> - added field "new_field"
>>>>>>
>>>>>> I write the v1 data file:
>>>>>>
>>>>>>  public static Record createRecord(String name, int id) {
>>>>>>    Record record = new Record();
>>>>>>    record.name = name;
>>>>>>    record.id = id;
>>>>>>    return record;
>>>>>>  }
>>>>>>
>>>>>>  public static void writeToAvro(OutputStream outputStream)
>>>>>>      throws IOException {
>>>>>>    DataFileWriter<Record> writer =
>>>>>>        new DataFileWriter<Record>(new SpecificDatumWriter<Record>());
>>>>>>    writer.create(Record.SCHEMA$, outputStream);
>>>>>>
>>>>>>    writer.append(createRecord("r1", 1));
>>>>>>    writer.append(createRecord("r2", 2));
>>>>>>
>>>>>>    writer.close();
>>>>>>    outputStream.close();
>>>>>>  }
>>>>>>
>>>>>> I wrote a version-agnostic Read class:
>>>>>>
>>>>>>  public static void readFromAvro(InputStream is) throws IOException {
>>>>>>    DataFileStream<Record> reader = new DataFileStream<Record>(
>>>>>>            is, new SpecificDatumReader<Record>());
>>>>>>    for (Record a : reader) {
>>>>>>      System.out.println(ToStringBuilder.reflectionToString(a));
>>>>>>    }
>>>>>>    IOUtils.cleanup(null, is);
>>>>>>    IOUtils.cleanup(null, reader);
>>>>>>  }
>>>>>>
>>>>>> Running the Read code against the v1 data file, and including the v1
>>>>>> code-generated classes in the classpath produced:
>>>>>>
>>>>>> Record@6a8c436b[name=r1,id=1]
>>>>>> Record@6baa9f99[name=r2,id=2]
>>>>>>
>>>>>> If I run the same code, but use just the v2 generated classes in the
>>>>>> classpath I get:
>>>>>>
>>>>>> Record@39dd3812[name_rename=r1,new_field=1]
>>>>>> Record@27b15692[name_rename=r2,new_field=2]
>>>>>>
>>>>>> The name_rename field seems to be good, but why would "new_field"
>>>>>> inherit the values of the deleted field "id"?
>>>>>>
>>>>>> Cheers,
>>>>>> Alex
>>>>>>
>>>>>> On Mon, Sep 19, 2011 at 12:35 PM, Doug Cutting <cutting@apache.org> wrote:
>>>>>>> On 09/19/2011 05:12 AM, Alex Holmes wrote:
>>>>>>>> I then modified my original schema by adding, deleting and renaming
>>>>>>>> some fields, creating version 2 of the schema.  After re-creating the
>>>>>>>> Java classes I attempted to read the version 1 file using the
>>>>>>>> DataFileStream (with a SpecificDatumReader), and this is throwing an
>>>>>>>> exception.
>>>>>>>
>>>>>>> This should work.  Can you provide more detail?  What is the
>>>>>>> exception?  A reproducible test case would be great to have.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Doug
>>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>
>
>
>
