avro-user mailing list archives

From Alex Holmes <grep.a...@gmail.com>
Subject Re: Avro versioning and SpecificDatum's
Date Tue, 20 Sep 2011 01:08:05 GMT
I'm trying to put together a simple test case to reproduce the
exception.  While creating it, I hit the following behavior, which
doesn't seem right, though I may be misunderstanding how
forward/backward compatibility should work:

Schema v1:

{"name": "Record", "type": "record",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "id", "type": "int"}
  ]
}

Schema v2:

{"name": "Record", "type": "record",
  "fields": [
    {"name": "name_rename", "type": "string", "aliases": ["name"]},
    {"name": "new_field", "type": "int", "default":"0"}
  ]
}

In the 2nd version I:

- removed field "id"
- renamed field "name" to "name_rename"
- added field "new_field"
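
For comparison, here is a toy sketch of what name-based resolution of
those three changes would be expected to yield for a v1 record. It is
not Avro's implementation: the class ResolveByNameSketch, the
ReaderField type and the v2Fields helper are made up for illustration,
and the default for new_field is assumed to be the integer 0 (the
schema above writes it as the string "0").

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ResolveByNameSketch {

    /** Hypothetical reader-side field: name, aliases, optional default. */
    public static final class ReaderField {
        final String name;
        final List<String> aliases;
        final Object defaultValue; // null means "no default declared"

        public ReaderField(String name, List<String> aliases, Object defaultValue) {
            this.name = name;
            this.aliases = aliases;
            this.defaultValue = defaultValue;
        }
    }

    /** The v2 fields from the message, with the default assumed to be the int 0. */
    public static List<ReaderField> v2Fields() {
        return Arrays.asList(
            new ReaderField("name_rename", Collections.singletonList("name"), null),
            new ReaderField("new_field", Collections.<String>emptyList(), 0));
    }

    /** Fills each reader field by name, then by alias, then from its default. */
    public static Map<String, Object> resolve(Map<String, Object> written,
                                              List<ReaderField> readerFields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (ReaderField field : readerFields) {
            String source = null;
            if (written.containsKey(field.name)) {
                source = field.name;
            } else {
                for (String alias : field.aliases) {
                    if (written.containsKey(alias)) {
                        source = alias;
                        break;
                    }
                }
            }
            if (source != null) {
                result.put(field.name, written.get(source));
            } else if (field.defaultValue != null) {
                result.put(field.name, field.defaultValue);
            } else {
                throw new IllegalStateException("No value or default for " + field.name);
            }
        }
        // Written fields the reader never asks for (here "id") are simply dropped.
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> v1 = new LinkedHashMap<>();
        v1.put("name", "r1");
        v1.put("id", 1);
        System.out.println(resolve(v1, v2Fields())); // {name_rename=r1, new_field=0}
    }
}
```

Under these assumptions the renamed field picks up "r1" through its
alias, "id" is ignored, and new_field falls back to 0.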

I write the v1 data file:

  public static Record createRecord(String name, int id) {
    Record record = new Record();
    record.name = name;
    record.id = id;
    return record;
  }

  public static void writeToAvro(OutputStream outputStream)
      throws IOException {
    DataFileWriter<Record> writer =
        new DataFileWriter<Record>(new SpecificDatumWriter<Record>());
    writer.create(Record.SCHEMA$, outputStream);

    writer.append(createRecord("r1", 1));
    writer.append(createRecord("r2", 2));

    writer.close();
    outputStream.close();
  }

I wrote a version-agnostic read method:

  public static void readFromAvro(InputStream is) throws IOException {
    // No reader schema is passed to SpecificDatumReader here.
    DataFileStream<Record> reader = new DataFileStream<Record>(
            is, new SpecificDatumReader<Record>());
    for (Record a : reader) {
      System.out.println(ToStringBuilder.reflectionToString(a));
    }
    // Close the reader before the stream it wraps.
    IOUtils.cleanup(null, reader);
    IOUtils.cleanup(null, is);
  }

Running the read code against the v1 data file, with the v1 generated
classes on the classpath, produced:

Record@6a8c436b[name=r1,id=1]
Record@6baa9f99[name=r2,id=2]

If I run the same code with only the v2 generated classes on the
classpath, I get:

Record@39dd3812[name_rename=r1,new_field=1]
Record@27b15692[name_rename=r2,new_field=2]

The name_rename field seems to be good, but why would "new_field"
inherit the values of the deleted field "id"?
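
The output above is what one would see if the decoded values were
assigned to the v2 class by field position rather than resolved by
name. Whether that is what happens when SpecificDatumReader is
constructed without a reader schema is an assumption here, not
something confirmed in this thread. The toy model below only
illustrates the mechanism; PositionalCopySketch and copyByPosition are
made-up names, not Avro APIs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PositionalCopySketch {

    /** Assigns the i-th written value to the i-th reader field, ignoring names. */
    public static Map<String, Object> copyByPosition(List<Object> writtenValues,
                                                     List<String> readerFieldNames) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < readerFieldNames.size(); i++) {
            result.put(readerFieldNames.get(i), writtenValues.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // v1 wrote (name, id); v2 declares (name_rename, new_field).
        List<String> v2Fields = Arrays.asList("name_rename", "new_field");
        System.out.println(copyByPosition(Arrays.<Object>asList("r1", 1), v2Fields));
        System.out.println(copyByPosition(Arrays.<Object>asList("r2", 2), v2Fields));
        // prints {name_rename=r1, new_field=1} then {name_rename=r2, new_field=2}
    }
}
```

If that assumption holds, constructing the reader with the v2 schema,
for example new SpecificDatumReader<Record>(Record.SCHEMA$), would be
the first thing to try.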

Cheers,
Alex

On Mon, Sep 19, 2011 at 12:35 PM, Doug Cutting <cutting@apache.org> wrote:
> On 09/19/2011 05:12 AM, Alex Holmes wrote:
>> I then modified my original schema by adding, deleting and renaming
>> some fields, creating version 2 of the schema.  After re-creating the
>> Java classes I attempted to read the version 1 file using the
>> DataFileStream (with a SpecificDatumReader), and this is throwing an
>> exception.
>
> This should work.  Can you provide more detail?  What is the exception?
>  A reproducible test case would be great to have.
>
> Thanks,
>
> Doug
>
