Date: Tue, 20 Sep 2011 06:44:05 -0400
Subject: Re: Avro versioning and SpecificDatum's
From: Alex Holmes
To: user@avro.apache.org

Created the following ticket:

https://issues.apache.org/jira/browse/AVRO-891

Thanks,
Alex

On Tue, Sep 20, 2011 at 6:26 AM, Alex Holmes wrote:
> Thanks, I'll add a bug.
>
> As a FYI, even without the alias (retaining the original field name),
> just removing the "id" field yields the exception.
>
> On Tue, Sep 20, 2011 at 2:22 AM, Scott Carey wrote:
>> That looks like a bug. What happens if there is no aliasing/renaming
>> involved? Aliasing is a newer feature than field addition, removal, and
>> promotion.
>>
>> This should be easy to reproduce, can you file a JIRA ticket? We should
>> discuss this further there.
>>
>> Thanks!
>>
>>
>> On 9/19/11 6:14 PM, "Alex Holmes" wrote:
>>
>>>OK, I was able to reproduce the exception.
>>>
>>>v1:
>>>{"name": "Record", "type": "record",
>>> "fields": [
>>>   {"name": "name", "type": "string"},
>>>   {"name": "id", "type": "int"}
>>> ]
>>>}
>>>
>>>v2:
>>>{"name": "Record", "type": "record",
>>> "fields": [
>>>   {"name": "name_rename", "type": "string", "aliases": ["name"]}
>>> ]
>>>}
>>>
>>>Step 1. Write Avro file using v1 generated class
>>>Step 2. Read Avro file using v2 generated class
>>>
>>>Exception in thread "main" org.apache.avro.AvroRuntimeException: Bad index
>>>       at Record.put(Unknown Source)
>>>       at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
>>>       at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>>>       at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>>>       at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>>>       at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>>>       at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>>>       at Read.readFromAvro(Unknown Source)
>>>       at Read.main(Unknown Source)
>>>
>>>The code to write/read the avro file didn't change from below.
>>>
>>>On Mon, Sep 19, 2011 at 9:08 PM, Alex Holmes wrote:
>>>> I'm trying to put together a simple test case to reproduce the
>>>> exception.
>>>> While I was creating the test case, I hit this behavior
>>>> which doesn't seem right, but maybe it's my misunderstanding on how
>>>> forward/backward compatibility should work:
>>>>
>>>> Schema v1:
>>>>
>>>> {"name": "Record", "type": "record",
>>>>  "fields": [
>>>>    {"name": "name", "type": "string"},
>>>>    {"name": "id", "type": "int"}
>>>>  ]
>>>> }
>>>>
>>>> Schema v2:
>>>>
>>>> {"name": "Record", "type": "record",
>>>>  "fields": [
>>>>    {"name": "name_rename", "type": "string", "aliases": ["name"]},
>>>>    {"name": "new_field", "type": "int", "default":"0"}
>>>>  ]
>>>> }
>>>>
>>>> In the 2nd version I:
>>>>
>>>> - removed field "id"
>>>> - renamed field "name" to "name_rename"
>>>> - added field "new_field"
>>>>
>>>> I write the v1 data file:
>>>>
>>>>  public static Record createRecord(String name, int id) {
>>>>    Record record = new Record();
>>>>    record.name = name;
>>>>    record.id = id;
>>>>    return record;
>>>>  }
>>>>
>>>>  public static void writeToAvro(OutputStream outputStream)
>>>>      throws IOException {
>>>>    DataFileWriter<Record> writer =
>>>>        new DataFileWriter<Record>(new SpecificDatumWriter<Record>());
>>>>    writer.create(Record.SCHEMA$, outputStream);
>>>>
>>>>    writer.append(createRecord("r1", 1));
>>>>    writer.append(createRecord("r2", 2));
>>>>
>>>>    writer.close();
>>>>    outputStream.close();
>>>>  }
>>>>
>>>> I wrote a version-agnostic Read class:
>>>>
>>>>  public static void readFromAvro(InputStream is) throws IOException {
>>>>    DataFileStream<Record> reader = new DataFileStream<Record>(
>>>>            is, new SpecificDatumReader<Record>());
>>>>    for (Record a : reader) {
>>>>      System.out.println(ToStringBuilder.reflectionToString(a));
>>>>    }
>>>>    IOUtils.cleanup(null, is);
>>>>    IOUtils.cleanup(null, reader);
>>>>  }
>>>>
>>>> Running the Read code against the v1 data file, and including the v1
>>>> code-generated classes in the classpath produced:
>>>>
>>>> Record@6a8c436b[name=r1,id=1]
>>>> Record@6baa9f99[name=r2,id=2]
>>>>
>>>> If I run the same code, but use just the v2 generated classes in the
>>>> classpath I get:
>>>>
>>>> Record@39dd3812[name_rename=r1,new_field=1]
>>>> Record@27b15692[name_rename=r2,new_field=2]
>>>>
>>>> The name_rename field seems to be good, but why would "new_field"
>>>> inherit the values of the deleted field "id"?
>>>>
>>>> Cheers,
>>>> Alex
>>>>
>>>>
>>>> On Mon, Sep 19, 2011 at 12:35 PM, Doug Cutting wrote:
>>>>> On 09/19/2011 05:12 AM, Alex Holmes wrote:
>>>>>> I then modified my original schema by adding, deleting and renaming
>>>>>> some fields, creating version 2 of the schema. After re-creating the
>>>>>> Java classes I attempted to read the version 1 file using the
>>>>>> DataFileStream (with a SpecificDatumReader), and this is throwing an
>>>>>> exception.
>>>>>
>>>>> This should work. Can you provide more detail? What is the exception?
>>>>> A reproducible test case would be great to have.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Doug
>>>>>
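
A note on the output reported in this thread: new_field taking the values 1 and 2 is what one would see if the stored values were copied into the generated class by field position rather than matched by field name, and the "Bad index" exception fits the same pattern when the v2 class has only one field. That reading is an assumption drawn from the output quoted here, not a confirmed diagnosis. The sketch below is plain Java with no Avro dependency. ReaderField, resolveByName and assignByPosition are hypothetical names made up for illustration and are not part of Avro's API. It contrasts positional copying with the name-and-alias matching (with defaults for missing fields) that the Avro specification describes for schema resolution, using the v1 record ("r1", 1) and the v2 field list. The default for new_field is taken here as the int 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ResolutionSketch {

  /** Hypothetical stand-in for one reader-schema field: name, aliases, default. */
  static final class ReaderField {
    final String name;
    final List<String> aliases;
    final Object defaultValue;

    ReaderField(String name, List<String> aliases, Object defaultValue) {
      this.name = name;
      this.aliases = aliases;
      this.defaultValue = defaultValue;
    }
  }

  /** Name-based resolution: match each reader field to a writer field by name or alias. */
  static Map<String, Object> resolveByName(
      List<String> writerNames, List<Object> writerValues, List<ReaderField> readerFields) {
    Map<String, Object> out = new LinkedHashMap<String, Object>();
    for (ReaderField rf : readerFields) {
      int idx = writerNames.indexOf(rf.name);
      for (String alias : rf.aliases) {
        if (idx < 0) {
          idx = writerNames.indexOf(alias);
        }
      }
      // Writer fields with no match in the reader (here "id") are simply skipped;
      // reader fields with no match in the writer fall back to their default.
      out.put(rf.name, idx >= 0 ? writerValues.get(idx) : rf.defaultValue);
    }
    return out;
  }

  /** Positional assignment: copy the writer's i-th value into the reader's i-th field. */
  static Map<String, Object> assignByPosition(
      List<Object> writerValues, List<ReaderField> readerFields) {
    Map<String, Object> out = new LinkedHashMap<String, Object>();
    for (int i = 0; i < writerValues.size(); i++) {
      if (i >= readerFields.size()) {
        throw new IndexOutOfBoundsException("Bad index: " + i);
      }
      out.put(readerFields.get(i).name, writerValues.get(i));
    }
    return out;
  }

  public static void main(String[] args) {
    // v1 record as written: fields "name" and "id".
    List<String> v1Names = Arrays.asList("name", "id");
    List<Object> v1Values = Arrays.<Object>asList("r1", 1);
    // v2 reader fields: "name_rename" (alias "name") and "new_field" (default 0).
    List<ReaderField> v2 = Arrays.asList(
        new ReaderField("name_rename", Arrays.asList("name"), null),
        new ReaderField("new_field", new ArrayList<String>(), 0));

    System.out.println(resolveByName(v1Names, v1Values, v2));
    System.out.println(assignByPosition(v1Values, v2));
  }
}
```

Run as-is, main prints {name_rename=r1, new_field=0} for the name-based path and {name_rename=r1, new_field=1} for the positional path; the second line matches the values reported for the v2 classes. With a one-field reader list (the second v2 schema in the thread), assignByPosition throws at index 1.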