Date: Wed, 27 Aug 2014 13:08:01 -0500
Subject: Passively Converting Null Map to be valid
From: Micah Whitacre
To: user@avro.apache.org

We've recently upgraded our Avro dependency to 1.7.7 from 1.7.5. The avdl for Avro used to look like this:

record Event{
    long creationTime;
    /**
     * The optional payload of the event.
     */
    union{null, bytes} value = null;
    /**
     * Optional properties of the event.
     */
    map<string> properties = null;
}

When we upgraded we started seeing warnings like this:

[WARNING] Avro: Invalid default for field properties: null not a {"type":"map","values":"string"}

So we converted the file to be this:

record Event{
    long creationTime;
    /**
     * The optional payload of the event.
     */
    union{null, bytes} value = null;
    /**
     * Optional properties of the event.
     */
    map<string> properties = {};
}

We also had the need to add a new field to the record and thought we could do so passively like so:

record Event{
    long creationTime;
    /**
     * The optional payload of the event.
     */
    union{null, bytes} value = null;
    /**
     * Optional properties of the event.
     */
    map<string> properties = {};
    /**
     * Type of operation
     */
    union{null, string} operation = null;
}

However, when we then read data that was written with the very first schema, we get an EOFException.

Caused by: java.io.EOFException
    at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473)
    at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128)
    at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:290)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:155)
    at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)

We are reading with a BufferedBinaryDecoder and using the new schema as both the writer and reader schema, because the writer schema is not stored with the payload and so is not easy to retrieve.
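
For illustration, the situation above can be mimicked without Avro at all. The following is a minimal, stdlib-only Java sketch, not Avro's actual decoder: the class EventLayoutSketch and its helpers are hypothetical, and it assumes the old payload holds a null value and an empty properties map. It lays the bytes out per the first schema, then walks them field by field, optionally expecting the extra operation union at the end. Under those assumptions, the read of the branch index for operation is where the bytes run out, which would be consistent with the readIndex frame in the trace.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;

// Illustrative only: hand-rolled helpers, not Avro's BinaryEncoder/BinaryDecoder.
class EventLayoutSketch {

    // Avro's binary encoding stores int/long values as zig-zag varints.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    static long readLong(ByteArrayInputStream in) throws EOFException {
        long z = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) throw new EOFException();
            z |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
            shift += 7;
        }
        return (z >>> 1) ^ -(z & 1);
    }

    // bytes and string share a layout: a length followed by that many bytes.
    static void skipBytes(ByteArrayInputStream in) throws EOFException {
        long len = readLong(in);
        for (long i = 0; i < len; i++) {
            if (in.read() < 0) throw new EOFException();
        }
    }

    // Payload laid out per the first schema (assumed: null value, empty map).
    static byte[] encodeOldEvent(long creationTime) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, creationTime); // long creationTime
        writeLong(out, 0);            // union{null, bytes}: branch 0 is null, no body
        writeLong(out, 0);            // map<string>: a block count of 0 ends the map
        return out.toByteArray();
    }

    // Walks the payload field by field. expectOperation = true mimics treating
    // the newest schema as the writer schema, so a fourth field is expected.
    static long decode(byte[] payload, boolean expectOperation) throws EOFException {
        ByteArrayInputStream in = new ByteArrayInputStream(payload);
        long creationTime = readLong(in);
        if (readLong(in) == 1) skipBytes(in);    // value: branch 1 is bytes
        for (long n = readLong(in); n != 0; n = readLong(in)) {
            if (n < 0) { readLong(in); n = -n; } // negative count: block byte size follows
            for (long i = 0; i < n; i++) { skipBytes(in); skipBytes(in); } // key, value
        }
        if (expectOperation && readLong(in) == 1) skipBytes(in); // operation branch index
        return creationTime;
    }

    static boolean hitsEof(byte[] payload, boolean expectOperation) {
        try {
            decode(payload, expectOperation);
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] old = encodeOldEvent(42L);
        System.out.println("three-field layout hits EOF: " + hitsEof(old, false));
        System.out.println("four-field layout hits EOF:  " + hitsEof(old, true));
    }
}
```

If this picture is right, the decoder would need the schema the data was actually written with as the writer schema before it could fill in the default for operation. That is an inference from the sketch, not something confirmed here.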

My questions are:
1. Is the change we made to add a new defaulted union truly non-passive?
2. Is there a workaround so I can continue to evolve my schema?

Thanks for the help,
Micah