Date: Tue, 3 Feb 2015 13:57:47 +0200
From: Burak Emre
To: user@avro.apache.org
Message-ID: <086FD9C0ED6A46C297C861B1369617E5@gmail.com>
Subject: Adding new field with default value to an Avro schema

> I added a field with a default value to an Avro schema that was previously used for writing data. Is it possible to read the previously written data using only the new schema, which has the new field at the end?
>
> I tried this scenario, but unfortunately it throws an EOFException while reading the third field. Even though the new field has a default value and the previous fields are read successfully, I'm not able to deserialize the record without providing the writer schema I used previously.
>
> Schema schema = Schema.createRecord("test", null, "avro.test", false);
> schema.setFields(Lists.newArrayList(
>     new Field("project", Schema.create(Type.STRING), null, null),
>     new Field("city", Schema.createUnion(Lists.newArrayList(Schema.create(Type.NULL), Schema.create(Type.STRING))), null, NullNode.getInstance())
> ));
>
> GenericData.Record record = new GenericRecordBuilder(schema)
>     .set("project", "ff").build();
>
> GenericDatumWriter w = new GenericDatumWriter(schema);
> ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
> BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(outputStream, null);
>
> w.write(record, encoder);
> encoder.flush();
>
> schema = Schema.createRecord("test", null, "avro.test", false);
> schema.setFields(Lists.newArrayList(
>     new Field("project", Schema.create(Type.STRING), null, null),
>     new Field("city", Schema.createUnion(Lists.newArrayList(Schema.create(Type.NULL), Schema.create(Type.STRING))), null, NullNode.getInstance()),
>     new Field("newField", Schema.createUnion(Lists.newArrayList(Schema.create(Type.NULL), Schema.create(Type.STRING))), null, NullNode.getInstance())
> ));
>
> DatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
>
> Decoder decoder = DecoderFactory.get().binaryDecoder(outputStream.toByteArray(), null);
> GenericRecord result = reader.read(null, decoder);
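
For context on the EOFException described above: Avro's binary encoding writes field values back to back, with no field names or tags, so the bytes alone do not say how many fields the writer had. The sketch below is a toy model in plain Java, not Avro's own code. It hand-rolls only the two encodings involved here (zigzag varints and length-prefixed strings), and the class and method names are made up for illustration. It shows that decoding the two-field bytes as if three fields had been written runs off the end of the buffer, while a decoder that knows the writer wrote two fields can fill the third with its default. In Avro itself, the corresponding step is to pass both schemas to the reader, for example via the two-argument constructor `new GenericDatumReader<>(writerSchema, readerSchema)`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; names are illustrative and not part of Avro.
public class UntaggedFieldsSketch {

    // Zigzag varint, the encoding used for lengths and union branch indices.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    static long readLong(ByteArrayInputStream in) throws EOFException {
        long z = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) throw new EOFException("no bytes left for next field");
            z |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
            shift += 7;
        }
        return (z >>> 1) ^ -(z & 1);
    }

    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    static String readString(ByteArrayInputStream in) throws EOFException {
        int len = (int) readLong(in);
        byte[] buf = new byte[len];
        if (len > 0 && in.read(buf, 0, len) < len) throw new EOFException("truncated string");
        return new String(buf, StandardCharsets.UTF_8);
    }

    // A ["null","string"] union: branch index first, then the value if any.
    static void writeNullableString(ByteArrayOutputStream out, String s) {
        if (s == null) {
            writeLong(out, 0);
        } else {
            writeLong(out, 1);
            writeString(out, s);
        }
    }

    static String readNullableString(ByteArrayInputStream in) throws EOFException {
        return readLong(in) == 0 ? null : readString(in);
    }

    // Old layout: project (string), city (nullable string). No tags, no field count.
    static byte[] encodeOld(String project, String city) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, project);
        writeNullableString(out, city);
        return out.toByteArray();
    }

    // Decodes readerFields values; fields the writer never wrote get the default (null).
    static List<String> decode(byte[] data, int writerFields, int readerFields) throws EOFException {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < readerFields; i++) {
            if (i >= writerFields) {
                values.add(null);                    // default, nothing read from the stream
            } else if (i == 0) {
                values.add(readString(in));
            } else {
                values.add(readNullableString(in));
            }
        }
        return values;
    }

    public static void main(String[] args) throws EOFException {
        byte[] data = encodeOld("ff", null);
        // Writer's field count known: the third field falls back to its default.
        System.out.println(decode(data, 2, 3));
        try {
            // Only the new layout known: the decoder tries to read a third value.
            decode(data, 3, 3);
        } catch (EOFException e) {
            System.out.println("EOFException: " + e.getMessage());
        }
    }
}
```

The design point is only that the default value is applied by comparing the two layouts, never by inspecting the bytes. That is why a reader given just the new schema has no way to notice the third field is absent.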