From: "Nicolas PHUNG (JIRA)"
To: dev@avro.apache.org
Reply-To: dev@avro.apache.org
Date: Thu, 9 Apr 2015 08:56:12 +0000 (UTC)
Subject: [jira] [Commented] (AVRO-1661) Schema Evolution not working

    [ https://issues.apache.org/jira/browse/AVRO-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486993#comment-14486993 ]

Nicolas PHUNG commented on AVRO-1661:
-------------------------------------

Hello Ryan Blue,

Indeed, I'm wrapping Avro records inside Kafka messages. If I use:
{quote}
DatumReader reader = new GenericDatumReader(eventSchema, newSchema);
{quote}
this works fine. However, unlike a Flume event, I don't have a header that contains the Avro schema for the data. I don't know whether it's possible to get the schema from binary Avro input.
{quote}
Schema eventSchema = schema(event);
{quote}
I don't know if this is the right way to do it, but I'm writing these Avro records to a Kafka topic based on an initial schema. Later on, this base schema evolves with additional fields or other modifications. I'd like to be able to read the old Avro records from this Kafka topic as well as the new Avro records written with the new schema. This works when I have only the base schema and one new schema, but it won't be possible if I plan to iterate through several evolutions of the schema.

Is there a proper (Avro-recommended) way to do this?

Thanks for your help.

> Schema Evolution not working
> ----------------------------
>
>                 Key: AVRO-1661
>                 URL: https://issues.apache.org/jira/browse/AVRO-1661
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.6, 1.7.7
>         Environment: Ubuntu 14.10
>            Reporter: Nicolas PHUNG
>              Labels: avsc, evolution, schema
>
> This is the Avro Schema (OLD) I was using to write Avro binary data before:
> {noformat}
> {
>   "namespace": "com.hello.world",
>   "type": "record",
>   "name": "Toto",
>   "fields": [
>     {
>       "name": "a",
>       "type": [
>         "string",
>         "null"
>       ]
>     },
>     {
>       "name": "b",
>       "type": "string"
>     }
>   ]
> }
> {noformat}
> This is the Avro Schema (NEW) I'm using to read the Avro binary data:
> {noformat}
> {
>   "namespace": "com.hello.world",
>   "type": "record",
>   "name": "Toto",
>   "fields": [
>     {
>       "name": "a",
>       "type": [
>         "string",
>         "null"
>       ]
>     },
>     {
>       "name": "b",
>       "type": "string"
>     },
>     {
>       "name": "c",
>       "type": "string",
>       "default": "na"
>     }
>   ]
> }
> {noformat}
> However, I can't read the old data with the new schema.
> I've got the following errors:
> {noformat}
> 15/04/08 17:32:22 ERROR executor.Executor: Exception in task 0.0 in stage 3.0 (TID 3)
> java.io.EOFException
>     at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473)
>     at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128)
>     at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:259)
>     at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:272)
>     at org.apache.avro.io.ValidatingDecoder.readString(ValidatingDecoder.java:113)
>     at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:353)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:157)
>     at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>     at com.miguno.kafka.avro.AvroDecoder.fromBytes(AvroDecoder.scala:31)
> {noformat}
> From my understanding, I should be able to read the old data with the new schema that contains a new field with a default value. But it doesn't seem to work. Am I doing something wrong?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
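
Avro's binary encoding carries no schema and no field tags, so a reader has to be told which schema each record was written with. One possible way to support several schema evolutions in a single Kafka topic is to prefix every message with a small writer-schema ID and look the schema up on the consumer side. The following is a minimal sketch of that idea using only the JDK. The class `SchemaFraming`, its methods, the 4-byte ID layout, and the in-memory registry are all hypothetical names chosen for illustration; none of them are part of Avro or Kafka, and this is not presented as the recommended Avro mechanism.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not an Avro or Kafka API): frames each message as
// [4-byte big-endian schema ID][Avro binary payload].
final class SchemaFraming {
    private static final int ID_BYTES = Integer.BYTES;

    // Hypothetical in-memory registry: schema ID -> schema JSON text.
    // A real deployment would need shared, durable storage instead.
    private static final Map<Integer, String> REGISTRY = new HashMap<>();

    static void register(int schemaId, String schemaJson) {
        REGISTRY.put(schemaId, schemaJson);
    }

    // Producer side: prepend the ID of the schema used to write the payload.
    static byte[] frame(int schemaId, byte[] avroPayload) {
        return ByteBuffer.allocate(ID_BYTES + avroPayload.length)
                .putInt(schemaId)
                .put(avroPayload)
                .array();
    }

    // Consumer side: read the writer-schema ID from the first four bytes.
    static int schemaId(byte[] message) {
        return ByteBuffer.wrap(message).getInt();
    }

    // Consumer side: strip the ID prefix, leaving the Avro binary payload.
    static byte[] payload(byte[] message) {
        return Arrays.copyOfRange(message, ID_BYTES, message.length);
    }

    // Consumer side: look up the JSON of the schema the record was written with.
    static String writerSchemaJson(byte[] message) {
        int id = schemaId(message);
        String json = REGISTRY.get(id);
        if (json == null) {
            throw new IllegalStateException("Unknown schema ID: " + id);
        }
        return json;
    }
}
```

On the consumer, the JSON returned by `writerSchemaJson` could be parsed with Avro's `new Schema.Parser().parse(json)` and passed as the first (writer) argument of `new GenericDatumReader(eventSchema, newSchema)`, with `payload(message)` supplying the bytes to decode. Each old record is then resolved against the latest reader schema regardless of how many versions exist.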