From: Jonathan Coveney
To: user@avro.apache.org
Subject: Re: Issue writing union in avro?
Date: Fri, 5 Apr 2013 13:59:14 +0200

Ok, I figured out the issue:

If you make string c the following:
String c = "{\"name\": \"Alyssa\", \"favorite_number\": {\"int\": 256}, \"favorite_color\": {\"string\": \"blue\"}}";

Then this works.

This represents a divergence between the Python and the Java implementation... the above does not work in Python, but it does work in Java. And of course, vice versa.
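
For reference, the JSON encoding section of the Avro specification describes a non-null union value as a JSON object with a single name/value pair keyed by the branch's type name, which is the form that worked in Java above. A minimal sketch of building that form by hand follows. UnionJsonSketch and its methods are hypothetical helpers, not Avro API; the field names are simply those in string c, and nothing here consults a schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Avro): builds JSON text in the union-wrapped form. */
final class UnionJsonSketch {

    /** Wraps an already-encoded JSON value as {"<branch>": <value>}; the null branch stays bare. */
    static String wrapUnion(String branchName, String jsonValue) {
        if (jsonValue == null || branchName.equals("null")) {
            return "null"; // the null branch is written as a bare JSON null
        }
        return "{\"" + branchName + "\": " + jsonValue + "}";
    }

    /** Quotes a Java string as a JSON string (escapes only backslashes and double quotes). */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Joins field name -> encoded value pairs into a JSON object, in insertion order. */
    static String record(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(", ");
            }
            sb.append(quote(e.getKey())).append(": ").append(e.getValue());
            first = false;
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> user = new LinkedHashMap<>();
        user.put("name", quote("Alyssa"));                        // plain (non-union) field
        user.put("favorite_number", wrapUnion("int", "256"));     // union field, int branch
        user.put("favorite_color", wrapUnion("string", quote("blue"))); // union field, string branch
        System.out.println(record(user));
    }
}
```

Running main prints the same JSON text as string c above. Which fields are unions is an assumption taken from the wrapped fields in that string.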

I think I know how to fix this (and can file a bug with my reproduction and the fix), but I'm not sure which one is the expected case? Which implementation is wrong?

Thanks


2013/4/5 Jonathan Coveney <jcoveney@gmail.com>
Correction: the issue is when reading the string according to the Avro schema, not on writing. It fails before I get a chance to write :)


2013/4/5 Jonathan Coveney <jcoveney@gmail.com>
I implemented essentially the Java Avro example but using the GenericDatumWriter and GenericDatumReader and hit an issue.

https://gist.github.com/jcoveney/5317904

This is the error:
Exception in thread "main" java.lang.RuntimeException: org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_NUMBER_INT
    at com.spotify.hadoop.mapred.Hrm.main(Hrm.java:45)
Caused by: org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_NUMBER_INT
    at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
    at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
    at com.spotify.hadoop.mapred.Hrm.main(Hrm.java:38)
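
A note on reading this trace: VALUE_NUMBER_INT is Jackson's token name for an integer literal, so the decoder appears to have met a bare number where it expected the opening of a union wrapper object. The sketch below assumes the input used bare values such as 256 for a union field (the failing input itself is only in the gist) and shows one way to map a plain Java value onto the wrapped form. BareToWrapped.wrap is a hypothetical helper, not part of Avro, and it covers only a few primitive branches.

```java
/** Hypothetical helper (not Avro API): picks a union branch name from the value's runtime type. */
final class BareToWrapped {

    /** Returns the value in wrapped form, e.g. 256 becomes {"int": 256}; null stays a bare null. */
    static String wrap(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof Integer) {
            return "{\"int\": " + value + "}";
        }
        if (value instanceof Long) {
            return "{\"long\": " + value + "}";
        }
        if (value instanceof Boolean) {
            return "{\"boolean\": " + value + "}";
        }
        if (value instanceof String) {
            // Escape only backslashes and double quotes; enough for this illustration.
            String escaped = ((String) value).replace("\\", "\\\\").replace("\"", "\\\"");
            return "{\"string\": \"" + escaped + "\"}";
        }
        throw new IllegalArgumentException("No branch sketched for " + value.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(wrap(256));    // int branch
        System.out.println(wrap("blue")); // string branch
        System.out.println(wrap(null));   // null branch
    }
}
```

Choosing the branch from the runtime type alone is a simplification: a real reader would need the schema to decide, for example, between the int and long branches.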

Am I doing something wrong? Is this a bug? I'm digging in now but am curious if anyone has seen this before?

I get the feeling I am working with Avro in a way that most people do not :)


