avro-dev mailing list archives

From "Sean Busbey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-997) Union of enum and null cannot be serialized
Date Thu, 11 Oct 2012 22:39:03 GMT

    [ https://issues.apache.org/jira/browse/AVRO-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474573#comment-13474573 ]

Sean Busbey commented on AVRO-997:
----------------------------------

Oh no, I definitely think the current behavior is a bug. Specifically, having validate pass
but write fail is incorrect. I also think that my enum shouldn't stop working just because
it's in a union, setting aside ambiguous cases such as a string-enum union.

Making everything consistent by checking whether the datum's toString is one of the symbols of
the union's enum member, after [GenericData.resolveUnion has done the existing membership checks (~line 574)|http://svn.apache.org/viewvc/avro/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java?view=markup&pathrev=1384500],
would solve the part of this that I consider a bug, while making things easier for existing
users (like the Hive issue that brought me here). I'll post a patch tomorrow showing how this
would work (and an alternate set of tests that fail prior to the fix).
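
As a rough illustration of the fallback described above, here is a self-contained sketch. It
is not the patch and does not use Avro's classes: Branch, SymbolWrapper, and this resolveUnion
are hypothetical stand-ins for a union branch, a generic enum symbol wrapper, and the
resolution step. The first pass stands in for the existing membership checks; the second pass
is the proposed toString check against the enum branch's symbols.

```java
import java.util.Arrays;
import java.util.List;

public class UnionResolutionSketch {

    /** Hypothetical stand-in for one branch of a union schema. */
    public static final class Branch {
        final String kind;          // "enum" or "null" in this sketch
        final List<String> symbols; // empty unless kind is "enum"

        public Branch(String kind, String... symbols) {
            this.kind = kind;
            this.symbols = Arrays.asList(symbols);
        }
    }

    /** Hypothetical stand-in for a generic enum symbol wrapper. */
    public static final class SymbolWrapper {
        final String symbol;

        public SymbolWrapper(String symbol) {
            this.symbol = symbol;
        }

        @Override
        public String toString() {
            return symbol;
        }
    }

    /** A plain Java enum, like the Gender enum in the issue description. */
    public enum Gender { M, F }

    /** Returns the index of the union branch that matches datum. */
    public static int resolveUnion(List<Branch> union, Object datum) {
        // Pass 1: strict membership checks (assumed shape of the existing behavior).
        for (int i = 0; i < union.size(); i++) {
            Branch b = union.get(i);
            if (b.kind.equals("null") && datum == null) {
                return i;
            }
            if (b.kind.equals("enum") && datum instanceof SymbolWrapper
                    && b.symbols.contains(datum.toString())) {
                return i;
            }
        }
        // Pass 2: proposed fallback, accept any datum whose toString() is an enum symbol.
        if (datum != null) {
            for (int i = 0; i < union.size(); i++) {
                Branch b = union.get(i);
                if (b.kind.equals("enum") && b.symbols.contains(datum.toString())) {
                    return i;
                }
            }
        }
        throw new IllegalArgumentException("Not in union: " + datum);
    }

    public static void main(String[] args) {
        List<Branch> union = Arrays.asList(new Branch("enum", "M", "F"), new Branch("null"));
        System.out.println(resolveUnion(union, Gender.M)); // prints 0 (matched by the fallback)
        System.out.println(resolveUnion(union, null));     // prints 1 (matched by the strict pass)
    }
}
```

Note that in this sketch a plain String "M" would also land on the enum branch, which is
exactly the string-enum ambiguity mentioned earlier; a union containing both a string and an
enum would need the strict pass to claim the string first.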

I think in any case there'll be a documentation component, either calling out the use of GenericEnumSymbol
or warning users about the possibly surprising case of enum members ending up as Avro strings
in serialized data.
                
> Union of enum and null cannot be serialized
> -------------------------------------------
>
>                 Key: AVRO-997
>                 URL: https://issues.apache.org/jira/browse/AVRO-997
>             Project: Avro
>          Issue Type: Bug
>    Affects Versions: 1.5.1
>            Reporter: Aaron Kimball
>            Assignee: Sean Busbey
>             Fix For: 1.8.0
>
>         Attachments: AVRO-997.patch, AVRO-997.patch
>
>
> I have a schema like:
> {code}
> [
> {
>   "type": "enum",
>   "name": "Gender",
>   "symbols": ["M", "F"]
> },
> {
>   "type" : "record",
>   "name" : "Foo",
>   "fields" : [
>     { "type" : ["Gender", "null"], "name" : "gender" },
>     ...
>   ]
> }
> ]
> {code}
> I build a record like {{Foo foo = new Foo(); foo.gender = Gender.M;}}
> When I go to serialize this, I get:
> {code}Not in union [{"type":"enum","name":"Gender","symbols":["M","F"]},"null"]: M
> 	at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:482)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:70)
> 	at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
