avro-dev mailing list archives

From "Doug Cutting (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AVRO-966) Bug in GenericData#resolveUnion when resolving union of null and array
Date Wed, 07 Dec 2011 20:28:40 GMT

     [ https://issues.apache.org/jira/browse/AVRO-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-966:
------------------------------

    Attachment: AVRO-966.patch

I would rather make isRecord() correct, as it may be used elsewhere.  Also, the performance of
getSchemaName() matters most for very large unions, and those usually contain many records: a
union may hold only a single instance of any unnamed type, but any number of records.  So I'd
rather not move the record test down in GenericData#resolveUnion().

Here's a version that fixes ReflectData#isRecord() for ByteBuffer by fixing getSchema(), so
that performance is not affected.  I also narrowed the fix for arrays to check only for Collection,
since that's the only case that creates problems for getSchema(), and made getSchema() throw an
exception if a Collection is passed, to better detect such problems in the future.
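
As a rough illustration of the ordering described above, here is a minimal, self-contained
sketch. It is not the code in the attached patch and does not use the Avro API: the class
UnionSketch, its methods, and the use of plain strings for schema types are all hypothetical
stand-ins. It assumes only what is stated above: the record test stays first, isRecord()
rejects a Collection before any schema lookup, and the schema lookup throws if a Collection
reaches it anyway.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical, simplified stand-in for union resolution; not Avro's actual code. */
class UnionSketch {

  /** Stand-in for a reflect-style schema lookup keyed on the datum's class. */
  static String schemaTypeFor(Class<?> c) {
    if (Collection.class.isAssignableFrom(c)) {
      // Fail loudly rather than silently inferring a record schema for a collection.
      throw new IllegalArgumentException("Cannot infer a schema from a Collection class: " + c);
    }
    if (ByteBuffer.class.isAssignableFrom(c)) return "bytes";
    if (c == Float.class) return "float";
    return "record"; // assumption of this sketch: any other class maps to a record
  }

  /** Record test that short-circuits for collections before the schema lookup. */
  static boolean isRecord(Object datum) {
    if (datum == null) return false;
    if (datum instanceof Collection) return false;
    return "record".equals(schemaTypeFor(datum.getClass()));
  }

  /** Names the union branch for a datum; the record test is deliberately first. */
  static String branchName(Object datum) {
    if (isRecord(datum)) return "record";
    if (datum == null) return "null";
    if (datum instanceof Collection) return "array";
    if (datum instanceof ByteBuffer) return "bytes";
    if (datum instanceof Float) return "float";
    throw new IllegalArgumentException("Unknown datum type: " + datum.getClass());
  }

  /** Returns the index of the matching branch, or throws if none matches. */
  static int resolveUnion(List<String> branches, Object datum) {
    int index = branches.indexOf(branchName(datum));
    if (index < 0) {
      throw new IllegalArgumentException("Not in union " + branches + ": " + datum);
    }
    return index;
  }

  public static void main(String[] args) {
    List<String> union = Arrays.asList("null", "array");
    List<Float> list = new ArrayList<Float>(Arrays.asList(1.1f, 2.2f));
    System.out.println(resolveUnion(union, list)); // index of the "array" branch
    System.out.println(resolveUnion(union, null)); // index of the "null" branch
  }
}
```

In this sketch a list resolves to the "array" branch because isRecord() returns false for it
without consulting schemaTypeFor(), while a direct call to schemaTypeFor() with a Collection
class fails immediately instead of yielding a misleading record match.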

I'll commit this version, okay?
                
> Bug in GenericData#resolveUnion when resolving union of null and array
> ----------------------------------------------------------------------
>
>                 Key: AVRO-966
>                 URL: https://issues.apache.org/jira/browse/AVRO-966
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.6.1
>            Reporter: Vyacheslav Zholudev
>            Assignee: Doug Cutting
>             Fix For: 1.6.2
>
>         Attachments: AVRO-966-2.patch, AVRO-966.patch, AVRO-966.patch
>
>
> I have a simple Avro schema from which I generate an Avro specific object:
> {{
> {"type": "record",
>   "name": "org.company.Test",
>   "fields": [
>     { "name": "arr", "type": ["null", {"type": "array", "items": "float"}], "default": null }
>   ]
> }
> }}
> The following simple piece of code reproduces the bug:
> {{
>   Test test = new Test();
>   List<Float> list = new ArrayList<Float>();
>   list.add(1.1f);
>   list.add(2.2f);
>   test.setArr(list);
>   
>   DataFileWriter<Test> myWriter = new DataFileWriter<Test>(new ReflectDatumWriter<Test>(test.getSchema()));
>   File f = new File("/tmp/test.avro");
>   myWriter.create(test.getSchema(), f);
>   myWriter.append(test);
>   myWriter.close();
> }}
> I get an exception:
> {{
> Exception in thread "main" org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"array","items":"float"}]: [1.1, 2.2]
> 	at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:261)
>         <my code>
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"array","items":"float"}]: [1.1, 2.2]
> 	at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:549)
> 	at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:137)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:70)
> 	at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:102)
> 	at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:105)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
> 	at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:102)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
> 	at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:255)
> }}
> My investigation showed that {{GenericData#resolveUnion}} calls {{getSchemaName()}}, and
> that method's check of whether {{datum}} is a record succeeds, even though {{datum}} is a list.
> This happens because, in {{ReflectData#createSchema}}, the body of the "if" for the case
> {{(type instanceof ParameterizedType)}} is not executed.
> I can supply more details if needed, or explain more clearly if the above is hard to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira