avro-dev mailing list archives

From "Sachin Goyal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-1562) Add support for types extending Maps/Collections
Date Fri, 15 Aug 2014 19:33:20 GMT

    [ https://issues.apache.org/jira/browse/AVRO-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098980#comment-14098980 ]

Sachin Goyal commented on AVRO-1562:
------------------------------------

I have a patch ready for this and will submit it shortly.

> Add support for types extending Maps/Collections
> ------------------------------------------------
>
>                 Key: AVRO-1562
>                 URL: https://issues.apache.org/jira/browse/AVRO-1562
>             Project: Avro
>          Issue Type: Bug
>    Affects Versions: 1.7.6
>            Reporter: Sachin Goyal
>
> Consider the following code:
> {code}
> import java.io.ByteArrayOutputStream;
> import java.util.*;
> import org.apache.avro.Schema;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.reflect.ReflectData;
> import org.apache.avro.reflect.ReflectDatumWriter;
> public class AvroDerivingMaps
> {
>     public static void main (String [] args) throws Exception
>     {
>         MapDerivedContainer orig = new MapDerivedContainer();
>         ReflectData rdata = ReflectData.AllowNull.get();
>         Schema schema = rdata.getSchema(MapDerivedContainer.class);
>         System.out.println(schema);
>         
>         ReflectDatumWriter<MapDerivedContainer> datumWriter = new ReflectDatumWriter<MapDerivedContainer>(MapDerivedContainer.class, rdata);
>         DataFileWriter<MapDerivedContainer> fileWriter = new DataFileWriter<MapDerivedContainer>(datumWriter);
>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>         fileWriter.create(schema, baos);
>         fileWriter.append(orig);
>         fileWriter.close();
>     }
> }
> class MapDerived extends HashMap<String, Integer>
> {
>     Integer a = 1;
>     String b = "b";
> }
> class MapDerivedContainer
> {
>     MapDerived2 map = new MapDerived2();
> }
> class MapDerived2 extends MapDerived
> {
>     String c = "c";
> }
> {code}
> \\
> \\
> It prints the following schema and then throws an exception:
> {code:javascript}
> {"type":"record","name":"MapDerivedContainer","namespace":"avro","fields":[{"name":"map","type":["null",{"type":"record","name":"MapDerived2","fields":[{"name":"c","type":["null","string"],"default":null},{"name":"a","type":["null","int"],"default":null},{"name":"b","type":["null","string"],"default":null}]}],"default":null}]}
> {code}
> {color:brown}
> Exception in thread "main" org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: 
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"MapDerived2","namespace":"avro","fields":[{"name":"c","type":["null","string"],"default":null},{"name":"a","type":["null","int"],"default":null},{"name":"b","type":["null","string"],"default":null}]}]: {}
> 	at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:600)
> 	at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:151)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
> 	at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)
> 	at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114)
> 	at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:203)
> 	at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
> 	at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
> 	at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:290)
> 	... 1 more
> {color}
> \\
> \\
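> It appears that ReflectData#createSchema() checks for "type instanceof ParameterizedType" before handling maps, so a class that only extends a map is not treated as a map and gets a record schema instead.
> GenericData#isMap() does not make the same check, so it still sees the object as a map, and GenericData#resolveUnion() fails because the union has no map branch.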
> The same may be true for classes extending ArrayList, Collection, Set etc.
> Also, note the schema for the class extending Map:
> {code:javascript}
> {  
>    "type":"record",
>    "name":"MapDerived2",
>    "fields":[  
>       {  
>          "name":"c",
>          "type":[  
>             "null",
>             "string"
>          ],
>          "default":null
>       },
>       {  
>          "name":"a",
>          "type":[  
>             "null",
>             "int"
>          ],
>          "default":null
>       },
>       {  
>          "name":"b",
>          "type":[  
>             "null",
>             "string"
>          ],
>          "default":null
>       }
>    ]
> }
> {code}
> This schema ignores the Map completely.
> Probably, for such a class, the schema should look like:
> {code:javascript}
> {
>    "type":"record",
>    "name":"MapDerived2",
>    "fields":[  
>       {  
>          "name":"c",
>          "type":[  
>             "null",
>             "string"
>          ],
>          "default":null
>       },
>       .... // Other fields in the class extending the Map
>       {
>          "name":"BASE_MAP",
>          "type":[
>             "null",
>             "map" ... // Normal map which the class extends (implements?)
>          ],
>          "default":null
>       }
>    ]
> }
> {code}
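
The "type instanceof ParameterizedType" observation in the description can be reproduced with plain Java reflection, without Avro on the classpath. The following is a minimal sketch: the names ParameterizedTypeCheck, Holder and isParameterized are hypothetical and exist only for this illustration, and the code does not show what ReflectData does internally. It only shows that a field declared with a class extending HashMap reports a plain Class as its generic type, while the instance is still a java.util.Map at runtime.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.HashMap;
import java.util.Map;

public class ParameterizedTypeCheck {
    // Same shape as MapDerived in the issue description.
    static class MapDerived extends HashMap<String, Integer> {
        Integer a = 1;
    }

    // Hypothetical holder with one plain generic map and one derived map.
    static class Holder {
        Map<String, Integer> plain;
        MapDerived derived;
    }

    // Returns true when the declared (generic) type of the named Holder field
    // is a ParameterizedType, i.e. carries explicit type arguments.
    static boolean isParameterized(String fieldName) {
        try {
            Type t = Holder.class.getDeclaredField(fieldName).getGenericType();
            return t instanceof ParameterizedType;
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Map<String, Integer> is a ParameterizedType.
        System.out.println("plain parameterized:   " + isParameterized("plain"));
        // MapDerived is a plain Class, so a ParameterizedType check skips it.
        System.out.println("derived parameterized: " + isParameterized("derived"));
        // The instance is nevertheless a Map at runtime.
        Object value = new MapDerived();
        System.out.println("derived instanceof Map: " + (value instanceof Map));
        // The type arguments are still reachable through the generic superclass.
        Type sup = MapDerived.class.getGenericSuperclass();
        System.out.println("superclass parameterized: " + (sup instanceof ParameterizedType));
    }
}
```

The last check suggests one place a fix could look for the key and value types, although whether the patch takes that route is not stated in this thread.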

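To make the proposed record shape concrete, here is a rough, Avro-free sketch of the data such a schema would carry: the declared fields of the derived classes plus the inherited map entries under a single "BASE_MAP" key. The class BaseMapSketch and the helper toRecordView are hypothetical names introduced only for this illustration; this is not the patch and not an Avro API.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class BaseMapSketch {
    // Same shape as the classes in the issue description.
    static class MapDerived extends HashMap<String, Integer> {
        Integer a = 1;
        String b = "b";
    }

    static class MapDerived2 extends MapDerived {
        String c = "c";
    }

    // Hypothetical helper: flattens a map-derived object into a record-like view.
    // Instance fields declared by user classes become top-level entries, and the
    // inherited map content is stored under the "BASE_MAP" key.
    static Map<String, Object> toRecordView(Map<?, ?> value) {
        Map<String, Object> record = new LinkedHashMap<>();
        // Walk up the hierarchy, stopping at JDK classes such as java.util.HashMap.
        for (Class<?> k = value.getClass();
             k != null && !k.getName().startsWith("java.");
             k = k.getSuperclass()) {
            for (Field f : k.getDeclaredFields()) {
                if (f.isSynthetic() || Modifier.isStatic(f.getModifiers())) {
                    continue; // skip compiler-generated and static fields
                }
                f.setAccessible(true);
                try {
                    record.put(f.getName(), f.get(value));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        // Copy the map entries so the view does not alias the original object.
        record.put("BASE_MAP", new LinkedHashMap<Object, Object>(value));
        return record;
    }

    public static void main(String[] args) {
        MapDerived2 m = new MapDerived2();
        m.put("x", 42);
        System.out.println(toRecordView(m));
    }
}
```

A field that is itself named "BASE_MAP" would collide with the synthetic key in this sketch, so a real implementation would need a naming rule for that case.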


--
This message was sent by Atlassian JIRA
(v6.2#6252)
