hive-issues mailing list archives

From "Swarnim Kulkarni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11288) Avro SerDe InstanceCache returns incorrect schema
Date Tue, 28 Jul 2015 17:44:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644736#comment-14644736 ]

Swarnim Kulkarni commented on HIVE-11288:
-----------------------------------------

{quote}
The instanceCache could probably be a singleton, and shouldn't really require an equals method
(unless I am mistaken).
{quote}

Unless I am missing something, that is a dangerous assumption to make, mainly because nothing
on InstanceCache currently states that it should be used as a singleton[1]. So I would vote
that we either make the intent explicit by making the constructor on the class private, or
implement hashCode and equals on the class. It would also be nice to mark those two methods
final so that subclasses cannot override them badly.

[1] https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java#L38
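
To make the two options concrete, here is a minimal sketch. The class names (SingletonCache, ComparableCache) and the String-to-String map are hypothetical stand-ins chosen for illustration; this is not the actual InstanceCache code.

```java
import java.util.HashMap;
import java.util.Map;

// Option 1 (hypothetical): a private constructor makes the singleton intent explicit.
final class SingletonCache {
    private static final SingletonCache INSTANCE = new SingletonCache();
    private final Map<String, String> entries = new HashMap<>();

    // No other class can create a second instance.
    private SingletonCache() { }

    static SingletonCache getInstance() {
        return INSTANCE;
    }

    // Builds a placeholder value on first request and reuses it afterwards.
    synchronized String retrieve(String seed) {
        return entries.computeIfAbsent(seed, s -> "instance for " + s);
    }
}

// Option 2 (hypothetical): several instances may exist, so they are compared by
// content; equals and hashCode are final so subclasses cannot override them.
class ComparableCache {
    private final Map<String, String> entries = new HashMap<>();

    void put(String seed, String instance) {
        entries.put(seed, instance);
    }

    @Override
    public final boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ComparableCache)) {
            return false;
        }
        return entries.equals(((ComparableCache) other).entries);
    }

    @Override
    public final int hashCode() {
        return entries.hashCode();
    }
}
```

With option 1, every caller shares the one instance, so equality between caches never comes up. With option 2, two caches holding the same entries compare equal, at the cost of a hash code that changes as entries are added.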

> Avro SerDe InstanceCache returns incorrect schema
> -------------------------------------------------
>
>                 Key: HIVE-11288
>                 URL: https://issues.apache.org/jira/browse/HIVE-11288
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Greg Phillips
>            Assignee: Greg Phillips
>         Attachments: HIVE-11288.2.patch, HIVE-11288.3.patch, HIVE-11288.4.patch, HIVE-11288.patch
>
>
> To reproduce this error, take two fields in an Avro schema document matching the following:
> "type" : { "type": "array", "items": [ "null", { "type": "map", "values": [ "null", "string" ] } ] }
> "type" : { "type": "map", "values": [ "null", { "type": "array", "items": [ "null", "string" ] } ] }
> After creating two tables in Hive with these schemas, the describe statement on each
> of them will only return the schema for the first one loaded. This is due to a hashCode()
> collision in the InstanceCache.
> A patch will be included in this ticket shortly which removes the hashCode call from
> the InstanceCache's internal HashMap, and instead provides the entire schema object.
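
The failure mode described in the issue can be sketched in isolation. The two cache classes below are hypothetical and only mimic the idea of keying a map by hashCode() versus by the object itself. Plain strings stand in for the schemas: "Aa" and "BB" are two distinct Java strings whose hashCode() is the same (2112).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache keyed by the seed's hashCode(): two distinct seeds with
// the same hash code end up sharing a single entry.
final class HashKeyedCache<S, V> {
    private final Map<Integer, V> cache = new HashMap<>();
    private final Function<S, V> factory;

    HashKeyedCache(Function<S, V> factory) {
        this.factory = factory;
    }

    V retrieve(S seed) {
        return cache.computeIfAbsent(seed.hashCode(), h -> factory.apply(seed));
    }
}

// Hypothetical cache keyed by the seed object: HashMap falls back to equals()
// when hash codes collide, so distinct seeds keep distinct entries.
final class ObjectKeyedCache<S, V> {
    private final Map<S, V> cache = new HashMap<>();
    private final Function<S, V> factory;

    ObjectKeyedCache(Function<S, V> factory) {
        this.factory = factory;
    }

    V retrieve(S seed) {
        return cache.computeIfAbsent(seed, factory);
    }
}

final class CollisionDemo {
    public static void main(String[] args) {
        HashKeyedCache<String, String> byHash = new HashKeyedCache<>(s -> "schema for " + s);
        ObjectKeyedCache<String, String> byObject = new ObjectKeyedCache<>(s -> "schema for " + s);

        System.out.println(byHash.retrieve("Aa"));   // schema for Aa
        System.out.println(byHash.retrieve("BB"));   // schema for Aa (wrong entry returned)
        System.out.println(byObject.retrieve("Aa")); // schema for Aa
        System.out.println(byObject.retrieve("BB")); // schema for BB
    }
}
```

Keying by the object works only if the key type has a correct equals(), which is why the comment above about equals and hashCode matters for whatever is used as the key.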



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
