hive-dev mailing list archives

From "Edward Capriolo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4732) Speed up AvroSerde by checking hashcodes instead of equality
Date Mon, 01 Jul 2013 19:44:20 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697096#comment-13697096 ]

Edward Capriolo commented on HIVE-4732:
---------------------------------------


We cannot use hashCode() where equals() should be used. It is non-intuitive: just because
two things have the same hashCode() does not mean that they are equal. Future versions may
not work the same way.

How about something like this:

if (this.hashCode() == other.hashCode() && this.equals(other)) {
  // hash codes match and equals() confirms it
}

In this way you still get the short circuit optimized behavior you want for performance for
most cases, but the logic is still correct if a hashCode collision happens.
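
To make the suggestion concrete, here is a minimal sketch of the idea, assuming the hash code is computed once and cached. HashCached is a hypothetical wrapper written for illustration only; it is not part of Hive, Avro, or the attached patch.

```java
import java.util.Objects;

/**
 * Hypothetical wrapper (not part of Hive or Avro): caches the wrapped
 * object's hash code so repeated comparisons can reject most mismatches
 * cheaply, while still falling back to equals() for correctness.
 */
final class HashCached<T> {
    private final T value;
    private final int hash; // computed once, reused on every comparison

    HashCached(T value) {
        this.value = Objects.requireNonNull(value);
        this.hash = value.hashCode();
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof HashCached)) {
            return false;
        }
        HashCached<?> other = (HashCached<?>) o;
        // Cheap int comparison first; a hash match alone proves nothing,
        // so equals() on the wrapped values has the final say.
        return hash == other.hash && value.equals(other.value);
    }
}
```

For example, the strings "Aa" and "BB" have the same hashCode() in Java, so a hash-only comparison would report them as equal. With the check above, the wrappers around them share a hash code but equals() still returns false.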

                
> Speed up AvroSerde by checking hashcodes instead of equality
> ------------------------------------------------------------
>
>                 Key: HIVE-4732
>                 URL: https://issues.apache.org/jira/browse/HIVE-4732
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>            Reporter: Mark Wagner
>            Assignee: Mark Wagner
>         Attachments: HIVE-4732.1.patch
>
>
> The AvroSerde spends a significant amount of time checking schema equality. Changing
> to compare hashcodes (which can be computed once then reused) will improve performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
