avro-dev mailing list archives

From "Douglas Kaminsky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-853) Cache hash codes in Schema and Field
Date Thu, 07 Jul 2011 13:32:20 GMT

    [ https://issues.apache.org/jira/browse/AVRO-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061293#comment-13061293 ]

Douglas Kaminsky commented on AVRO-853:
---------------------------------------

Further, I question whether properties actually need to be part of the hash code. Certainly
they factor into the equality check, but what is the real harm if two otherwise identical
schemas with slightly different properties end up in the same hash bucket? How often will
this actually happen? I can't imagine this would significantly impact hashing performance.

Equal objects must have equal hash codes, but equal hash codes don't imply equal objects.
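
To illustrate the point, here is a minimal sketch in which properties take part in equals() but are left out of hashCode(). SimpleSchema and its fields are hypothetical stand-ins made up for this example, not the actual Avro classes or the attached patches.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class HashWithoutProps {
    // Hypothetical stand-in for a schema: a name plus free-form properties.
    static final class SimpleSchema {
        final String name;
        final Map<String, String> props;

        SimpleSchema(String name, Map<String, String> props) {
            this.name = name;
            this.props = new HashMap<>(props); // defensive copy
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SimpleSchema)) return false;
            SimpleSchema that = (SimpleSchema) o;
            // Properties still take part in the equality check.
            return name.equals(that.name) && props.equals(that.props);
        }

        @Override
        public int hashCode() {
            // Properties are deliberately left out of the hash. Equal objects
            // still get equal hash codes, so the contract holds.
            return name.hashCode();
        }
    }

    public static void main(String[] args) {
        SimpleSchema a = new SimpleSchema("User", Collections.singletonMap("doc", "v1"));
        SimpleSchema b = new SimpleSchema("User", Collections.singletonMap("doc", "v2"));

        Map<SimpleSchema, String> map = new HashMap<>();
        map.put(a, "first");
        map.put(b, "second");

        System.out.println("same hash: " + (a.hashCode() == b.hashCode()));
        System.out.println("equal:     " + a.equals(b));
        System.out.println("entries:   " + map.size());
    }
}
```

The two objects collide in one bucket, and the map falls back to equals() to tell them apart. The only cost is one extra equality check per collision.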

> Cache hash codes in Schema and Field
> ------------------------------------
>
>                 Key: AVRO-853
>                 URL: https://issues.apache.org/jira/browse/AVRO-853
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.5.1
>            Reporter: Douglas Kaminsky
>         Attachments: AVRO-853-approach2.patch, AVRO-853.patch
>
>
> We are experiencing a serious performance degradation when trying to store/retrieve fields
> and schemas in hash-based data structures (e.g. HashMap). Since all fields and schemas are
> immutable (with the exception of RecordSchema allowing deferred setting of Fields) it makes
> sense to cache the hash code on the object instead of recalculating every time the hashCode
> method gets called.
> (Are there other mutable Schema sub-types that I'm not thinking about?)
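
The caching idea described in the issue could look roughly like the sketch below. ImmutableRecord, cachedHash, and the computations counter are hypothetical names invented for illustration; this is not the code from either attached patch. It borrows the lazy-caching idiom java.lang.String uses, where 0 means "not computed yet".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CachedHashDemo {
    // Hypothetical immutable type; not an Avro class.
    static final class ImmutableRecord {
        private final String name;
        private final List<String> fieldNames;
        private int cachedHash;   // 0 means "not computed yet"
        int computations = 0;     // instrumentation for this demo only

        ImmutableRecord(String name, List<String> fieldNames) {
            this.name = name;
            this.fieldNames = Collections.unmodifiableList(new ArrayList<>(fieldNames));
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ImmutableRecord)) return false;
            ImmutableRecord that = (ImmutableRecord) o;
            return name.equals(that.name) && fieldNames.equals(that.fieldNames);
        }

        @Override
        public int hashCode() {
            int h = cachedHash;
            if (h == 0) {
                // Computed on first use. A hash that really is 0 would be
                // recomputed on every call, which is correct but not cached.
                computations++;
                h = 31 * name.hashCode() + fieldNames.hashCode();
                cachedHash = h;
            }
            return h;
        }
    }

    public static void main(String[] args) {
        ImmutableRecord r = new ImmutableRecord("User", Arrays.asList("id", "email"));
        r.hashCode();
        r.hashCode();
        System.out.println("hash computations: " + r.computations);
    }
}
```

A type that allows deferred setting of its fields, as the issue notes for RecordSchema, would presumably need to delay caching until the fields are set or reset the cached value when they change.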

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
