db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-3981) Improve distribution of hash codes in SQLBinary and SQLChar
Date Fri, 02 Jan 2009 10:23:45 GMT

    [ https://issues.apache.org/jira/browse/DERBY-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660290#action_12660290 ]

Knut Anders Hatlen commented on DERBY-3981:

Committed the performance test with revision 730689.

> Improve distribution of hash codes in SQLBinary and SQLChar
> -----------------------------------------------------------
>                 Key: DERBY-3981
>                 URL: https://issues.apache.org/jira/browse/DERBY-3981
>             Project: Derby
>          Issue Type: Improvement
>          Components: Newcomer, Performance, SQL
>    Affects Versions:
>            Reporter: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: distinct-test.diff
>
> SQLBinary.hashCode() and SQLChar.hashCode() use a very simple algorithm that just takes
> the sum of the values in the array. This gives a poor distribution of hash values because
> similar values will have a higher probability of mapping to the same hash code, and the higher
> bits won't be used unless the array is very long. We should change these methods so that they
> use an algorithm similar to the one used in java.lang.String.hashCode(), described here:
> <URL:http://java.sun.com/javase/6/docs/api/java/lang/String.html#hashCode()>.
> This may have a positive effect on the performance of hash scans as it will reduce the likelihood
> of collisions in the hash table.
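
To illustrate the difference described above, here is a minimal sketch that contrasts a sum-based hash with a polynomial hash in the style of java.lang.String.hashCode(). The class HashSketch and the methods sumHash and polyHash are hypothetical names chosen for this illustration; this is not the Derby code, and details such as signed versus unsigned bytes or any special handling of trailing padding are assumptions left out here.

```java
// Hypothetical illustration only; not taken from SQLBinary or SQLChar.
final class HashSketch {

    private HashSketch() {
    }

    /** Sum of the array elements: the order of the elements does not matter. */
    static int sumHash(byte[] data) {
        int h = 0;
        for (byte b : data) {
            h += b;
        }
        return h;
    }

    /** String.hashCode()-style polynomial: h = 31 * h + b for each element. */
    static int polyHash(byte[] data) {
        int h = 0;
        for (byte b : data) {
            h = 31 * h + b;
        }
        return h;
    }
}
```

With the ASCII bytes of "ab" and "ba", sumHash returns 195 for both inputs, so the two values collide, while polyHash returns 3105 and 3135. Because each step multiplies by 31, the position of an element affects the result and the higher bits come into play after only a few elements.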

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
