db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-6080) Cast to same type changes result with IN subquery
Date Fri, 13 Jun 2014 12:36:01 GMT

    [ https://issues.apache.org/jira/browse/DERBY-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030575#comment-14030575

Knut Anders Hatlen commented on DERBY-6080:

It looks like this is caused by an inconsistency between the equals(Object) and hashCode()
methods in the number data types, as can be seen by this code example:

        SQLDouble d = new SQLDouble((double) Long.MAX_VALUE);
        SQLLongint b = new SQLLongint(Long.MAX_VALUE - 1);
        System.out.println("d equals b: " + d.equals(b));
        System.out.println("b equals d: " + b.equals(d));
        System.out.println("b hashCode: " + b.hashCode());
        System.out.println("d hashCode: " + d.hashCode());

The above code causes the following to be printed:

d equals b: true
b equals d: true
b hashCode: -2147483647
d hashCode: -2147483648

The DOUBLE and BIGINT values are considered equal by the equals(Object) method, but hashCode()
returns different values. That clearly violates the contract of java.lang.Object.hashCode(),
which states that "\[if\] two objects are equal according to the equals(Object) method, then
calling the hashCode method on each of the two objects must produce the same integer result."

Because of this inconsistency, the nested loop join believes the two objects are equal, whereas
the hash join puts the two values in different buckets because of their different hash codes,
and considers them not equal.

The reason why the equals() method believes that they are equal, is that it performs the comparison
using the type that has the highest type precedence. For comparisons between DOUBLE and BIGINT,
DOUBLE has the highest precedence, so they are compared by using SQLDouble.typeCompare().
SQLDouble.typeCompare() calls getDouble() on both values, and loses precision so that both
9223372036854775806 and 9223372036854775807 end up as 9.223372036854776E18.

I'm not quite sure if the join should have returned zero, one or two rows. SQL:2011, part
2, 8.2 <comparison predicate>, GR 2 says: "Numbers are compared with respect to their
algebraic value."

Reasons for zero rows: Neither 9223372036854775806 nor 9223372036854775807 is algebraically
equal to 9.223372036854776E18, so the join condition should evaluate to false for both rows.

Reasons for one row: When casting 9.223372036854776E18 to a long/bigint, we get the value
9223372036854775807, so the difference is just that the printed representation of the double
is an approximation of the actual value.

Reasons for two rows: None, except if we could find evidence in the standard that it is correct
to convert both operands of such a comparison to DOUBLE first.

> Cast to same type changes result with IN subquery
> -------------------------------------------------
>                 Key: DERBY-6080
>                 URL: https://issues.apache.org/jira/browse/DERBY-6080
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions:,,,,,,,,,,,,,,,,,,,,,
>            Reporter: Knut Anders Hatlen
>            Priority: Minor
>              Labels: derby_triage10_11
> Given these tables
> ij> create table t4 (b bigint);
> 0 rows inserted/updated/deleted
> ij> insert into t4 values 9223372036854775806, 9223372036854775807;
> 2 rows inserted/updated/deleted
> ij> create table t5 (d double);
> 0 rows inserted/updated/deleted
> ij> insert into t5 values 9.223372036854776E18;
> 1 row inserted/updated/deleted
> the following two SELECT queries return different results:
> ij> select * from t4 where b in (select d from t5);
> B                   
> --------------------
> 9223372036854775807 
> 1 row selected
> ij> select * from t4 where b in (select cast(d as double) from t5);
> B                   
> --------------------
> 9223372036854775806 
> 9223372036854775807 
> 2 rows selected
> Since the only difference is D vs CAST(D AS DOUBLE), and the type of D is already DOUBLE,
I'd expect the two queries to be equivalent.

This message was sent by Atlassian JIRA

View raw message