db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-6017) IN lists with constants may return wrong results
Date Wed, 19 Dec 2012 13:13:13 GMT

    [ https://issues.apache.org/jira/browse/DERBY-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535934#comment-13535934
] 

Knut Anders Hatlen commented on DERBY-6017:
-------------------------------------------

I've tried to interpret what the standard says. Here are the relevant parts I've found:

> 8.4 <in predicate> - Syntax Rules
>
> 2) Let IVL be an <in value list>.
> ( IVL )
> is equivalent to the <table value constructor>:
> ( VALUES IVL )

So, according to this rule, the following two queries should be equivalent (which they are
not currently):

ij> select * from t where x in (9223372036854775805, 9223372036854775806, 9223372036854775807,
9.223372036854776E18);
X                   
--------------------
9223372036854775805 

1 row selected
ij> select * from t where x in (values 9223372036854775805, 9223372036854775806, 9223372036854775807,
9.223372036854776E18);
X                   
--------------------
9223372036854775805 
9223372036854775806 
9223372036854775807 

3 rows selected

Furthermore, it says:

> 8.4 <in predicate> - Syntax Rules
>
> 5) The expression
> RVC IN IPV
> is equivalent to
> RVC = ANY IPV

So to find the correct semantics for IN, we need to rewrite the query to ANY. That is,

select * from t where x = any (values 9223372036854775805, 9223372036854775806, 9223372036854775807,
9.223372036854776E18);

and see what the standard says about that. (This particular ANY query returns three rows in
Derby, which is the same as the IN (VALUES ...) query above.)

This leads us to:

> 8.8 <quantified comparison predicate> - Syntax Rules
>
> 1) Let RV1 and RV2 be <row value predicand>s whose declared types are respectively
that of the <row value
> predicand> and the row type of the <table subquery>. The Syntax Rules of Subclause
8.2, “<comparison
> predicate>”, are applied to:
> RV1 <comp op> RV2

That is, for the comparisons, the value on the right hand side should have the row type of
the sub-query.

And the row type of our VALUES sub-query is DOUBLE (or at least some approximate numeric type)
as 7.3 <table value constructor> says row type is determined by applying Subclause 9.3,
“Data types of results of aggregations”, whose syntax rule 3d says:

> If any data type in DTS is approximate numeric, then each data type in DTS shall be numeric
and the
> result data type is approximate numeric with implementation-defined precision.

Derby does produce the right type for the <table value constructor>:

ij> values 9223372036854775805, 9223372036854775806, 9223372036854775807, 9.223372036854776E18;
1                     
----------------------
9.223372036854776E18  
9.223372036854776E18  
9.223372036854776E18  
9.223372036854776E18  

4 rows selected

The ANY query should therefore end up like:

  select * from t where x = 9.223372036854776E18 or x = 9.223372036854776E18 or x = 9.223372036854776E18
or x = 9.223372036854776E18;

Or even simpler, because the DOUBLE representation of all four values happens to be the same:

  select * from t where x = 9.223372036854776E18;

Now, 8.2 <comparison predicate> - General Rules, says this:

> 2) Numbers are compared with respect to their algebraic value.

No more details than that, I'm afraid. And no mentioning about converting the operands to
the dominant type, so far as I can see.

Derby currently returns these three rows for the query:

ij> select * from t where x = 9.223372036854776E18;
X                   
--------------------
9223372036854775805 
9223372036854775806 
9223372036854775807 

3 rows selected

I'm not completely convinced that all those three values have the same algebraic value as
9.223372036854776E18. But in any case I think changing how Derby performs numeric comparisons
is outside the scope of this issue.

So how's this for a plan? In this issue, let's assume Derby's equality comparison operator
does the right thing. The goal for now should be to make an <in value list> behave the
same way as the ANY query the SQL standard says it should be equivalent to. We should have
tests that use the results from the equivalent ANY queries as canons, and those tests would
also alert us if we later make changes to the comparison operator in a way that makes ANY
and IN behave inconsistently.
                
> IN lists with constants may return wrong results
> ------------------------------------------------
>
>                 Key: DERBY-6017
>                 URL: https://issues.apache.org/jira/browse/DERBY-6017
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.9.1.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>
> Given this table:
> ij> connect 'jdbc:derby:memory:db;create=true';
> ij> create table t(x bigint);
> 0 rows inserted/updated/deleted
> ij> insert into t values 9223372036854775805, 9223372036854775806, 9223372036854775807;
> 3 rows inserted/updated/deleted
> A query that uses an IN list that contains all the three values actually stored in the
table, returns all three rows as expected:
> ij> select * from t where x in (9223372036854775805, 9223372036854775806, 9223372036854775807);
> X                   
> --------------------
> 9223372036854775805 
> 9223372036854775806 
> 9223372036854775807 
> 3 rows selected
> However, if we add a value whose type precedence is higher, like a DOUBLE value, and
that value happens to be equal to the approximation of the other values in the IN list when
they are cast from BIGINT to DOUBLE, only one row is returned:
> ij> select * from t where x in (9223372036854775805, 9223372036854775806, 9223372036854775807,
9.223372036854776E18);
> X                   
> --------------------
> 9223372036854775805 
> 1 row selected
> I believe this query should return all three rows too.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message