db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "A B (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2256) Wrong Results: Use of decimal values in an IN-list with INTEGER left operand can lead to extra rows.
Date Tue, 27 Mar 2007 17:37:32 GMT

    [ https://issues.apache.org/jira/browse/DERBY-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12484514
] 

A B commented on DERBY-2256:
----------------------------

> Did you consider having this query return an error? If the user provides a
> floating point value for an integer column, wouldn't they prefer to have an
> error message, rather than have the database quietly return zero rows?

Well first I should point out that "quietly return[ing] zero rows" is not the correct behavior
in the example you gave.  That's what Derby currently does (before the patch for this issue)
but that is not correct.  The correct behavior is to return a single row matching the "4".

Given that, the question becomes "Did you consider having this query return an error?"  And
this is a great question--thank you for bringing it up.

The short answer is No, I didn't consider throwing an error--but that was just because it
didn't occur to me ;)  The longer and more relevant answer is that, after looking at the SQL
2003 spec for IN lists, my own reading is that we should not throw an error in the example
that you mentioned.  Here's why...(feel free to skip if you're not interested):

----

>From SQL 2003 spec, 8.4 <in predicate> grammar shows:

  <in predicate> ::= <row value predicand> <in predicate part 2>
  <in predicate part 2> ::= [ NOT ] IN <in predicate value>
  <in value list> ::= <row value expression> [ { <comma> <row value expression>
}... ]
  <in predicate value> ::=
       <table subquery>
       | <left paren> <in value list> <right paren>

Note that <in value list> in this grammar does not include parentheses. Now if we look
at syntax Rule #2 in the same section, we see:

<begin quote>

  Let IVL be an <in value list>.

    ( IVL )

  is equivalent to the <table value constructor>:

    ( VALUES IVL )

<end quote>

So if our example query is:

  select * from t1 where i in (4.23, 4);

then <in value list> is simply "4.23, 4", and thus

    ( IVL )  ==> ( 4.23, 4 )

  is equivalent to the <table value constructor>:

    ( VALUES 4.23, 4 )

This particular VALUES clause is legal in Derby--i.e. we do not throw an error:

ij> values 4.23, 4;
1
---------------
4.23
4.00

2 rows selected

The next question is whether or not Derby *should* throw an error here.  The specs for a VALUES
clause are given in section 7.3 as "<table value constructor>" and include the following:

 Section 7.3, Syntax Rule #4:

  The row type of TVC is determined by applying Subclause 9.3, "Data types of results
  of aggregations", to the row types [of the values in the list].

If we go on to look at subclause 9.3 we see the following rule:

 Section 9.3, Syntax Rule #2:

  All of the data types in DTS shall be comparable.

In our example we have an integer column and a decimal value, and those two types are indeed
comparable with each other.  So we satisfy rule #2 and therefore should not throw an error
here.

That said, we now go back to section 8.4 (IN predicate) and look at syntax rules #3 and #5,
where we see the following:

<begin quote>

 3) Let RVC be the <row value predicand> and let IPV be the <in predicate value>.

 5) The expression

     RVC IN IPV

    is equivalent to

     RVC = ANY IPV

<end quote>

In our example RVC is the column "i" and IPV is "(4.23, 4)".  So then we have:

    i IN (4.23, 4)

  is equivalent to

    i = ANY (4.23, 4)

Then if we use syntax rule #2 again, we end up with:

    i = ANY (VALUES 4.23, 4)

Putting that back into our original query we now have:

   select * from t1 where i = any (values 4.23, 4);

This latter query executes without error in Derby and returns 1 row (even before the patch
for this issue is applied):

ij> select * from t1 where i = any (values 4.23, 4);
I
-----------
4

1 row selected

Since the type of the VALUES clause is decimal (section 9.3, rule #3), we are doing a comparison
of an integer with a decimal.  Such a comparison should not throw an error and should only
return true if the values are algebraically equal.  Thus the above query is correctly returning
a single row--and that's what the equivalent IN list query should be doing, too.

----

Putting all of that together, my conclusion is that we should not throw an error if if the
user specifies a decimal or floating point value for an integer column.  Whether or not the
user would "prefer" an error I can't say, but standards-wise I think the correct thing is
to compare using the dominant type and only return the rows that match.

Feel free to comment if I've overlooked something or otherwise misread the spec.  Spec-reading
is *not* one of my gifts...

> Wrong Results: Use of decimal values in an IN-list with INTEGER left operand can lead
to extra rows.
> ----------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2256
>                 URL: https://issues.apache.org/jira/browse/DERBY-2256
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.0.2.0, 10.0.2.1, 10.0.2.2, 10.1.1.0, 10.1.2.1, 10.1.3.1, 10.1.3.2,
10.1.4.0, 10.2.1.6, 10.2.2.0, 10.2.2.1, 10.2.3.0, 10.3.0.0
>            Reporter: A B
>         Assigned To: A B
>         Attachments: d2256_v1.patch, d2256_v1.stat
>
>
> While trying out some code changes for DERBY-47 I was running a few test cases and happened
to notice that there are a couple of cases in which Derby behaves incorrectly (that or my
understanding of what should be happening here is way off).
> First and most simply: the following query should return zero rows (unless I'm missing
something?), but it returns one:
> ij> create table t1 (i int);
> 0 rows inserted/updated/deleted
> ij> insert into t1 values 1, 2, 3, 4, 5;
> 5 rows inserted/updated/deleted
> -- Correct returns zero rows.
> ij> select * from t1 where i in (4.23);
> I
> -----------
> 0 rows selected
> -- But this one returns 1 row...
> ij> select * from t1 where i in (2.8, 4.23);
> I
> -----------
> 4
> 1 row selected
> Secondly, if the IN-list contains a non-constant value then Derby can incorrectly return
rows that do not match the IN predicate.  I think this is because some internal casting is
happening when it shouldn't?
> ij> create table t1 (i int);
> 0 rows inserted/updated/deleted
> ij> insert into t1 values 1, 2, 3, 4, 5;
> 5 rows inserted/updated/deleted
> -- Following values clause returns "2.80", as expected.
> ij> values cast (2.8 as decimal(4, 2));
> 1
> -------
> 2.80
> 1 row selected
> -- But if we use it in an IN-list it gets cast to "2" and thus returns a match.
> -- We get 2 rows when we should get NONE.
> ij> select * from t1 where i in (cast (2.8 as decimal(4, 2)), 4.23);
> I
> -----------
> 2
> 4
> 2 rows selected
> I confirmed that we see these results on trunk, 10.2, 10.1, and even as far back as svn
#201660 for 10.0.  I also ran the above statements on DB2 v8 as a sanity check to confirm
that NO results were returned there.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message