Return-Path: X-Original-To: apmail-db-derby-dev-archive@www.apache.org Delivered-To: apmail-db-derby-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id AE9ADE3E1 for ; Wed, 19 Dec 2012 13:13:17 +0000 (UTC) Received: (qmail 673 invoked by uid 500); 19 Dec 2012 13:13:17 -0000 Delivered-To: apmail-db-derby-dev-archive@db.apache.org Received: (qmail 99684 invoked by uid 500); 19 Dec 2012 13:13:15 -0000 Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: Delivered-To: mailing list derby-dev@db.apache.org Received: (qmail 99625 invoked by uid 99); 19 Dec 2012 13:13:13 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 19 Dec 2012 13:13:13 +0000 Date: Wed, 19 Dec 2012 13:13:13 +0000 (UTC) From: "Knut Anders Hatlen (JIRA)" To: derby-dev@db.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (DERBY-6017) IN lists with constants may return wrong results MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/DERBY-6017?page=3Dcom.atlassian= .jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=3D1353= 5934#comment-13535934 ]=20 Knut Anders Hatlen commented on DERBY-6017: ------------------------------------------- I've tried to interpret what the standard says. Here are the relevant parts= I've found: > 8.4 - Syntax Rules > > 2) Let IVL be an . > ( IVL ) > is equivalent to the : > ( VALUES IVL ) So, according to this rule, the following two queries should be equivalent = (which they are not currently): ij> select * from t where x in (9223372036854775805, 9223372036854775806, 9= 223372036854775807, 9.223372036854776E18); X =20 -------------------- 9223372036854775805=20 1 row selected ij> select * from t where x in (values 9223372036854775805, 922337203685477= 5806, 9223372036854775807, 9.223372036854776E18); X =20 -------------------- 9223372036854775805=20 9223372036854775806=20 9223372036854775807=20 3 rows selected Furthermore, it says: > 8.4 - Syntax Rules > > 5) The expression > RVC IN IPV > is equivalent to > RVC =3D ANY IPV So to find the correct semantics for IN, we need to rewrite the query to AN= Y. That is, select * from t where x =3D any (values 9223372036854775805, 92233720368547= 75806, 9223372036854775807, 9.223372036854776E18); and see what the standard says about that. (This particular ANY query retur= ns three rows in Derby, which is the same as the IN (VALUES ...) query abov= e.) This leads us to: > 8.8 - Syntax Rules > > 1) Let RV1 and RV2 be s whose declared types are res= pectively that of the predicand> and the row type of the
. The Syntax Rules of = Subclause 8.2, =E2=80=9C predicate>=E2=80=9D, are applied to: > RV1 RV2 That is, for the comparisons, the value on the right hand side should have = the row type of the sub-query. And the row type of our VALUES sub-query is DOUBLE (or at least some approx= imate numeric type) as 7.3
says row type is deter= mined by applying Subclause 9.3, =E2=80=9CData types of results of aggregat= ions=E2=80=9D, whose syntax rule 3d says: > If any data type in DTS is approximate numeric, then each data type in DT= S shall be numeric and the > result data type is approximate numeric with implementation-defined preci= sion. Derby does produce the right type for the
: ij> values 9223372036854775805, 9223372036854775806, 9223372036854775807, 9= .223372036854776E18; 1 =20 ---------------------- 9.223372036854776E18 =20 9.223372036854776E18 =20 9.223372036854776E18 =20 9.223372036854776E18 =20 4 rows selected The ANY query should therefore end up like: select * from t where x =3D 9.223372036854776E18 or x =3D 9.2233720368547= 76E18 or x =3D 9.223372036854776E18 or x =3D 9.223372036854776E18; Or even simpler, because the DOUBLE representation of all four values happe= ns to be the same: select * from t where x =3D 9.223372036854776E18; Now, 8.2 - General Rules, says this: > 2) Numbers are compared with respect to their algebraic value. No more details than that, I'm afraid. And no mentioning about converting t= he operands to the dominant type, so far as I can see. Derby currently returns these three rows for the query: ij> select * from t where x =3D 9.223372036854776E18; X =20 -------------------- 9223372036854775805=20 9223372036854775806=20 9223372036854775807=20 3 rows selected I'm not completely convinced that all those three values have the same alge= braic value as 9.223372036854776E18. But in any case I think changing how D= erby performs numeric comparisons is outside the scope of this issue. So how's this for a plan? In this issue, let's assume Derby's equality comp= arison operator does the right thing. The goal for now should be to make an= behave the same way as the ANY query the SQL standard says= it should be equivalent to. We should have tests that use the results from= the equivalent ANY queries as canons, and those tests would also alert us = if we later make changes to the comparison operator in a way that makes ANY= and IN behave inconsistently. =20 > IN lists with constants may return wrong results > ------------------------------------------------ > > Key: DERBY-6017 > URL: https://issues.apache.org/jira/browse/DERBY-6017 > Project: Derby > Issue Type: Bug > Components: SQL > Affects Versions: 10.9.1.0 > Reporter: Knut Anders Hatlen > Assignee: Knut Anders Hatlen > > Given this table: > ij> connect 'jdbc:derby:memory:db;create=3Dtrue'; > ij> create table t(x bigint); > 0 rows inserted/updated/deleted > ij> insert into t values 9223372036854775805, 9223372036854775806, 922337= 2036854775807; > 3 rows inserted/updated/deleted > A query that uses an IN list that contains all the three values actually = stored in the table, returns all three rows as expected: > ij> select * from t where x in (9223372036854775805, 9223372036854775806,= 9223372036854775807); > X =20 > -------------------- > 9223372036854775805=20 > 9223372036854775806=20 > 9223372036854775807=20 > 3 rows selected > However, if we add a value whose type precedence is higher, like a DOUBLE= value, and that value happens to be equal to the approximation of the othe= r values in the IN list when they are cast from BIGINT to DOUBLE, only one = row is returned: > ij> select * from t where x in (9223372036854775805, 9223372036854775806,= 9223372036854775807, 9.223372036854776E18); > X =20 > -------------------- > 9223372036854775805=20 > 1 row selected > I believe this query should return all three rows too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrato= rs For more information on JIRA, see: http://www.atlassian.com/software/jira