Return-Path: X-Original-To: apmail-db-derby-dev-archive@www.apache.org Delivered-To: apmail-db-derby-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A26B77850 for ; Wed, 7 Dec 2011 05:49:11 +0000 (UTC) Received: (qmail 81694 invoked by uid 500); 7 Dec 2011 05:49:11 -0000 Delivered-To: apmail-db-derby-dev-archive@db.apache.org Received: (qmail 81571 invoked by uid 500); 7 Dec 2011 05:49:07 -0000 Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: Delivered-To: mailing list derby-dev@db.apache.org Received: (qmail 81550 invoked by uid 99); 7 Dec 2011 05:49:05 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 07 Dec 2011 05:49:05 +0000 X-ASF-Spam-Status: No, hits=-2001.2 required=5.0 tests=ALL_TRUSTED,RP_MATCHES_RCVD X-Spam-Check-By: apache.org Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 07 Dec 2011 05:49:01 +0000 Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id 4F10A1043DA for ; Wed, 7 Dec 2011 05:48:40 +0000 (UTC) Date: Wed, 7 Dec 2011 05:48:40 +0000 (UTC) From: "Mamta A. Satoor (Commented) (JIRA)" To: derby-dev@db.apache.org Message-ID: <2135786042.48419.1323236920325.JavaMail.tomcat@hel.zones.apache.org> Subject: [jira] [Commented] (DERBY-4631) Wrong join column returned by right outer join with NATURAL or USING and territory-based collation MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/DERBY-4631?page=3Dcom.atlassian= .jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=3D1316= 4165#comment-13164165 ]=20 Mamta A. Satoor commented on DERBY-4631: ---------------------------------------- I debugged the code to figure out what (and where) are we doing in the code= which causes us to give wrong results for join column in case of territory= based database and right outer join with NATURAL or USING clause. As Knut = pointed out earlier in this jira, as per the SQL spec, "the join columns in= a natural join or in a named columns join should be added to the select li= st by coalescing the column from the left table with the column from the ri= ght table. " What I have found is that Derby decides to pick up join column= 's value from the left table when we are working with natural left outer jo= in and it picks up the join column's value from the right table when we are= working with natural right outer join. This is not a problem when we are d= ealing with non-territory based databases but the assumption to rely on jus= t one table's join column is incorrect when working with territory based da= tabases. Following is the test case I used for debugging which further expl= ains Derby's current implementation. connect 'jdbc:derby:db1;create=3Dtrue;territory=3Den_US;collation=3DTERRITO= RY_BASED:SECONDARY';=20 create table big(x varchar(5));=20 insert into big values 'A','B', null;=20 create table small(x varchar(5));=20 insert into small values 'b','c', null;=20 select * from small t1 natural left outer join big t2; select * from small t1 natural right outer join big t2; For both natural left outer join and natural right outer join, at execution= time, we create a merged row which has columns merged from the left and ri= ght tables. The column(in my example, there is only one column)s in the sel= ect sql maps to a column from the merged row. The mapping is determined at = the sql compile phase.=20 In the case of 'select * from small t1 natural left outer join big t2', the= re will be three merged rows with 2 columns each 'b' 'B' 'c' null null null And for natural left outer join, the generated code has column in the selec= t SQL map to the first column in the merged row. This will always work fine= even in a territory based database because as per the SQL standards, the c= olumn x should be equivalent to the return value of coalesce(t1.x, t2.x). S= ince we are working with left outer join, then if the first column in the m= erged row is null, then even the 2nd column(from the right table) will be n= ull and hence it is ok to always pick up the value from the 1st column in t= he merged row. This mapping will always lead column x to have the same valu= e as coalesce(t1.x, t2.x). But for a territory based database, we can't count on a logic like that for= natural right outer join. The way Derby works right now, the column x in t= he select always gets mapped to the 2nd column in the merged row. In the ca= se of 'select * from small t1 natural right outer join big t2', there will = be three merged rows with 2 columns each null 'A' 'b' 'B' null null And for natural right outer join, the generated code has column in the sele= ct SQL map to the second column in the merged row. This will work fine in a= non-territory database, because if column 1 in the merged row has a non-nu= ll value, then it will always be the same value as the column 2 in the merg= ed row. But in our example, with territor based database(with SECONDARY str= ength, meaning it is case insensitive comparison), values 'B' and 'b' are c= onsidered equal. Hence the coalesce(t1.x, t2,x) will not be same as value i= n the 2nd column of the merged row. For natural right outer join with the d= ata given in the example above,=20 coalesce(t1.x, t2,x) will return 'A', 'b' and null. But with the mapping of= column x in the SELECT to the 2nd column in the merged row will return 'A'= , 'B' and null thus returning data which does not comply with SQL standard = which says that column x's value should be the return value of coalesce(t1.= x, t2.x). So it seems like may be we need some of kind projection in case o= f natural right outer join (rather than simple column mapping to the 2nd co= lumn which is what happens right now) so that we look at both the columns i= n the merged row to determine the value of column x. Hope this explanation helps understand what Derby is doing internally and b= ased on that, we can come up with some proposal to fix the issue. =20 > Wrong join column returned by right outer join with NATURAL or USING and = territory-based collation > -------------------------------------------------------------------------= ------------------------- > > Key: DERBY-4631 > URL: https://issues.apache.org/jira/browse/DERBY-4631 > Project: Derby > Issue Type: Bug > Components: SQL > Affects Versions: 10.6.1.0 > Reporter: Knut Anders Hatlen > Labels: derby_triage10_8 > > SQL:2003 says that the join columns in a natural join or in a named > columns join should be added to the select list by coalescing the > column from the left table with the column from the right table. > Section 7.7, , syntax rules: > > 1) Let TR1 be the first , and let TR2 be the
> reference> or
that is the second operand of the > > . Let RT1 and RT2 be the row types of TR1 and TR2, > > respectively. Let TA and TB be the range variables of TR1 and TR2, > > respectively. (...) > and > > 7) If NATURAL is specified or if a immediately > > containing a is specified, then: > (...) > > d) If there is at least one corresponding join column, then let SLCC > > be a