db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta A. Satoor (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-4631) Wrong join column returned by right outer join with NATURAL or USING and territory-based collation
Date Wed, 18 Jan 2012 22:10:43 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188739#comment-13188739

Mamta A. Satoor commented on DERBY-4631:

With solution 1), we will need to know where the current code generation happens for join
columns. Solution 1) requires that we change the code generation to following logic.
1)if the left table's join column value is null then pick up the right table's join column's
2)If the left table's join column value is non-null, then pick up that value 

Following is what I found while looking for code generation logic for the join columns.

In the bind phase of a query, we start looking at result columns and assigning virutal column
numbers to them (impl.sql.compile.ResultColumn:virtualColumnId). These virtual column ids
are used to find mapping for those columns into runtime resultset for the query.

connect 'jdbc:derby:db1;create=true;territory=en_US;collation=TERRITORY_BASED:SECONDARY';

CREATE TABLE derby4631_t1(x varchar(5)); 
INSERT INTO derby4631_t1 VALUES 'A','B'; 
CREATE TABLE derby4631_t2(x varchar(5)); 
INSERT INTO derby4631_t2 VALUES 'b','c'; 
SELECT x FROM derby4631_t2 NATURAL RIGHT OUTER JOIN derby4631_t1; 

For the NATURAL JOIN query above, an equi-join is generated which will be derby4631_t2.x =
derby4631_t1.x. For this equi join, the join column x from left table(derby4631_t2) will get
virtual column id of 1 and join column x from right table(derby4631_t1) will get virtual column
id of 2. The column x in the SELECT query in Derby today gets associated with right table
in case of RIGHT OUTER JOIN and hence that column's virtual column id is also 2. At the time
of code generation, we use this information to decide what column position from the run time
resultset would be used to get the values. Following is the stack trace of where the code
generation for join column happens.

Thread [main] (Suspended (breakpoint at line 1421 in ProjectRestrictNode))	
	ProjectRestrictNode.generateMinion(ExpressionClassBuilder, MethodBuilder, boolean) line:
	ProjectRestrictNode.generate(ActivationClassBuilder, MethodBuilder) line: 1334	
	ScrollInsensitiveResultSetNode.generate(ActivationClassBuilder, MethodBuilder) line: 109

	CursorNode.generate(ActivationClassBuilder, MethodBuilder) line: 641	
	CursorNode(StatementNode).generate(ByteArray) line: 345	
	GenericStatement.prepMinion(LanguageConnectionContext, boolean, Object[], SchemaDescriptor,
boolean) line: 518	
	GenericStatement.prepare(LanguageConnectionContext, boolean) line: 97	
	GenericLanguageConnectionContext.prepareInternalStatement(SchemaDescriptor, String, boolean,
boolean) line: 1103	
	EmbedStatement40(EmbedStatement).execute(String, boolean, boolean, int, int[], String[])
line: 610	
	EmbedStatement40(EmbedStatement).execute(String) line: 559	
	ij.executeImmediate(String) line: 367	
	utilMain.doCatch(String) line: 527	
	utilMain.runScriptGuts() line: 369	
	utilMain.go(LocalizedInput[], LocalizedOutput) line: 245	
	Main.go(LocalizedInput, LocalizedOutput) line: 229	
	Main.mainCore(String[], Main) line: 184	
	Main.main(String[]) line: 75	
	ij.main(String[]) line: 59	
> Wrong join column returned by right outer join with NATURAL or USING and territory-based
> --------------------------------------------------------------------------------------------------
>                 Key: DERBY-4631
>                 URL: https://issues.apache.org/jira/browse/DERBY-4631
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions:
>            Reporter: Knut Anders Hatlen
>            Assignee: Mamta A. Satoor
>              Labels: derby_triage10_8
> SQL:2003 says that the join columns in a natural join or in a named
> columns join should be added to the select list by coalescing the
> column from the left table with the column from the right table.
> Section 7.7, <joined table>, syntax rules:
> > 1) Let TR1 be the first <table reference>, and let TR2 be the <table
> > reference> or <table factor> that is the second operand of the
> > <joined table>. Let RT1 and RT2 be the row types of TR1 and TR2,
> > respectively. Let TA and TB be the range variables of TR1 and TR2,
> > respectively. (...)
> and
> > 7) If NATURAL is specified or if a <join specification> immediately
> > containing a <named columns join> is specified, then:
> (...)
> > d) If there is at least one corresponding join column, then let SLCC
> > be a <select list> of <derived column>s of the form
> >
> >
> > for every column C that is a corresponding join column, taken in
> > order of their ordinal positions in RT1.
> For a right outer join, Derby doesn't use COALESCE(TA.C, TB.C), but
> rather just TB.C (the column in the right table) directly.
> This is in most cases OK, because COALESCE(TA.C, TB.C) = TB.C is an
> invariant in a right outer join. (Because TA.C is either NULL or equal
> to TB.C.)
> However, in a database with territory-based collation, equality
> between two values does not mean they are identical, especially now
> that the strength of the collator can be specified (DERBY-1748).
> Take for instance this join:
> ij> connect 'jdbc:derby:testdb;create=true;territory=en_US;collation=TERRITORY_BASED:SECONDARY';
> ij> create table big(x varchar(5));
> 0 rows inserted/updated/deleted
> ij> insert into big values 'A','B','C';
> 3 rows inserted/updated/deleted
> ij> create table small(x varchar(5));
> 0 rows inserted/updated/deleted
> ij> insert into small values 'b','c','d';
> 3 rows inserted/updated/deleted
> ij> select x, t1.x, t2.x, coalesce(t1.x, t2.x) from small t1 natural right outer join
big t2;
> X    |X    |X    |4    
> -----------------------
> A    |NULL |A    |A    
> B    |b    |B    |b    
> C    |c    |C    |c    
> 3 rows selected
> I believe that the expected result from the above query is that the
> first column should have the same values as the last column. That is,
> the first column should contain {'A', 'b', 'c'}, not {'A', 'B', 'C'}.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


View raw message