db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta Satoor" <msat...@gmail.com>
Subject Collation info of internally generated CAST node
Date Thu, 07 Jun 2007 05:41:15 GMT

As per the wiki page
Section Collation Determination, Rule 4), result of CAST will take the
collation of the current compilation schema. This is what Derby
10.3codeline has implemented for CAST in the
CastNode.bindCastNodeOnly() method.

But I am not sure if that is the right thing to do for CAST nodes that get
generated internally. I know of one specific example where I don't think the
rule of collation seeting for CAST is working very well. Consider following

create table t (id int, type char(10), typeVarchar varchar(10));
insert into t values (1,'CAR','CAR'),(2,'SUV','SUV');
set schema sys;
SELECT  type FROM app.t WHERE CASE WHEN 1=1 THEN type ELSE typevarchar END =
type; -- the sql in question

Note that the DTD associated with THEN clause expression is of type CHAR and
the DTD associated with ELSE clause expression is of type VARCHAR. And in
Derby, VARCHAR has higher type precedence than CHAR.

 Now, during the compilation of the SELECT statement above, the
ConditionalNode.bindExpression makes following call which causes
ConditionalNode to have a DTD which has same properties as the DTD of ELSE
clause expression which is of type VARCHAR(since VARCHAR has higher type
precedence than CHAR) with collation type of territory based and collation
derivation of IMPLICIT. So far, so good.

Later, the ConditionalNode.bindExpression has following if statement which
will return true for our specific SELECT statement
  if (thenTypeId.typePrecedence() != condTypeId.typePrecedence())
This is because the datatype(CHAR) of "type" in THEN clause does not have
same type precedence as datatype(VARCHAR) of ConditionalNode and so the code
inside the if statement in ConditionalNode.bindExpression generates a CAST
node on the top of the THEN clause expression and that CAST node uses the *SAME
physical DTD* of the ConditionalNode, which in this case is a VARCHAR
datatype with collation type of territory based and collation derivation of
IMPLICIT. Next, ConditionalNode.bindExpression calls bind on the newly
created cast node using following
   cast = cast.bindExpression(fromList,
During the bind of the CAST, we always have the CAST node take the collation
of the current compilation schema, which in this case is SYS and hence we
end up assigining collation type of UCS_BASIC to DTD associated with the
CAST node.. But since the CAST is associated with the same physical DTD that
is used by the ConditionalNode, the ConditionalNode ends up having it's
collation type changed from territory based to UCS_BASIC and this causes the
above SELECT statement to fail at compilation time because of mismatch of
collation type between CASE... = type. The left hand side of CASE... = type
ends up having collation of UCS_BASIC whereas right hand side "type" has
collation type of territory based and hence the SELECT compilation fails. I
think this behavior is not correct. The CASE node should have hold on to
it's collation type of territory based. One solution may be to not have CAST
node and ConditionalNode share the same physical DTD. This solution might be
easier to implement. Other solution is to have the *internally* generated
CAST nodes not take the collation type of the compilation schema but I am
not sure if that solution would work for all internally generated CAST nodes
and if there is a way to differentiate between internally generated CAST
nodes and those specified by the end user. I have not spent time going
through all the internally generated CAST nodes to see if the 2nd solution
will make sense or not.

I will appreciate others feedback on this.


View raw message