db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: Collation feature discussion
Date Mon, 19 Mar 2007 23:42:41 GMT
Mamta Satoor wrote:
> Thanks for your response, Rick. I did notice that but as part of 
> DERBY-1478, I wasn't planning on adding support for the SQL clause 
> "COLLATE". So, I am thinking, in the absence of the SQL clause 
> COLLATE, how will the CAST work.
>  
> thanks,
> Mamta
Hi Mamta,

It's a little muddy. Here's another angle on the problem. What would be 
the collation of the varchar column in the following declaration:

create table foo ( characterCol varchar( 20 ) )

I would guess that if that varchar column has an implicit collation, 
then that is the same collation that would be bound to the target 
datatype of your CAST statement below.

Does that sound reasonable?

Regards,
-Rick
>
>
>  
> On 3/19/07, *Rick Hillegas* <Richard.Hillegas@sun.com 
> <mailto:Richard.Hillegas@sun.com>> wrote:
>
>     Mamta Satoor wrote:
>     > Dan, I looked at the SQL 2002 foundation specification.
>     >
>     > TRIM, UPPER, LOWER, SUBSTRING functions are covered in Section 6.29
>     > <string value function> and it says that these functions will
>     get the
>     > collation of their operand. For instance, Syntax Rule (4b) says this
>     > for SUBSTRING function "The character set and collation of the
>     > <character substring function> are those of DTCVE." DTCVE is the
>     > declared type of the <character value expression>. Same thing is
>     > implied for UPPER and LOWER functions (Syntax Rule 8), TRIM
>     function
>     > (Syntax Rule 11).
>     >
>     > I am not too clear on what happens when CAST is used. SQL spec
>     > discusses this in section 6.12 <cast specification> and says in
>     Syntax
>     > Rule 10) that "The declared type collation of the <cast
>     specification>
>     > is the character set collation of the character set of TD and its
>     > collation derivation is implicit." So for the eg case CAST(charC1 as
>     > VARCHAR(30)), what will be the collation of VARHCHAR(30) value? Is
>     > Derby's character set's collation is UCS_BASIC? If so, then will the
>     > CAST value (if casted to one of the character datatypes) always have
>     > collation of UCS_BASIC no matter if charC1 has UCS_BASIC /
>     > TERRITORY_BASED collation?
>     Hi Mamta,
>
>     You've probably seen this in section 6.1: as part of the CAST, you can
>     specify the collation you want. E.g.:
>
>     CAST englishColumn as VARCHAR(300) COLLATE germanCollation
>
>     Regards,
>     -Rick
>     >
>     > As for concatenation, Section 9.3 Data types of results of
>     > aggregations has Sytax Rules 2)All of the data types shall be
>     > comparable. I take that to mean that
>     > userChar1WithTerritoryBasedCollation can't be concatenated with
>     > systemChar1 because systemChar1 has UCS_BASIC collation. But if the
>     > datatypes are comparable, then the result of concatenation will have
>     > the collation type of the operands.
>     >
>     > I hope I covered it all. Looking forward to feedback,
>     > Mamta
>     >
>     > On 3/19/07, *Daniel John Debrunner* <djd@apache.org
>     <mailto:djd@apache.org>
>     > <mailto:djd@apache.org <mailto:djd@apache.org>>> wrote:
>     >
>     >     Mamta Satoor wrote:
>     >     > Hi Dan,
>     >     >
>     >     > You asked about how collation will be set for character
>     >     expressions like
>     >     > string literal, cast to character type of a character
>     >     expression, trim,
>     >     > concationation etc.
>     >     >
>     >     > DTD will have an attribute called collation type and in
>     10.3, the
>     >     > possible values for it will be -1 meaning UNKNOWN collation, 0
>     >     meaning
>     >     > UCS_BASIC and 1 meaning TERRITORY_BASED. By default, DTD's
>     will
>     >     have the
>     >     > collation type set to UNKNOWN. If the DTD is for a user
>     table's CHAR
>     >     > column, then DTD's collation will be set to
>     >     TERRIOTRY_BASED/UCS_BASIC
>     >     > depending on what was requested at database create time in the
>     >     jdbc url.
>     >     > This setting of collation will be done by
>     >     DTD.setCollationType (int). If
>     >     > the DTD is for a SYS schema table's CHAR column, then  DTD's
>     >     collation
>     >     > will be set to UCS_BASIC.
>     >     >
>     >     > I think there is a DTD associated with all the character
>     expressions
>     >     > like string literal, cast to character type of a character
>     >     expression,
>     >     > trim, concationation etc. And since the default collation
>     type is
>     >     > UNKNOWN, these character expressions will have their
>     collation
>     >     type as
>     >     > UNKNOWN until they actually get used in a collation
>     method. When
>     >     they
>     >     > get used in a collation method, their collation type will be
>     >     determined
>     >     > by the context in which they are. ie if the other operand
>     of the
>     >     > collation method has UCS_BASIC associated with them, then the
>     >     character
>     >     > expression's collation type in DTD will get set to
>     UCS_BASIC and
>     >     similar
>     >     > logic if the other operand had TERRITORY_BASED collation type
>     >     associated
>     >     > with it.
>     >     >
>     >     > I hope this answers your question. I will include this
>     >     information on
>     >     > the wiki page for DERBY-1478 so that everything is tracked
>     in one
>     >     > central location.
>     >
>     >     I'm not sure it's a simple as that. Consider this expression:
>     >
>     >       TRIM(x) < 'eee'
>     >
>     >     If x is a user column then I would expect the collation to be
>     >     performed
>     >     using the collation for user columns, but if x was a system
>     column I
>     >     would expect it to be performed using UCS_BASIC. I think it
>     would be
>     >     much like nullability, the nullability of some operation of x is
>     >     dependent on the nullability of x, hence the collation of some
>     >     operation
>     >     of x would be dependent on the collation of x.
>     >     Though of course, maybe the SQL standard defines it
>     differently, it
>     >     would be good to know if it's defined by the standard or
>     left as
>     >     implementation defined.
>     >
>     >     Dan.
>     >
>     >
>
>


Mime
View raw message