db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta Satoor" <msat...@gmail.com>
Subject Re: Collation feature discussion
Date Mon, 19 Mar 2007 21:32:35 GMT
Dan, I looked at the SQL 2002 foundation specification.

TRIM, UPPER, LOWER, SUBSTRING functions are covered in Section 6.29 <string
value function> and it says that these functions will get the collation of
their operand. For instance, Syntax Rule (4b) says this for SUBSTRING
function "The character set and collation of the <character substring
function> are those of DTCVE." DTCVE is the declared type of the <character
value expression>. Same thing is implied for UPPER and LOWER functions
(Syntax Rule 8), TRIM function (Syntax Rule 11).

I am not too clear on what happens when CAST is used. SQL spec discusses
this in section 6.12 <cast specification> and says in Syntax Rule 10) that
"The declared type collation of the <cast specification> is the character
set collation of the character set of TD and its collation derivation is
implicit." So for the eg case CAST(charC1 as VARCHAR(30)), what will be the
collation of VARHCHAR(30) value? Is Derby's character set's collation is
UCS_BASIC? If so, then will the CAST value (if casted to one of the
character datatypes) always have collation of UCS_BASIC no matter if charC1

As for concatenation, Section 9.3 Data types of results of aggregations has
Sytax Rules 2)All of the data types shall be comparable. I take that to mean
that userChar1WithTerritoryBasedCollation can't be concatenated with
systemChar1 because systemChar1 has UCS_BASIC collation. But if the
datatypes are comparable, then the result of concatenation will have the
collation type of the operands.

I hope I covered it all. Looking forward to feedback,

On 3/19/07, Daniel John Debrunner <djd@apache.org> wrote:
> Mamta Satoor wrote:
> > Hi Dan,
> >
> > You asked about how collation will be set for character expressions like
> > string literal, cast to character type of a character expression, trim,
> > concationation etc.
> >
> > DTD will have an attribute called collation type and in 10.3, the
> > possible values for it will be -1 meaning UNKNOWN collation, 0 meaning
> > UCS_BASIC and 1 meaning TERRITORY_BASED. By default, DTD's will have the
> > collation type set to UNKNOWN. If the DTD is for a user table's CHAR
> > column, then DTD's collation will be set to TERRIOTRY_BASED/UCS_BASIC
> > depending on what was requested at database create time in the jdbc url.
> > This setting of collation will be done by DTD.setCollationType(int). If
> > the DTD is for a SYS schema table's CHAR column, then  DTD's collation
> > will be set to UCS_BASIC.
> >
> > I think there is a DTD associated with all the character expressions
> > like string literal, cast to character type of a character expression,
> > trim, concationation etc. And since the default collation type is
> > UNKNOWN, these character expressions will have their collation type as
> > UNKNOWN until they actually get used in a collation method. When they
> > get used in a collation method, their collation type will be determined
> > by the context in which they are. ie if the other operand of the
> > collation method has UCS_BASIC associated with them, then the character
> > expression's collation type in DTD will get set to UCS_BASIC and similar
> > logic if the other operand had TERRITORY_BASED collation type associated
> > with it.
> >
> > I hope this answers your question. I will include this information on
> > the wiki page for DERBY-1478 so that everything is tracked in one
> > central location.
> I'm not sure it's a simple as that. Consider this expression:
>   TRIM(x) < 'eee'
> If x is a user column then I would expect the collation to be performed
> using the collation for user columns, but if x was a system column I
> would expect it to be performed using UCS_BASIC. I think it would be
> much like nullability, the nullability of some operation of x is
> dependent on the nullability of x, hence the collation of some operation
> of x would be dependent on the collation of x.
> Though of course, maybe the SQL standard defines it differently, it
> would be good to know if it's defined by the standard or left as
> implementation defined.
> Dan.

View raw message