db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta Satoor" <msat...@gmail.com>
Subject Re: Collation feature discussion
Date Thu, 22 Mar 2007 18:49:56 GMT
Before talking about functions, I think it will be better to first talk
about string literals and their collation determination.

SQL spec section 5.3 <literal>, Syntax Rule 15) says "The declared type
collation of a <character string literal> is the character set collation,
and the collation derivation is implicit."

Based on this, when a string literal (collation type UNKNOWN) is getting
used in a collation method with another operand as UCS_BASIC collation, then
the collation type of string literal will be UCS_BASIC. Similar rule for
operand with TERRITORY_BASED. In a case where, collation types of all the
operands is UNKNOWN, at collation time, it can be assumed to be whatever is
defined for user defined character columns. This will be similar to the
example given by Rick for implicit collation type when talking about CAST ie

CREATE TABLE t1 (c11 char(1) default 'a') In this example, the collation
type of DTD associated with 'a' will be implicitly whatever is defined at
the database level for COLLATION.

Hope this answers the question about string literals.
Mamta


On 3/22/07, Mamta Satoor <msatoor@gmail.com> wrote:
>
> Dan asked following on http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
>
> *****************************
>
> 7)As for the character expressions involving CAST, TRIM, UPPER, LOWER,
> SUBSTRING, CONCATENATE, the result character datatype will have the same
> collation type as their operands.
>
>    -
>
>    *Questions* What about other character expressions, such as
>    functions? What happens when CONCATENATE is passed two values with different
>    collations?
>
> *****************************
> For the CONCATENATE question, as per SQL spec Section 9.3, Syntax Rules
> 2), all of the datatypes shall be comparable. I take that to mean that two
> values with different collations are not comparable and hence such a
> CONCATENATE will fail. I will update the wiki page with this information.
>
> As for the function with character expression, let me write another mail.
>
> Mamta
>
>
>  On 3/19/07, Mamta Satoor <msatoor@gmail.com> wrote:
> >
> > Dan, I looked at the SQL 2002 foundation specification.
> >
> > TRIM, UPPER, LOWER, SUBSTRING functions are covered in Section 6.29<string value
function> and it says that these functions will get the
> > collation of their operand. For instance, Syntax Rule (4b) says this for
> > SUBSTRING function "The character set and collation of the <character
> > substring function> are those of DTCVE." DTCVE is the declared type of the
> > <character value expression>. Same thing is implied for UPPER and LOWER
> > functions (Syntax Rule 8), TRIM function (Syntax Rule 11).
> >
> > I am not too clear on what happens when CAST is used. SQL spec discusses
> > this in section 6.12 <cast specification> and says in Syntax Rule 10)
> > that "The declared type collation of the <cast specification> is the
> > character set collation of the character set of TD and its collation
> > derivation is implicit." So for the eg case CAST(charC1 as
> > VARCHAR(30)), what will be the collation of VARHCHAR(30) value? Is Derby's
> > character set's collation is UCS_BASIC? If so, then will the CAST value (if
> > casted to one of the character datatypes) always have collation of UCS_BASIC
> > no matter if charC1 has UCS_BASIC / TERRITORY_BASED collation?
> >
> > As for concatenation, Section 9.3 Data types of results of aggregations
> > has Sytax Rules 2)All of the data types shall be comparable. I take that to
> > mean that userChar1WithTerritoryBasedCollation can't be concatenated with
> > systemChar1 because systemChar1 has UCS_BASIC collation. But if the
> > datatypes are comparable, then the result of concatenation will have the
> > collation type of the operands.
> >
> > I hope I covered it all. Looking forward to feedback,
> > Mamta
> >
> > On 3/19/07, Daniel John Debrunner <djd@apache.org > wrote:
> > >
> > > Mamta Satoor wrote:
> > > > Hi Dan,
> > > >
> > > > You asked about how collation will be set for character expressions
> > > like
> > > > string literal, cast to character type of a character expression,
> > > trim,
> > > > concationation etc.
> > > >
> > > > DTD will have an attribute called collation type and in 10.3, the
> > > > possible values for it will be -1 meaning UNKNOWN collation, 0
> > > meaning
> > > > UCS_BASIC and 1 meaning TERRITORY_BASED. By default, DTD's will have
> > > the
> > > > collation type set to UNKNOWN. If the DTD is for a user table's CHAR
> > > > column, then DTD's collation will be set to
> > > TERRIOTRY_BASED/UCS_BASIC
> > > > depending on what was requested at database create time in the jdbc
> > > url.
> > > > This setting of collation will be done by DTD.setCollationType(int).
> > > If
> > > > the DTD is for a SYS schema table's CHAR column, then  DTD's
> > > collation
> > > > will be set to UCS_BASIC.
> > > >
> > > > I think there is a DTD associated with all the character expressions
> > > > like string literal, cast to character type of a character
> > > expression,
> > > > trim, concationation etc. And since the default collation type is
> > > > UNKNOWN, these character expressions will have their collation type
> > > as
> > > > UNKNOWN until they actually get used in a collation method. When
> > > they
> > > > get used in a collation method, their collation type will be
> > > determined
> > > > by the context in which they are. ie if the other operand of the
> > > > collation method has UCS_BASIC associated with them, then the
> > > character
> > > > expression's collation type in DTD will get set to UCS_BASIC and
> > > similar
> > > > logic if the other operand had TERRITORY_BASED collation type
> > > associated
> > > > with it.
> > > >
> > > > I hope this answers your question. I will include this information
> > > on
> > > > the wiki page for DERBY-1478 so that everything is tracked in one
> > > > central location.
> > >
> > > I'm not sure it's a simple as that. Consider this expression:
> > >
> > >   TRIM(x) < 'eee'
> > >
> > > If x is a user column then I would expect the collation to be
> > > performed
> > > using the collation for user columns, but if x was a system column I
> > > would expect it to be performed using UCS_BASIC. I think it would be
> > > much like nullability, the nullability of some operation of x is
> > > dependent on the nullability of x, hence the collation of some
> > > operation
> > > of x would be dependent on the collation of x.
> > > Though of course, maybe the SQL standard defines it differently, it
> > > would be good to know if it's defined by the standard or left as
> > > implementation defined.
> > >
> > > Dan.
> > >
> > >
> >
>

Mime
View raw message