db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta Satoor" <msat...@gmail.com>
Subject Re: Collation feature discussion
Date Thu, 22 Mar 2007 16:49:32 GMT
Dan asked following on
http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
*****************************

7)As for the character expressions involving CAST, TRIM, UPPER, LOWER,
SUBSTRING, CONCATENATE, the result character datatype will have the same
collation type as their operands.

   -

   *Questions* What about other character expressions, such as functions?
   What happens when CONCATENATE is passed two values with different
   collations?

*****************************
For the CONCATENATE question, as per SQL spec Section 9.3, Syntax Rules 2),
all of the datatypes shall be comparable. I take that to mean that two
values with different collations are not comparable and hence such a
CONCATENATE will fail. I will update the wiki page with this information.

As for the function with character expression, let me write another mail.

Mamta


On 3/19/07, Mamta Satoor <msatoor@gmail.com> wrote:
>
> Dan, I looked at the SQL 2002 foundation specification.
>
> TRIM, UPPER, LOWER, SUBSTRING functions are covered in Section 6.29<string value function>
and it says that these functions will get the
> collation of their operand. For instance, Syntax Rule (4b) says this for
> SUBSTRING function "The character set and collation of the <character
> substring function> are those of DTCVE." DTCVE is the declared type of the
> <character value expression>. Same thing is implied for UPPER and LOWER
> functions (Syntax Rule 8), TRIM function (Syntax Rule 11).
>
> I am not too clear on what happens when CAST is used. SQL spec discusses
> this in section 6.12 <cast specification> and says in Syntax Rule 10) that
> "The declared type collation of the <cast specification> is the character
> set collation of the character set of TD and its collation derivation is
> implicit." So for the eg case CAST(charC1 as VARCHAR(30)), what will be the
> collation of VARHCHAR(30) value? Is Derby's character set's collation is
> UCS_BASIC? If so, then will the CAST value (if casted to one of the
> character datatypes) always have collation of UCS_BASIC no matter if charC1
> has UCS_BASIC / TERRITORY_BASED collation?
>
> As for concatenation, Section 9.3 Data types of results of aggregations
> has Sytax Rules 2)All of the data types shall be comparable. I take that to
> mean that userChar1WithTerritoryBasedCollation can't be concatenated with
> systemChar1 because systemChar1 has UCS_BASIC collation. But if the
> datatypes are comparable, then the result of concatenation will have the
> collation type of the operands.
>
> I hope I covered it all. Looking forward to feedback,
> Mamta
>
> On 3/19/07, Daniel John Debrunner <djd@apache.org> wrote:
> >
> > Mamta Satoor wrote:
> > > Hi Dan,
> > >
> > > You asked about how collation will be set for character expressions
> > like
> > > string literal, cast to character type of a character expression,
> > trim,
> > > concationation etc.
> > >
> > > DTD will have an attribute called collation type and in 10.3, the
> > > possible values for it will be -1 meaning UNKNOWN collation, 0 meaning
> >
> > > UCS_BASIC and 1 meaning TERRITORY_BASED. By default, DTD's will have
> > the
> > > collation type set to UNKNOWN. If the DTD is for a user table's CHAR
> > > column, then DTD's collation will be set to TERRIOTRY_BASED/UCS_BASIC
> > > depending on what was requested at database create time in the jdbc
> > url.
> > > This setting of collation will be done by DTD.setCollationType(int).
> > If
> > > the DTD is for a SYS schema table's CHAR column, then  DTD's collation
> >
> > > will be set to UCS_BASIC.
> > >
> > > I think there is a DTD associated with all the character expressions
> > > like string literal, cast to character type of a character expression,
> > > trim, concationation etc. And since the default collation type is
> > > UNKNOWN, these character expressions will have their collation type as
> > > UNKNOWN until they actually get used in a collation method. When they
> > > get used in a collation method, their collation type will be
> > determined
> > > by the context in which they are. ie if the other operand of the
> > > collation method has UCS_BASIC associated with them, then the
> > character
> > > expression's collation type in DTD will get set to UCS_BASIC and
> > similar
> > > logic if the other operand had TERRITORY_BASED collation type
> > associated
> > > with it.
> > >
> > > I hope this answers your question. I will include this information on
> > > the wiki page for DERBY-1478 so that everything is tracked in one
> > > central location.
> >
> > I'm not sure it's a simple as that. Consider this expression:
> >
> >   TRIM(x) < 'eee'
> >
> > If x is a user column then I would expect the collation to be performed
> > using the collation for user columns, but if x was a system column I
> > would expect it to be performed using UCS_BASIC. I think it would be
> > much like nullability, the nullability of some operation of x is
> > dependent on the nullability of x, hence the collation of some operation
> >
> > of x would be dependent on the collation of x.
> > Though of course, maybe the SQL standard defines it differently, it
> > would be good to know if it's defined by the standard or left as
> > implementation defined.
> >
> > Dan.
> >
> >
>

Mime
View raw message