db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bernt M. Johnsen" <Bernt.John...@Sun.COM>
Subject Re: Collation and string literals limitation in SQL standard?
Date Wed, 06 Jun 2007 09:04:42 GMT
>>>>>>>>>>>> Daniel John Debrunner wrote (2007-06-04 15:33:21):
> Following the SQL standard the collation derivation of a string literal 
> and a column reference are both "implicit". This is what the current 
> collation feature is implementing.
> 
> This leads to some interesting cases once Derby supports column specific 
> collations.
> 
> Take:
> 
> create table T(NAME CHAR(10) COLLATE X);
> 
> Now the boolean expression in a query
> 
>    NAME = 'Fred'
> 
> will fail if the collation for 'Fred' is not X. That could exist with 
> the current implementation (literals default to the collation of the 
> current schema) or the original proposal (literals default to the 
> database's collation).

A literal's collation is (as stated) implicit. But I interpret the
standard like this:

The derived type of 'Fred' in the expression

SELECT NAME FROM T WHERE NAME = 'Fred'

is the type of NAME which is CHAR(10) and since the collation
derivation is implicit, 'Fred' has the collation X.

This follows from the folling sections in SQL 2003:

    3.1.6.7 declared type (of an expression denoting a value or anything
    that can be referenced to denote a value, such as, for example, a
    parameter, column, or variable): The unique data type that is common
    to every value that might result from evaluation of that expression.

    9.13 Collation determination [...]  e) Otherwise, every operand
    whose collation derivation is implicit shall have the same
    declared type collation IDTC and the collation to be used is IDTC.

That is, when the collation is implicit, it should be that of the
derived type. 

> It gets a little stranger if a comparison expression is used within the 
> create table:
> 
>  create table T(TYPE CHAR(10) COLLATE X CHECK TYPE IN ('CAR', 'PLANE', 
> 'BOAT');
> 
> Now the create table will fail.

Without having checked what happens with the derived type exactly for
this example, I think it is, due to 3.1.6.7 and 9.13, safe to assume
that the collation of 'CAR' etc here should be X.

> Both of these could be made to work with a COLLATE clause, but it 
> becomes especially strange in the CHECK clause, as the COLLATE clause 
> must now appear multiple times (I think).
> 
>  create table T(TYPE CHAR(10) COLLATE X CHECK TYPE IN ('CAR' COLLATE X, 
> 'PLANE' COLLATE X, 'BOAT' COLLATE X);
> 
> In both of these cases it seems (to me) that what the language really 
> intends is that collation to be used is that of the column reference.
> 
> Note that even without multiple user collations, Derby will have this 
> problem for queries that system columns and user schemas, e.g. 
> sys.systables.TABLENAME = 'MYTABLE' will fail if compiled in a user schema.
> 
> Interestingly both MySQL and Microsoft have solved this problem by 
> adding additional levels to the derivation concept. SQL Standard 
> supports three levels: NONE, IMPLICIT and EXPLICIT.
> 
> Microsoft has added one: Coercible-default between implicit and none.
> 
> http://msdn2.microsoft.com/en-us/library/ms179886.aspx
> 
> and string literals map to coercible-default, thus in the above examples 
> the column's collation will be used.
> 
> MySQL instead has a concept of "coercibility" with six levels, 0-5. This 
> is claimed to be SQL standard but seems to be from an earlier version of 
> the standard. I found another reference to it in some SQL book.
> 
> http://doc.51windows.net/mysql/?url=/mysql/ch10s05.html
> (section 10.5.4)
> 
> Again string literals are at a different level to column references 
> leading to column collation being used.
> 
> 
> Does anyone know of any correction in the SQL standard regarding this 
> issue? I haven't been able to find one.
> 
> I'm just thinking about the future and trying to ensure that anyone who 
> implements multiple user collations for Derby won't have to break 
> backwards compatibility to ensure rational behaviour.
> 
> Thanks,
> Dan.
> 

-- 
Bernt Marius Johnsen, Database Technology Group, 
Staff Engineer, Technical Lead Derby/Java DB
Sun Microsystems, Trondheim, Norway

Mime
View raw message