db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: Collation and string literals limitation in SQL standard?
Date Tue, 05 Jun 2007 16:29:33 GMT


Daniel John Debrunner wrote:
> Following the SQL standard the collation derivation of a string literal 
> and a column reference are both "implicit". This is what the current 
> collation feature is implementing.
> 
> This leads to some interesting cases once Derby supports column specific 
> collations.
> 
> Take:
> 
> create table T(NAME CHAR(10) COLLATE X);
> 
> Now the boolean expression in a query
> 
>    NAME = 'Fred'
> 
> will fail if the collation for 'Fred' is not X. That could exist with 
> the current implementation (literals default to the collation of the 
> current schema) or the original proposal (literals default to the 
> database's collation).
> 
> It gets a little stranger if a comparison expression is used within the 
> create table:
> 
>  create table T(TYPE CHAR(10) COLLATE X CHECK TYPE IN ('CAR', 'PLANE', 
> 'BOAT');
> 
> Now the create table will fail.
> 
> Both of these could be made to work with a COLLATE clause, but it 
> becomes especially strange in the CHECK clause, as the COLLATE clause 
> must now appear multiple times (I think).
> 
>  create table T(TYPE CHAR(10) COLLATE X CHECK TYPE IN ('CAR' COLLATE X, 
> 'PLANE' COLLATE X, 'BOAT' COLLATE X);
> 
> In both of these cases it seems (to me) that what the language really 
> intends is that collation to be used is that of the column reference.
> 
> Note that even without multiple user collations, Derby will have this 
> problem for queries that system columns and user schemas, e.g. 
> sys.systables.TABLENAME = 'MYTABLE' will fail if compiled in a user schema.
> 
> Interestingly both MySQL and Microsoft have solved this problem by 
> adding additional levels to the derivation concept. SQL Standard 
> supports three levels: NONE, IMPLICIT and EXPLICIT.
> 
> Microsoft has added one: Coercible-default between implicit and none.
> 
> http://msdn2.microsoft.com/en-us/library/ms179886.aspx
> 
> and string literals map to coercible-default, thus in the above examples 
> the column's collation will be used.
> 
> MySQL instead has a concept of "coercibility" with six levels, 0-5. This 
> is claimed to be SQL standard but seems to be from an earlier version of 
> the standard. I found another reference to it in some SQL book.
> 
> http://doc.51windows.net/mysql/?url=/mysql/ch10s05.html
> (section 10.5.4)
> 
> Again string literals are at a different level to column references 
> leading to column collation being used.
I don't know any standard reference.  Both the mysql and oracle rules 
seem more user friendly with regard at least to comparison between 
constants and implicit collation references (columns being the case
I usually think of affecting the most usage).
> 
> 
> Does anyone know of any correction in the SQL standard regarding this 
> issue? I haven't been able to find one.
> 
> I'm just thinking about the future and trying to ensure that anyone who 
> implements multiple user collations for Derby won't have to break 
> backwards compatibility to ensure rational behaviour.
> 
> Thanks,
> Dan.
> 
> 

Mime
View raw message