db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: Collation feature discussion
Date Fri, 23 Mar 2007 13:49:04 GMT
Hi Mamta,

This is my understanding of what these words mean, based on a quick 
googling of industry practices. For instance, see


Explicit - This means that a COLLATE clause in the statement forces the 
server to use a particular collation.

Implicit - This means that your statement mentions a column without 
using a COLLATE clause. The column itself has a collation which was 
determined when the table was created.

None - This case arises when you use SQL operators to combine two 
columns which have different collations. For example "select 
frenchColumn || englishColumn from ...". In this case the server cannot 
figure out which collation to use.

There is also a concept of a default collation for a datatype. As I read 
the SQL standard, part 2, section 4.2.2, I see the following:

1) A string datatype has a default character set associated with it.

2) That character set, in turn, has a distinguished collation associated 
with it.

3) That collation is the default collation of the string datatype. That 
is, if you create a column of that datatype and you don't include a 
COLLATE clause, then the column has that collation. Similarly, if you 
declare a function that returns a string datatype and you don't include 
a COLLATE clause, then the function returns a string having that default 

Hope this helps...


Mamta Satoor wrote:
> I am looking at the SQL spec to see how it deals with the problem of 
> different collation types, which they call as explicit, implicit and 
> none. Hopefully, that will make it easier to come up with a logic for 
> deducting correct collation type for non-trivial cases like COLLATE, 
> TRIM, string literal, etc.
> Mamta
> On 3/22/07, *Daniel John Debrunner* <djd@apache.org 
> <mailto:djd@apache.org>> wrote:
>     Mamta Satoor wrote:
>     > Before talking about functions, I think it will be better to
>     first talk
>     > about string literals and their collation determination.
>     >
>     > SQL spec section 5.3 <literal>, Syntax Rule 15) says "The
>     declared type
>     > collation of a <character string literal> is the character set
>     > collation, and the collation derivation is implicit."
>     >
>     > Based on this, when a string literal (collation type UNKNOWN) is
>     getting
>     > used in a collation method with another operand as UCS_BASIC
>     collation,
>     > then the collation type of string literal will be UCS_BASIC.
>     Similar
>     > rule for operand with TERRITORY_BASED. In a case where,
>     collation types
>     > of all the operands is UNKNOWN, at collation time, it can be
>     assumed to
>     > be whatever is defined for user defined character columns. This
>     will be
>     > similar to the example given by Rick for implicit collation type
>     when
>     > talking about CAST ie
>     > CREATE TABLE t1 (c11 char(1) default 'a') In this example, the
>     collation
>     > type of DTD associated with 'a' will be implicitly whatever is
>     defined
>     > at the database level for COLLATION.
>     >
>     > Hope this answers the question about string literals.
>     Kind of, I looked up the definition of "collation derivation is
>     implicit" in section 4.2.2 of the standard and at first reading it
>     wasn't obvious to me what it meant.
>     I know I suggested the 'collation type UNKNOWN' but I hadn't
>     looked into
>     the SQL standard in detail, and now I'm wondering if the UNKNOWN
>     concept is a good idea. Since the SQL standard already defines a model
>     for how collations are defined it might be wise to follow the required
>     model and naming. Not sure what that would mean exactly, but it seems
>     like each character expression can have a derivation of explicit,
>     implicit or none. These may be better ways to carry state rather than
>     unknown. Unless of course there's a clear mapping between unknown and
>     the sql standard definition.
>     Dan.

View raw message