db-derby-dev mailing list archives

From "Mamta Satoor" <msat...@gmail.com>
Subject Re: Collation feature discussion
Date Mon, 19 Mar 2007 19:19:42 GMT
Hi Dan,

You asked how collation will be set for character expressions such as a
string literal, a cast of a character expression to a character type, trim,
concatenation, etc.

DTD will have an attribute called collation type. In 10.3, its possible
values will be -1 meaning UNKNOWN collation, 0 meaning UCS_BASIC, and
1 meaning TERRITORY_BASED. By default, a DTD will have its collation type
set to UNKNOWN. If the DTD is for a user table's CHAR column, then the DTD's
collation will be set to TERRITORY_BASED or UCS_BASIC, depending on what was
requested at database create time in the JDBC URL. This setting of collation
will be done by DTD.setCollationType(int). If the DTD is for a SYS schema
table's CHAR column, then the DTD's collation will be set to UCS_BASIC.
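
To make the rules above concrete, here is a rough Java sketch. Only the
three numeric values and the setCollationType(int) method come from the
description above; the class names, the constant names, getCollationType()
and the charColumnDtd() helper are hypothetical and are not the actual
Derby code.

```java
// Sketch only: a stand-in for the collation attribute on a DTD.
final class CollationTypeSketch {
    // Values proposed for 10.3 (constant names are assumptions).
    static final int COLLATION_UNKNOWN = -1;
    static final int COLLATION_UCS_BASIC = 0;
    static final int COLLATION_TERRITORY_BASED = 1;

    /** Hypothetical stand-in for DataTypeDescriptor (DTD). */
    static final class Dtd {
        // A DTD starts out with UNKNOWN collation.
        private int collationType = COLLATION_UNKNOWN;

        int getCollationType() { // assumed accessor
            return collationType;
        }

        void setCollationType(int collationType) {
            this.collationType = collationType;
        }
    }

    /**
     * Hypothetical helper: builds the DTD for a table's CHAR column.
     * SYS schema columns always get UCS_BASIC; user table columns get
     * the collation requested when the database was created.
     */
    static Dtd charColumnDtd(boolean isSysSchema, int databaseCollation) {
        Dtd dtd = new Dtd();
        dtd.setCollationType(isSysSchema ? COLLATION_UCS_BASIC : databaseCollation);
        return dtd;
    }

    public static void main(String[] args) {
        Dtd userColumn = charColumnDtd(false, COLLATION_TERRITORY_BASED);
        Dtd sysColumn = charColumnDtd(true, COLLATION_TERRITORY_BASED);
        System.out.println("user column collation: " + userColumn.getCollationType());
        System.out.println("SYS column collation: " + sysColumn.getCollationType());
    }
}
```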

I think there is a DTD associated with every character expression, such as a
string literal, a cast of a character expression to a character type, trim,
concatenation, etc. Since the default collation type is UNKNOWN, these
character expressions will have their collation type as UNKNOWN until they
actually get used in a collation method. When they get used in a collation
method, their collation type will be determined by the context in which they
appear: if the other operand of the collation method has UCS_BASIC
associated with it, then the character expression's collation type in its DTD
will be set to UCS_BASIC, and likewise if the other operand has the
TERRITORY_BASED collation type associated with it.
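
A rough Java sketch of that context rule follows. All names here are
hypothetical, not actual Derby code, and the case where both operands are
UNKNOWN is not covered by the description above, so the sketch simply
leaves both operands unchanged.

```java
// Sketch only: an UNKNOWN operand picks up the other operand's collation.
final class CollationResolutionSketch {
    static final int UNKNOWN = -1;
    static final int UCS_BASIC = 0;
    static final int TERRITORY_BASED = 1;

    /** Hypothetical stand-in for a character expression's DTD. */
    static final class Dtd {
        int collationType = UNKNOWN; // default until used in a comparison

        Dtd() { }

        Dtd(int collationType) {
            this.collationType = collationType;
        }
    }

    /**
     * Hypothetical helper, called when two character operands meet in a
     * collation-sensitive operation. If exactly one side is UNKNOWN it
     * takes the other side's collation type; otherwise nothing changes.
     */
    static void resolveFromContext(Dtd left, Dtd right) {
        if (left.collationType == UNKNOWN && right.collationType != UNKNOWN) {
            left.collationType = right.collationType;
        } else if (right.collationType == UNKNOWN && left.collationType != UNKNOWN) {
            right.collationType = left.collationType;
        }
        // Both UNKNOWN: behaviour not specified above, so leave as-is.
    }

    public static void main(String[] args) {
        Dtd column = new Dtd(TERRITORY_BASED); // e.g. a user table CHAR column
        Dtd literal = new Dtd();               // e.g. a string literal
        resolveFromContext(column, literal);
        System.out.println("literal collation: " + literal.collationType);
    }
}
```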

I hope this answers your question. I will include this information on the
wiki page for DERBY-1478 so that everything is tracked in one central
location.

thanks,
Mamta


On 3/18/07, Daniel John Debrunner <djd@apache.org> wrote:
>
> Mike Matrigali wrote:
> > I'll let someone else summarize.  At this point I have
> > been convinced by Dan that his proposal is the best way
> > forward.  And by rick and dan that we should just go
> > ahead and store column level metadata for the collate
> > info in the store, as well as in the language level
> > per column metadata.
> >
> > The key points that convinced me are:
> > o Even though we are proposing a "single" collation per
> >   database, internally we need to support 2 per database to
> >   do the right thing for system catalogs.  Once there are
> >   2 we needed support in store to at the very least store
> >   metadata per conglomerate.
> >
> > o It looks like dan's proposal makes the runtime creation
> >   of the collated and non-collated objects easier.  I don't
> >   understand all the places this affects, but anything that
> >   makes this easier seems good to me.
>
> I think some design specification or notes would be really useful for
> collation. As Mike says, the places where this has an impact are not well
> known. Starting a list on a wiki page would be good; then others could
> look and ask if other areas are affected. E.g. I think the path we are
> heading down is that at create table or alter table add column time the
> collation for that column will be set in its DataTypeDescriptor, just
> like its nullability is today. Then at bind time when that column is
> referenced the collation type will be available through its DTD. But
> there are a host of other character expressions, it would be good to
> list these up front and how the collation will be set, rather than
> discovering them one at a time through coding (and missing some). E.g.
> What's the defined behaviour for:
>
>       string literal
>       cast to character type of a character expression
>       trim
>       concatenation
>       etc.
>
> Then some writeup of how store column collation information is to be
> stored (along with upgrade issues) would really help cement a good
> design up front.
>
> Thanks,
> Dan.
>
>
