db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: Collation implementation WAS Re: Should COLLATION attribute related code go in BasicDatabase?
Date Thu, 15 Mar 2007 17:37:43 GMT
I think I am missing/not understanding your direction.

Are there still 4 new types?

/mikem

Daniel John Debrunner wrote:
> Mamta Satoor wrote:
> 
>> Ok, so I spent some time trying to move the COLLATION attribute code from
>> DataDictionaryImpl.boot to DataValueFactoryImpl.boot. I thought I
>> could simply put the following piece of code in the
>> DataValueFactoryImpl.boot method and Property.COLLATION would get saved
>> in the properties conglomerate.
> 
> 
> I think some of this goes back to the intended implementation.
> 
> The intended implementation seems to be that there will be variants of 
> the four character datatypes with locale based collation. This is four 
> new (internal) datatypes in Derby that share most code with the existing 
> CHAR, VARCHAR, LONG VARCHAR and CLOB types.
> 
> I'm not sure this is the correct approach.
> 
> My first thought is that this doesn't scale and doesn't seem like an OO
> solution. Thinking ahead, this means any additional collation style will
> also add four new datatypes, which means there could easily be sixteen
> or more datatypes to represent the characters. Each datatype will come
> with some code cost, classes and/or methods per type.
> 
> My second concern is that many places get characters and the change must
> ensure they get the correct datatype. Apart from potentially being a lot
> of work, the chance of missing some or picking the wrong character types
> seems high.
> 
> What is really required is 'character type + collation'. I've been
> thinking that looking at the problem in this way may make it more
> manageable and easier to contain, with the main idea being to only worry
> about collation type when actually performing a collation. So some
> initial ideas:
> 
> - collation is an attribute of DataTypeDescriptor, not valid for non
> character types, 0 for UCS_BASIC, 1 for UNICODE etc.
>        int getCollationType();
> 
> - A method on DataValueFactory, returns null if type is UCS_BASIC
>        RuleBasedCollator getCharacterCollator(int type)
> 
> - A method on StringDataValue
>        StringDataValue getValue(RuleBasedCollator collator)
> 
>        For SQLChar:
>             getValue(null) would return itself
>             getValue(non-null) would return a new CollatorSQLChar() with
> the value of the SQLChar and the collator set.
>
>        For CollatorSQLChar:
>            getValue(null) would return a new SQLChar() with the value of
> the CollatorSQLChar
>            getValue(non-null) would return itself with the collator set
> correctly.
> 
> - The collation type (the integer) is written into the meta-data for an 
> index just as ascending/descending is today (including the btree control 
> row, thus making the information available for recovery). Collation type 
> applies to all character columns in the index.
> 
> - At SQL compilation time, the code generation sets up the various types
> correctly using the new methods.
> 
> - At recovery time the btree uses the collation type and the data value
> factory to set up its template row array correctly. Something like:
>      for each dvd in row array
>         if (dvd instanceof StringDataValue)
>              dvd = dvd.getValue(dvf.getCharacterCollator(type));
>
> - setting the collation property remains in the data dictionary
> 
> - basic database sets the locale for the DataValueFactory after it boots 
> it, using a new method on DVF
>         void setLocale(Locale locale);
> 
> I think approaching the problem this way will lead to a cleaner solution 
> in the long term and be somewhat easier to implement.
> 
> Thanks,
> Dan.
> 
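
The ideas quoted above can be sketched in Java roughly as follows. This is a minimal, self-contained illustration under stated assumptions, not Derby code: the StringDataValue, SQLChar, CollatorSQLChar and DataValueFactory types here are simplified stand-ins that borrow the names from the message, the compare() method and the CollationSketch wrapper class are hypothetical, and an Object[] stands in for the btree template row array. Only java.text.Collator, java.text.RuleBasedCollator and java.util.Locale are real library APIs.

```java
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

public class CollationSketch {
    // Collation type codes as suggested in the message: 0 = UCS_BASIC, 1 = UNICODE.
    static final int UCS_BASIC = 0;
    static final int UNICODE = 1;

    /** Simplified stand-in for a string value; compare() is a hypothetical helper. */
    interface StringDataValue {
        String getString();
        StringDataValue getValue(RuleBasedCollator collator);
        int compare(StringDataValue other);
    }

    /** Plain value: binary (code unit) ordering, no collator. */
    static class SQLChar implements StringDataValue {
        private final String value;
        SQLChar(String value) { this.value = value; }
        public String getString() { return value; }
        public StringDataValue getValue(RuleBasedCollator collator) {
            // null collator: stay as-is; otherwise switch to the collating variant
            return collator == null ? this : new CollatorSQLChar(value, collator);
        }
        public int compare(StringDataValue other) {
            return value.compareTo(other.getString());
        }
    }

    /** Collating variant: same value, ordering delegated to the collator. */
    static class CollatorSQLChar implements StringDataValue {
        private final String value;
        private RuleBasedCollator collator;
        CollatorSQLChar(String value, RuleBasedCollator collator) {
            this.value = value;
            this.collator = collator;
        }
        public String getString() { return value; }
        public StringDataValue getValue(RuleBasedCollator collator) {
            if (collator == null) {
                return new SQLChar(value); // back to the plain variant
            }
            this.collator = collator;      // keep this object, update its collator
            return this;
        }
        public int compare(StringDataValue other) {
            return collator.compare(value, other.getString());
        }
    }

    /** Stand-in factory: locale is set after boot, collator looked up by type. */
    static class DataValueFactory {
        private Locale locale = Locale.ROOT;
        void setLocale(Locale locale) { this.locale = locale; }
        RuleBasedCollator getCharacterCollator(int type) {
            if (type == UCS_BASIC) {
                return null;
            }
            Collator c = Collator.getInstance(locale);
            // Collator.getInstance is not guaranteed to return a RuleBasedCollator.
            if (!(c instanceof RuleBasedCollator)) {
                throw new IllegalStateException("no rule based collator for " + locale);
            }
            return (RuleBasedCollator) c;
        }
    }

    /** The recovery-time loop: fix up every string column in the template row. */
    static void applyCollation(Object[] row, DataValueFactory dvf, int type) {
        RuleBasedCollator collator = dvf.getCharacterCollator(type);
        for (int i = 0; i < row.length; i++) {
            if (row[i] instanceof StringDataValue) {
                row[i] = ((StringDataValue) row[i]).getValue(collator);
            }
        }
    }

    public static void main(String[] args) {
        DataValueFactory dvf = new DataValueFactory();
        dvf.setLocale(Locale.ENGLISH);
        Object[] row = { Integer.valueOf(42), new SQLChar("a") };
        StringDataValue upperB = new SQLChar("B");

        StringDataValue plain = (StringDataValue) row[1];
        System.out.println("binary sign:   " + Integer.signum(plain.compare(upperB)));

        applyCollation(row, dvf, UNICODE);
        StringDataValue collated = (StringDataValue) row[1];
        System.out.println("collated sign: " + Integer.signum(collated.compare(upperB)));
    }
}
```

In this sketch the number of value classes stays at two per character type regardless of how many collation types exist, because the collation type only selects which collator is handed to getValue(). Non-string columns in the row are left untouched.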

