db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: Collation implementation WAS Re: Should COLLATION attribute related code go in BasicDatabase?
Date Wed, 21 Mar 2007 17:55:07 GMT


Mamta Satoor wrote:
> 2)At the time of upgrade of pre-10.3 database, we should make sure that 
> derby.database.collation property with value UCS_BASIC is added to 
> services.properties. This is because we do not plan on supporting 
> collation change for existing databases.
Is this required?  How does the code handle a soft upgrade database
where this property is not set?  Could you say what you plan to do
in both the hard and soft upgrade cases?

I was assuming that only new databases would be affected and that
new code would just work on existing databases with no upgrade
changes at all.  For example, a missing collation property
would be interpreted as UCS_BASIC.  And of course old-format SYSCOLUMNS
entries would remain valid, as would old-format conglomerate store metadata.
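
To make the "no property means UCS_BASIC" idea concrete, here is a minimal
sketch.  It is not Derby code: the class and method names are hypothetical,
and the TERRITORY_BASED value used in main is only an assumed example of a
non-default setting.  The point is just that a lookup with a default lets
old databases boot with no upgrade step.

```java
import java.util.Properties;

// Hypothetical helper, not part of Derby: resolves the collation for a
// database whose properties may predate the collation property.
final class CollationDefaults {
    // Property name and default value as discussed in this thread.
    static final String COLLATION_PROPERTY = "derby.database.collation";
    static final String UCS_BASIC = "UCS_BASIC";

    // A missing property is treated as UCS_BASIC, so an existing
    // pre-10.3 database needs no change at upgrade time.
    static String resolveCollation(Properties databaseProperties) {
        return databaseProperties.getProperty(COLLATION_PROPERTY, UCS_BASIC);
    }

    public static void main(String[] args) {
        Properties oldDatabase = new Properties();   // no collation entry
        Properties newDatabase = new Properties();
        // "TERRITORY_BASED" is an assumed example value.
        newDatabase.setProperty(COLLATION_PROPERTY, "TERRITORY_BASED");

        System.out.println(resolveCollation(oldDatabase));
        System.out.println(resolveCollation(newDatabase));
    }
}
```

With something like this, the soft upgrade case and the "never upgraded"
case take the same path, which is what I was hoping for.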
> 
> 
> On 3/20/07, *Mamta Satoor* <msatoor@gmail.com 
> <mailto:msatoor@gmail.com>> wrote:
> 
>     Thanks, Mike and Dan for your responses. Based on this and following
>     from Dan's first mail in this thread
>     ******start of part of Dan's first mail in this thread*******
>     - basic database sets the locale for the DataValueFactory after it
>     boots it, using a new method on DVF
>             void setLocale(Locale locale);
>     ******end of part of Dan's first mail in this thread*******
I may have missed this: is locale information already available
from services.properties?  For the store boot issue, store will provide
the format id and collation id, but I believe you need locale information
to determine the RuleBasedCollator, and it can't depend on anything in
the property conglomerate.
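
A rough sketch of why the locale matters, using only java.text.Collator.
The class name and the integer collation ids below are hypothetical
stand-ins, not the real DataValueFactory interface.  Given a collation id
from store, the collator can only be chosen once the database locale is
already known.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical stand-in for the collator lookup a data value factory
// would need; none of these names come from Derby.
final class CollatorLookup {
    // Assumed collation ids, for illustration only.
    static final int COLLATION_UCS_BASIC = 0;
    static final int COLLATION_TERRITORY_BASED = 1;

    private final Locale databaseLocale;

    // The locale must be supplied up front, before any store request.
    CollatorLookup(Locale databaseLocale) {
        this.databaseLocale = databaseLocale;
    }

    // Returns null for UCS_BASIC, meaning plain String.compareTo ordering.
    Collator collatorFor(int collationType) {
        if (collationType == COLLATION_TERRITORY_BASED) {
            // In the JDK this is usually a RuleBasedCollator.
            return Collator.getInstance(databaseLocale);
        }
        return null;
    }

    // Compares two strings under the requested collation type.
    int compare(int collationType, String left, String right) {
        Collator collator = collatorFor(collationType);
        return (collator == null)
                ? left.compareTo(right)
                : collator.compare(left, right);
    }
}
```

For example, with Locale.ENGLISH the territory based comparison sorts "a"
before "B", while the UCS_BASIC comparison sorts "B" before "a".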

>     we do not need the collation attribute information at the DVF boot
>     time. It is sufficient to have locale info set on DVF at the boot
>     time using setLocale method by basic database. If store code calls
>     DVF to give proper DVD using formatid and collation type, DVF can
>     determine the correct RuleBasedCollator using the locale if the
>     collation type is territory based. So, DVF has everything it needs
>     to find the correct RuleBasedCollator for given collation type.
>      
>     I will go ahead and remove the following requirement from
>     Outstanding items under
>     http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
>     <http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478>
>     1)Add jdbc url attribute COLLATION into services.properties as
>     derby.database.collation property. If no COLLATION is specified at
>     database create time, then have UCS_BASIC as the value for
>     derby.database.collation We need the property in the
>     services.properties rather than properties conglomerate because
>     DataValueFactory <http://wiki.apache.org/db-derby/DataValueFactory>
>     needs this property before store has been booted completely.
>      
>     In addition, I will add an entry as follows under Implemented Items
>     on
>     http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
>     <http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478>
>     At the time of database create time, optional JDBC url attribute
>     COLLATION is validated by the boot code in data dictionary and the
>     validated value of COLLATION(if none specified by user, then it will
>     default to UCS_BASIC which is also the only collation available on
>     pre-10.3 databases) attribute is saved as derby.database.collation
>     property in the properties conglomerate. This work was done by
>     revision 511283
>      
>     As always, any feedback is welcomed,
>     Mamta
>      
>     On 3/20/07, *Mike Matrigali* <mikem_app@sbcglobal.net
>     <mailto:mikem_app@sbcglobal.net>> wrote:
> 
> 
> 
>         Mamta Satoor wrote:
>>  Mike, I am not sure if your question, about how in store DVD with
>>  correct collation type is loaded, was answered or not. In
>         other
>>  words, you had question about following piece of pseudo code
>         from Dan
>>      if (dvd instanceof StringDataValue)
>>              dvd = dvd.getValue(dvf.getCharacterCollator(type));
>>
>>  Let me attempt to answer it. It will help clear up things in
>         my mind too
>>  and make sure that I am understanding this correctly.
>>
>>  Currently,
>>
>         derby.impl.store.access.conglomerate.OpenConglomerateScratchSpace
>         has
>>  get_row_for_export which first gets a class template row using
>>  RowUtil.newClassInfoTemplate This method in RowUtil calls
>>  Monitor.classFromIdentifier to get the InstanceGetter for each
>         of the
>>  format ids identified by store. Once
>>  OpenConglomerateScratchSpace.get_row_for_export has the class
>         template
>>  row, it will call RowUtil.newRowFromClassInfoTemplate. This is the
>>  method, Dan is proposing to modify, ie store should pass an
>         additional
>>  array of int to  RowUtil.newRowFromClassInfoTemplate which
>         will have the
>>  collation type associated with the formatids of the template row.
>>  RowUtil.newRowFromClassInfoTemplate will first get the DVD as
>         it does
>>  today using following
>>                     columns[column_index] =
>>  (DataValueDescriptor)
>         classinfo_template[column_index].getNewInstance();
>>  In addition, it will need to do something like following
>>      if (columns[column_index] instanceof StringDataValue)
>>              dvd =
>>
>         columns[column_index].getValue(dvf.getCharacterCollator(collationTypesForTemplateRows[column_index]));
> 
>         My opinion is that this work should be done in the datavalue
>         factory and
>         not outside.  Dan suggested at one point that some of the work of
>         generating classes/instances should move from Monitor to
>         datavalue factory.
> 
>         So I was assuming something like RowUtil.newClassInfoTemplate
>         instead
>         of calling Monitor.classFromIdentifier(format_ids[i]) get an
>         array of
>         InstanceGetter's, it would call something like
>         datavaluefactory.classFromIdentifier(format_ids[i],
>         collator_ids[i]) -
>         then every InstanceGetter would produce the right type with
>         collator set
>         from then on.
> 
> 
>         Internal to dvf it can do the work of checking for instanceof if it
>         needs to, but because it is inside dvf maybe it can do something
>         smarter .
>>
>>  Dan, let me know if I understood you right. This will help me
>         answer
>>  your question on the Derby wiki page
>>
>         http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
>         <http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478>
>>  <
>         http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
>         <http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478>>
>         I
>>  know that we don't need to get into the implementation code
>         details in
>>  the design phase, but I need to be able to picture this
>         particular case
>>  in my mind to understand where I am going.
>>
>>  thanks,
>>  Mamta
>>
>>
>>  On 3/15/07, *Mike Matrigali* < mikem_app@sbcglobal.net
>         <mailto:mikem_app@sbcglobal.net>
>>  <mailto:mikem_app@sbcglobal.net
>         <mailto:mikem_app@sbcglobal.net>>> wrote:
>>
>>
>>
>>     Daniel John Debrunner wrote:
>>      > Mamta Satoor wrote:
>>      >
>>     ...
>>
>>      >
>>      > - At recovery time the btree uses the collation type and
>         the data
>>     value
>>      > factory to setup its template row array correctly.
>         Something like
>>      >      for each dvd in row array
>>      >         if (dvd instanceof StringDataValue)
>>      >              dvd = dvd.getValue(dvf.getCharacterCollator
>         (type));
>>
>>     Note that the store issue is not just a recovery time
>         issue, templates
>>     are required during normal runtime.  Creation of these
>         templates used
>>     to show up (a long time ago) in performance analysis and
>         work was done
>>     to optimize the performance.  So I am interested in making
>         these
>>     template creations as efficient as possible.
>>
>>     Your proposal above does not look right to me - it could
>         just be I don't
>>     understand where the pseudo code is.  The code I expect in
>         store would
>>     be something like below - letting the datafactory do
>         whatever is right
>>     based on the format id and the collation, if store is going
>         to "own"
>>     knowing
>>     the collation of a given column then I would expect
>         something like:
>>
>>     for each format id in row array
>>         dvd = datavaluefactory.getObject(format id,
>         character_collator_type)
>>
>>     note this means extra overhead for every object creation in
>         the
>>     template.
>>
>>     To me it seems unfortunate to pass in this info per column,
>         when at
>>     least in 10.3 the current code it is one per database.  I
>         saw the
>>     direction as:
>>
>>     o 10.3 only needs one collation per database so hide the
>         info in the
>>       datafactory, basically there is one DEFAULT collation per
>         database.
>>       Thus no need for second argument to
>         datavaluefactory.getObject ()
>>
>>     o future release needs to have different collations per
>         conglomerate,
>>       then at that time we can store a collator type per
>         conglomerate - we
>>       have mechanism today to upgrade on the fly.  If we want
>         to support
>>       adding a collation to an existing database I would
>         suggest continuing
>>       the DEFAULT collation concept with some magic number
>         representing
>>       DEFAULT db collation in the datavaluefactory.getObject ()
>         call - which
>>       would mean use db wide default rather than specify
>         specific one. For
>>       new databases we would not need default, we could at that
>         time
>>     specify
>>       one per conglomerate.
>>       At this point we either change all the
>         datavaluefactory.getObject()
>>       calls to have 2 args and support DEFAULT_VALUE as second
>         argument, or
>>       maybe support both 1 and 2 arg calls - not sure.
>>
>>     o future future release needs to have different collations
>         per column,
>>       then at that time we can store a collator type per column
>         - we
>>     continue to have mechanism to upgrade on fly as long as we
>         can come up
>>     with a default value for old tables.  Same issues as above.
>>
>>
>>
>>      >
>>      > - setting the collation property remains in the data
>         dictionary
>>      >
>>      > - basic database sets the locale for the
>         DataValueFactory after
>>     it boots
>>      > it, using a new method on DVF
>>      >         void setLocale(Locale locale);
>>      >
>>      > I think approaching the problem this way will lead to a
>         cleaner
>>     solution
>>      > in the long term and be somewhat easier to implement.
>>      >
>>      > Thanks,
>>      > Dan.
>>      >
>>      >
>>      >
>>      >
>>      >
>>      >
>>
>>
> 
> 
> 

