db-derby-dev mailing list archives

From "Mamta Satoor" <msat...@gmail.com>
Subject Re: Collation implementation WAS Re: Should COLLATION attribute related code go in BasicDatabase?
Date Wed, 21 Mar 2007 18:25:20 GMT
Actually, let me start by asking a store question. Is store going to write
the collation type column metadata only if the user has requested
TERRITORY_BASED collation? That is, can two newly created 10.3 databases
have different column metadata structures (i.e. with and without collation
type info) depending on whether the user has requested TERRITORY_BASED or
UCS_BASIC collation?

Mamta

On 3/21/07, Mike Matrigali <mikem_app@sbcglobal.net> wrote:
>
>
>
> Mamta Satoor wrote:
> > 2) At the time of upgrade of a pre-10.3 database, we should make sure that
> > the derby.database.collation property with value UCS_BASIC is added to
> > services.properties. This is because we do not plan on supporting
> > collation change for existing databases.
> Is this required?  How does the code handle a soft upgrade database
> where this property is not set?  Could you say what you plan to do
> in both the hard and soft upgrade cases?
>
> I was assuming that only new databases would be affected and that
> somehow new code would just work on existing databases with no upgrade
> changes at all.  So something like no collation property at all
> would be interpreted as UCS_BASIC.  And of course old format SYSCOLUMN
> entries would be valid as well as old format conglomerate store metadata.
> >
> >
> > On 3/20/07, *Mamta Satoor* <msatoor@gmail.com
> > <mailto:msatoor@gmail.com>> wrote:
> >
> >     Thanks, Mike and Dan for your responses. Based on this and following
>
> >     from Dan's first mail in this thread
> >     ******start of part of Dan's first mail in this thread*******
> >     - basic database sets the locale for the DataValueFactory after it
> >     boots it, using a new method on DVF
> >             void setLocale(Locale locale);
> >     ******end of part of Dan's first mail in this thread*******
> I may have missed this, is locale information already available
> from services.properties?  For the store boot issue store will provide
> format id and collation id, but I believe you need locale information
> to determine the RuleBasedCollator and it can't depend on anything in
> the property conglomerate.
>
> >     we do not need the collation attribute information at the DVF boot
> >     time. It is sufficient to have locale info set on DVF at the boot
> >     time using setLocale method by basic database. If store code calls
> >     DVF to give proper DVD using formatid and collation type, DVF can
> >     determine the correct RuleBasedCollator using the locale if the
> >     collation type is territory based. So, DVF has everything it needs
> >     to find the correct RuleBasedCollator for given collation type.
> >
> >     I will go ahead and remove the following requirement from
> >     Outstanding items under
> >     http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
> >
> >     1)Add jdbc url attribute COLLATION into services.properties as
> >     derby.database.collation property. If no COLLATION is specified at
> >     database create time, then have UCS_BASIC as the value for
> >     derby.database.collation. We need the property in the
> >     services.properties rather than properties conglomerate because
> >     DataValueFactory < http://wiki.apache.org/db-derby/DataValueFactory>
> >     needs this property before store has been booted completely.
> >
> >     In addition, I will add an entry as follows under Implemented Items
> >     on
> >     http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
> >
> >     At database create time, the optional JDBC url attribute
> >     COLLATION is validated by the boot code in data dictionary, and the
> >     validated value of COLLATION (if none is specified by the user, it
> >     defaults to UCS_BASIC, which is also the only collation available on
> >     pre-10.3 databases) is saved as the derby.database.collation
> >     property in the properties conglomerate. This work was done by
> >     revision 511283
> >
> >     As always, any feedback is welcomed,
> >     Mamta
> >
> >     On 3/20/07, *Mike Matrigali* < mikem_app@sbcglobal.net
> >     <mailto:mikem_app@sbcglobal.net>> wrote:
> >
> >
> >
> >         Mamta Satoor wrote:
> >>  Mike, I am not sure if your question, about how in store a DVD with
> >>  the correct collation type is loaded, was answered or not. In other
> >>  words, you had a question about the following piece of pseudo code
> >>  from Dan
> >>      if (dvd instanceof StringDataValue)
> >>              dvd = dvd.getValue(dvf.getCharacterCollator(type));
> >>
> >>  Let me attempt to answer it. It will help clear up things in my mind too
> >>  and make sure that I am understanding this correctly.
> >>
> >>  Currently,
> >>  derby.impl.store.access.conglomerate.OpenConglomerateScratchSpace has
> >>  get_row_for_export, which first gets a class template row using
> >>  RowUtil.newClassInfoTemplate. This method in RowUtil calls
> >>  Monitor.classFromIdentifier to get the InstanceGetter for each of the
> >>  format ids identified by store. Once
> >>  OpenConglomerateScratchSpace.get_row_for_export has the class template
> >>  row, it will call RowUtil.newRowFromClassInfoTemplate. This is the
> >>  method Dan is proposing to modify, i.e. store should pass an additional
> >>  array of int to RowUtil.newRowFromClassInfoTemplate which will have the
> >>  collation type associated with the formatids of the template row.
> >>  RowUtil.newRowFromClassInfoTemplate will first get the DVD as it does
> >>  today using the following
> >>      columns[column_index] = (DataValueDescriptor)
> >>          classinfo_template[column_index].getNewInstance();
> >>  In addition, it will need to do something like the following
> >>      if (columns[column_index] instanceof StringDataValue)
> >>          dvd = columns[column_index].getValue(
> >>              dvf.getCharacterCollator(collationTypesForTemplateRows[column_index]));
> >
> >         My opinion is that this work should be done in the datavalue
> >         factory and not outside.  Dan suggested at one point that some of
> >         the work of generating classes/instances should move from Monitor
> >         to datavalue factory.
> >
> >         So I was assuming something like this: RowUtil.newClassInfoTemplate,
> >         instead of calling Monitor.classFromIdentifier(format_ids[i]) to
> >         get an array of InstanceGetters, would call something like
> >         datavaluefactory.classFromIdentifier(format_ids[i], collator_ids[i]);
> >         then every InstanceGetter would produce the right type with the
> >         collator set from then on.
> >
> >
> >         Internal to dvf it can do the work of checking for instanceof if
> >         it needs to, but because it is inside dvf maybe it can do something
> >         smarter.
> >>
> >>  Dan, let me know if I understood you right. This will help me answer
> >>  your question on the Derby wiki page
> >>  http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
> >>  I know that we don't need to get into the implementation code details in
> >>  the design phase, but I need to be able to picture this particular case
> >>  in my mind to understand where I am going.
> >>
> >>  thanks,
> >>  Mamta
> >>
> >>
> >>  On 3/15/07, *Mike Matrigali* < mikem_app@sbcglobal.net
> >         <mailto:mikem_app@sbcglobal.net>
> >>  <mailto:mikem_app@sbcglobal.net
> >         <mailto:mikem_app@sbcglobal.net>>> wrote:
> >>
> >>
> >>
> >>     Daniel John Debrunner wrote:
> >>      > Mamta Satoor wrote:
> >>      >
> >>     ...
> >>
> >>      >
> >>      > - At recovery time the btree uses the collation type and the data
> >>      > value factory to set up its template row array correctly.
> >>      > Something like
> >>      >      for each dvd in row array
> >>      >         if (dvd instanceof StringDataValue)
> >>      >              dvd = dvd.getValue(dvf.getCharacterCollator(type));
> >>
> >>     Note that the store issue is not just a recovery time issue, templates
> >>     are required during normal runtime.  Creation of these templates used
> >>     to show up (a long time ago) in performance analysis and work was done
> >>     to optimize the performance.  So I am interested in making these
> >>     template creations as efficient as possible.
> >>
> >>     Your proposal above does not look right to me - it could just be I
> >>     don't understand where the pseudo code is.  The code I expect in store
> >>     would be something like below - letting the datafactory do whatever is
> >>     right based on the format id and the collation. If store is going to
> >>     "own" knowing the collation of a given column then I would expect
> >>     something like:
> >>
> >>     for each format id in row array
> >>         dvd = datavaluefactory.getObject(format id, character_collator_type)
> >>
> >>     Note this means extra overhead for every object creation in the
> >>     template.
> >>
> >>     To me it seems unfortunate to pass in this info per column, when at
> >>     least in the current 10.3 code it is one per database.  I saw the
> >>     direction as:
> >>
> >>     o 10.3 only needs one collation per database so hide the info in the
> >>       datafactory, basically there is one DEFAULT collation per database.
> >>       Thus no need for second argument to datavaluefactory.getObject()
> >>
> >>     o future release needs to have different collations per conglomerate,
> >>       then at that time we can store a collator type per conglomerate - we
> >>       have mechanism today to upgrade on the fly.  If we want to support
> >>       adding a collation to an existing database I would suggest continuing
> >>       the DEFAULT collation concept with some magic number representing
> >>       DEFAULT db collation in the datavaluefactory.getObject() call - which
> >>       would mean use db wide default rather than specify specific one. For
> >>       new databases we would not need default, we could at that time
> >>       specify one per conglomerate.
> >>       At this point we either change all the datavaluefactory.getObject()
> >>       calls to have 2 args and support DEFAULT_VALUE as second argument, or
> >>       maybe support both 1 and 2 arg calls - not sure.
> >>
> >>     o future future release needs to have different collations per column,
> >>       then at that time we can store a collator type per column - we
> >>       continue to have mechanism to upgrade on fly as long as we can come up
> >>       with a default value for old tables.  Same issues as above.
> >>
> >>
> >>
> >>      >
> >>      > - setting the collation property remains in the data dictionary
> >>      >
> >>      > - basic database sets the locale for the DataValueFactory after it
> >>      > boots it, using a new method on DVF
> >>      >         void setLocale(Locale locale);
> >>      >
> >>      > I think approaching the problem this way will lead to a cleaner
> >>      > solution in the long term and be somewhat easier to implement.
> >>      >
> >>      > Thanks,
> >>      > Dan.
> >>      >
> >>      >
> >>      >
> >>      >
> >>      >
> >>      >
> >>
> >>
> >
> >
> >
>
>
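
For readers following the thread, here is a minimal Java sketch of the idea
discussed above: a factory that knows the database locale and hands back the
right comparison behaviour for a given collation type, treating a missing
collation property as UCS_BASIC. Everything in it is hypothetical and for
illustration only. The class name CollationSketch, the integer type ids and
the method names are assumptions, not Derby's DataValueFactory API, and
String.compareTo (UTF-16 code unit order) merely stands in for UCS_BASIC
ordering.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical sketch; none of these names are Derby's real API.
final class CollationSketch {
    // Assumed collation type ids, chosen only for this example.
    static final int UCS_BASIC = 0;
    static final int TERRITORY_BASED = 1;

    // Locale handed over once at boot, in the spirit of setLocale(Locale).
    private final Locale locale;

    CollationSketch(Locale locale) {
        this.locale = locale;
    }

    // A missing property (e.g. a pre-10.3 database) is read as UCS_BASIC.
    static int collationTypeFromProperty(String value) {
        if (value == null || value.equals("UCS_BASIC")) {
            return UCS_BASIC;
        }
        if (value.equals("TERRITORY_BASED")) {
            return TERRITORY_BASED;
        }
        throw new IllegalArgumentException("Unknown collation: " + value);
    }

    // Returns a locale-specific collator only for territory based collation;
    // null means "use plain binary string comparison".
    Collator collatorFor(int collationType) {
        return collationType == TERRITORY_BASED
                ? Collator.getInstance(locale)
                : null;
    }

    // Compares two strings under the requested collation type.
    int compare(String a, String b, int collationType) {
        Collator collator = collatorFor(collationType);
        return collator == null ? a.compareTo(b) : collator.compare(a, b);
    }

    public static void main(String[] args) {
        CollationSketch sketch = new CollationSketch(Locale.ENGLISH);
        int type = collationTypeFromProperty(null); // no property set
        System.out.println("binary:    " + sketch.compare("a", "B", type));
        System.out.println("territory: "
                + sketch.compare("a", "B", TERRITORY_BASED));
    }
}
```

In this sketch the caller passes only a collation type, and the locale stays
hidden inside the factory object, which is roughly the division of labour
proposed in the thread. Whether the real code should take the type per
column, per conglomerate or per database is exactly the open question above.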
