db-derby-dev mailing list archives

From "Mamta Satoor" <msat...@gmail.com>
Subject Re: Collation implementation WAS Re: Should COLLATION attribute related code go in BasicDatabase?
Date Wed, 21 Mar 2007 04:55:56 GMT
Also, I will fix the following outstanding item on
http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
to say that the collation property should be added to properties
conglomerate and not services.properties.
2) At the time of upgrade of a pre-10.3 database, we should make sure that the
derby.database.collation property with value UCS_BASIC is added to
services.properties. This is because we do not plan on supporting collation
change for existing databases.
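
A minimal sketch of the defaulting and validation rule described in this thread, written against a plain java.util.Properties object rather than Derby's real properties conglomerate. The class name, the method names, and the TERRITORY_BASED value are assumptions made for illustration; only the derby.database.collation property name and the UCS_BASIC default come from the discussion.

```java
import java.util.Properties;

/** Hypothetical helper; not part of Derby's actual code base. */
final class CollationProperty {
    static final String PROPERTY = "derby.database.collation";
    static final String UCS_BASIC = "UCS_BASIC";
    // Assumed name for the territory based collation type mentioned in the thread.
    static final String TERRITORY_BASED = "TERRITORY_BASED";

    /** Validates the COLLATION URL attribute; a missing attribute defaults to UCS_BASIC. */
    static String resolve(String requested) {
        if (requested == null) {
            return UCS_BASIC;
        }
        if (requested.equals(UCS_BASIC) || requested.equals(TERRITORY_BASED)) {
            return requested;
        }
        throw new IllegalArgumentException("Unsupported collation: " + requested);
    }

    /** Create time: save the validated value as a database property. */
    static void storeAtCreate(Properties databaseProperties, String requested) {
        databaseProperties.setProperty(PROPERTY, resolve(requested));
    }

    /** Upgrade time: a pre-10.3 database has no value, so it gets UCS_BASIC. */
    static void ensureOnUpgrade(Properties databaseProperties) {
        if (databaseProperties.getProperty(PROPERTY) == null) {
            databaseProperties.setProperty(PROPERTY, UCS_BASIC);
        }
    }
}
```

Because the upgrade path only fills in a missing value and never overwrites an existing one, the collation of an existing database cannot change, which matches the intent stated above.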


On 3/20/07, Mamta Satoor <msatoor@gmail.com> wrote:
>
> Thanks, Mike and Dan for your responses. Based on this and following from
> Dan's first mail in this thread
> ******start of part of Dan's first mail in this thread*******
> - basic database sets the locale for the DataValueFactory after it boots
> it, using a new method on DVF
>         void setLocale(Locale locale);
> ******end of part of Dan's first mail in this thread*******
> we do not need the collation attribute information at DVF boot time. It
> is sufficient to have locale info set on DVF at the boot time using
> setLocale method by basic database. If store code calls DVF to give proper
> DVD using formatid and collation type, DVF can determine the correct
> RuleBasedCollator using the locale if the collation type is territory based.
> So, DVF has everything it needs to find the correct RuleBasedCollator for
> given collation type.
>
> I will go ahead and remove the following requirement from Outstanding
> items under http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
>
> 1) Add JDBC URL attribute COLLATION into services.properties as
> derby.database.collation property. If no COLLATION is specified at
> database create time, then have UCS_BASIC as the value for
> derby.database.collation. We need the property in services.properties
> rather than the properties conglomerate because
> DataValueFactory <http://wiki.apache.org/db-derby/DataValueFactory> needs
> this property before store has been booted completely.
>
> In addition, I will add an entry as follows under Implemented Items on http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
>
> At database create time, the optional JDBC URL attribute
> COLLATION is validated by the boot code in the data dictionary, and the
> validated value of the COLLATION attribute (if none is specified by the
> user, it defaults to UCS_BASIC, which is also the only collation available
> on pre-10.3 databases) is saved as the
> derby.database.collation property in the properties conglomerate. This
> work was done by revision 511283.
>
> As always, any feedback is welcomed,
> Mamta
>
> On 3/20/07, Mike Matrigali <mikem_app@sbcglobal.net> wrote:
> >
> >
> >
> > Mamta Satoor wrote:
> > > Mike, I am not sure whether your question, about how a DVD with the
> > > correct collation type is loaded in store, was answered. In other
> > > words, you had a question about the following piece of pseudo code from Dan
> > >      if (dvd instanceof StringDataValue)
> > >              dvd = dvd.getValue(dvf.getCharacterCollator(type));
> > >
> > > > Let me attempt to answer it. It will help clear up things in my mind too
> > > > and make sure that I am understanding this correctly.
> > >
> > > > Currently,
> > > > derby.impl.store.access.conglomerate.OpenConglomerateScratchSpace has
> > > > get_row_for_export, which first gets a class template row using
> > > > RowUtil.newClassInfoTemplate. This method in RowUtil calls
> > > > Monitor.classFromIdentifier to get the InstanceGetter for each of the
> > > > format ids identified by store. Once
> > > > OpenConglomerateScratchSpace.get_row_for_export has the class template
> > > > row, it will call RowUtil.newRowFromClassInfoTemplate. This is the
> > > > method Dan is proposing to modify, i.e. store should pass an additional
> > > > array of int to RowUtil.newRowFromClassInfoTemplate, which will have the
> > > > collation type associated with each of the format ids of the template row.
> > > > RowUtil.newRowFromClassInfoTemplate will first get the DVD as it does
> > > > today, using the following:
> > > >      columns[column_index] = (DataValueDescriptor)
> > > >              classinfo_template[column_index].getNewInstance();
> > > > In addition, it will need to do something like the following:
> > > >      if (columns[column_index] instanceof StringDataValue)
> > > >              dvd = columns[column_index].getValue(
> > > >                      dvf.getCharacterCollator(collationTypesForTemplateRows[column_index]));
> >
> > My opinion is that this work should be done in the datavalue factory and
> > not outside.  Dan suggested at one point that some of the work of
> > generating classes/instances should move from Monitor to datavalue
> > factory.
> >
> > So I was assuming that RowUtil.newClassInfoTemplate, instead of
> > calling Monitor.classFromIdentifier(format_ids[i]) to get an array of
> > InstanceGetters, would call something like
> > datavaluefactory.classFromIdentifier(format_ids[i], collator_ids[i]) -
> > then every InstanceGetter would produce the right type with the collator
> > set from then on.
> >
> >
> > Internal to dvf it can do the work of checking for instanceof if it
> > needs to, but because it is inside dvf maybe it can do something smarter.
> > >
> > > Dan, let me know if I understood you right. This will help me answer
> > > your question on the Derby wiki page
> > > > http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
> > > > I know that we don't need to get into the implementation code details in
> > > > the design phase, but I need to be able to picture this particular case
> > > > in my mind to understand where I am going.
> > >
> > > thanks,
> > > Mamta
> > >
> > >
> > > > On 3/15/07, Mike Matrigali <mikem_app@sbcglobal.net> wrote:
> > >
> > >
> > >
> > >     Daniel John Debrunner wrote:
> > >      > Mamta Satoor wrote:
> > >      >
> > >     ...
> > >
> > >      >
> > > >      > - At recovery time the btree uses the collation type and the data value
> > > >      > factory to set up its template row array correctly. Something like
> > > >      >      for each dvd in row array
> > > >      >         if (dvd instanceof StringDataValue)
> > > >      >              dvd = dvd.getValue(dvf.getCharacterCollator(type));
> > >
> > > >     Note that the store issue is not just a recovery time issue; templates
> > > >     are required during normal runtime.  Creation of these templates used
> > > >     to show up (a long time ago) in performance analysis and work was done
> > > >     to optimize the performance.  So I am interested in making these
> > > >     template creations as efficient as possible.
> > >
> > > >     Your proposal above does not look right to me - it could just be I don't
> > > >     understand where the pseudo code is.  The code I expect in store would
> > > >     be something like below - letting the datafactory do whatever is right
> > > >     based on the format id and the collation.  If store is going to "own"
> > > >     knowing the collation of a given column, then I would expect something
> > > >     like:
> > > >
> > > >     for each format id in row array
> > > >         dvd = datavaluefactory.getObject(format id, character_collator_type)
> > > >
> > > >     Note this means extra overhead for every object creation in the
> > > >     template.
> > > >
> > > >     To me it seems unfortunate to pass in this info per column, when at
> > > >     least in 10.3, the current code, it is one per database.  I saw the
> > > >     direction as:
> > >
> > > >     o 10.3 only needs one collation per database so hide the info in the
> > > >       datafactory; basically there is one DEFAULT collation per database.
> > > >       Thus no need for second argument to datavaluefactory.getObject()
> > > >
> > > >     o future release needs to have different collations per conglomerate,
> > > >       then at that time we can store a collator type per conglomerate - we
> > > >       have mechanism today to upgrade on the fly.  If we want to support
> > > >       adding a collation to an existing database I would suggest continuing
> > > >       the DEFAULT collation concept with some magic number representing
> > > >       DEFAULT db collation in the datavaluefactory.getObject() call - which
> > > >       would mean use db wide default rather than specify specific one.  For
> > > >       new databases we would not need default, we could at that time specify
> > > >       one per conglomerate.
> > > >       At this point we either change all the datavaluefactory.getObject()
> > > >       calls to have 2 args and support DEFAULT_VALUE as second argument, or
> > > >       maybe support both 1 and 2 arg calls - not sure.
> > > >
> > > >     o future future release needs to have different collations per column,
> > > >       then at that time we can store a collator type per column - we
> > > >       continue to have mechanism to upgrade on fly as long as we can come up
> > > >       with a default value for old tables.  Same issues as above.
> > >
> > >
> > >
> > >      >
> > >      > - setting the collation property remains in the data dictionary
> > >      >
> > > >      > - basic database sets the locale for the DataValueFactory after it boots
> > > >      > it, using a new method on DVF
> > > >      >         void setLocale(Locale locale);
> > > >      >
> > > >      > I think approaching the problem this way will lead to a cleaner solution
> > > >      > in the long term and be somewhat easier to implement.
> > >      >
> > >      > Thanks,
> > >      > Dan.
> > >      >
> > >      >
> > >      >
> > >      >
> > >      >
> > >      >
> > >
> > >
> >
> >
>
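
The template-row discussion quoted above can be pictured with a rough sketch. It assumes a simplified data value factory that is given the database locale through setLocale after boot and then hands out values by format id and collation type, with a magic DEFAULT value standing for the database-wide collation. Every class name, numeric id, and method other than setLocale and getCharacterCollator is hypothetical and chosen only for illustration; none of this is Derby's actual API.

```java
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

/** Hypothetical stand-in for a string value; not Derby's StringDataValue. */
final class SketchString {
    final RuleBasedCollator collator; // null means UCS_BASIC ordering
    SketchString(RuleBasedCollator collator) { this.collator = collator; }
}

/** Hypothetical stand-in for the data value factory (DVF) discussed in the thread. */
final class SketchDataValueFactory {
    // All numeric ids below are made up for this sketch.
    static final int COLLATION_DEFAULT = -1;        // "use the database-wide default"
    static final int COLLATION_UCS_BASIC = 0;
    static final int COLLATION_TERRITORY_BASED = 1;
    static final int STRING_FORMAT_ID = 100;

    private Locale locale;
    private int databaseCollation = COLLATION_UCS_BASIC;
    private RuleBasedCollator territoryCollator;    // cached so template creation stays cheap

    /** Called by the database after it boots the factory. */
    void setLocale(Locale locale) {
        this.locale = locale;
        this.territoryCollator = null;
    }

    void setDatabaseCollation(int collationType) {
        if (collationType == COLLATION_DEFAULT) {
            throw new IllegalArgumentException("database collation must be concrete");
        }
        this.databaseCollation = collationType;
    }

    /** Returns the collator for a collation type, or null for UCS_BASIC. */
    RuleBasedCollator getCharacterCollator(int collationType) {
        int type = (collationType == COLLATION_DEFAULT) ? databaseCollation : collationType;
        if (type == COLLATION_UCS_BASIC) {
            return null;
        }
        if (type != COLLATION_TERRITORY_BASED) {
            throw new IllegalArgumentException("Unknown collation type: " + type);
        }
        if (locale == null) {
            throw new IllegalStateException("setLocale has not been called");
        }
        if (territoryCollator == null) {
            Collator c = Collator.getInstance(locale);
            if (!(c instanceof RuleBasedCollator)) {
                throw new IllegalStateException("No rule based collator for " + locale);
            }
            territoryCollator = (RuleBasedCollator) c;
        }
        return territoryCollator;
    }

    /** Creates one value; only string format ids care about the collation type. */
    Object getObject(int formatId, int collationType) {
        if (formatId == STRING_FORMAT_ID) {
            return new SketchString(getCharacterCollator(collationType));
        }
        return new Object(); // placeholder for every non-string type
    }

    /** Builds a template row from parallel arrays of format ids and collation types. */
    Object[] newTemplateRow(int[] formatIds, int[] collationTypes) {
        Object[] row = new Object[formatIds.length];
        for (int i = 0; i < formatIds.length; i++) {
            row[i] = getObject(formatIds[i], collationTypes[i]);
        }
        return row;
    }
}
```

Keeping the instanceof-style decision and the collator cache inside the factory means store only passes ids around, and the DEFAULT value lets a caller defer to the database-wide collation without knowing it.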
