db-derby-dev mailing list archives

From "Mamta Satoor" <msat...@gmail.com>
Subject Re: Collation implementation WAS Re: Should COLLATION attribute related code go in BasicDatabase?
Date Wed, 21 Mar 2007 04:44:10 GMT
Thanks, Mike and Dan for your responses. Based on this and following from
Dan's first mail in this thread
******start of part of Dan's first mail in this thread*******
- basic database sets the locale for the DataValueFactory after it boots it,
using a new method on DVF
        void setLocale(Locale locale);
******end of part of Dan's first mail in this thread*******
we do not need the collation attribute information at DVF boot time. It
is sufficient for basic database to set the locale on the DVF at boot time
using the setLocale method. If store code asks the DVF for the proper DVD
using a format id and a collation type, the DVF can determine the correct
RuleBasedCollator from the locale when the collation type is territory based.
So the DVF has everything it needs to find the correct RuleBasedCollator for
a given collation type.
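
To make sure I am picturing this right, here is a minimal sketch of how
the DVF could map a collation type to a collator once the locale has been
set. Only setLocale and getCharacterCollator come from this thread; the
class name, the two integer constants and the choice of returning null
for UCS_BASIC are my assumptions for illustration, not existing Derby code.

```java
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical stand-in for the part of DataValueFactory discussed here.
class CollatorLookupSketch {
    // Assumed numeric codes for the two collation types; not Derby constants.
    static final int COLLATION_TYPE_UCS_BASIC = 0;
    static final int COLLATION_TYPE_TERRITORY_BASED = 1;

    private Locale databaseLocale;

    // Called by basic database after it boots the factory.
    void setLocale(Locale locale) {
        this.databaseLocale = locale;
    }

    // Returns null for UCS_BASIC (plain code point ordering needs no collator),
    // otherwise the collator for the database locale.
    RuleBasedCollator getCharacterCollator(int collationType) {
        if (collationType == COLLATION_TYPE_UCS_BASIC) {
            return null;
        }
        if (collationType == COLLATION_TYPE_TERRITORY_BASED) {
            if (databaseLocale == null) {
                throw new IllegalStateException("setLocale has not been called");
            }
            Collator collator = Collator.getInstance(databaseLocale);
            if (collator instanceof RuleBasedCollator) {
                return (RuleBasedCollator) collator;
            }
            throw new IllegalStateException(
                "No RuleBasedCollator available for " + databaseLocale);
        }
        throw new IllegalArgumentException(
            "Unknown collation type: " + collationType);
    }
}
```

The point of the sketch is that nothing in getCharacterCollator depends on
the COLLATION attribute itself, only on the locale and the collation type
passed in by the caller.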

I will go ahead and remove the following requirement from Outstanding items
under http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478

1) Add the JDBC url attribute COLLATION into services.properties as the
derby.database.collation property. If no COLLATION is specified at database
create time, then have UCS_BASIC as the value for
derby.database.collation. We need the property in
services.properties rather than the properties conglomerate because
DataValueFactory <http://wiki.apache.org/db-derby/DataValueFactory> needs
this property before store has been booted completely.

In addition, I will add an entry as follows under Implemented Items on
http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478

At database create time, the optional JDBC url attribute COLLATION is
validated by the boot code in the data dictionary, and the validated value
of COLLATION (if none is specified by the user, it defaults to UCS_BASIC,
which is also the only collation available on pre-10.3 databases) is saved
as the derby.database.collation property in the properties conglomerate.
This work was done by revision 511283.
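
As a rough illustration of that entry (not the code from revision 511283),
the validate-and-default step could look like the sketch below. The class
and method names, the "collation" attribute key and the TERRITORY_BASED
value name are assumptions on my part; only COLLATION, UCS_BASIC and
derby.database.collation are taken from the text above.

```java
import java.util.Properties;

// Hypothetical helper; the real work lives in the data dictionary boot code.
class CollationAttributeSketch {
    static final String COLLATION_PROPERTY = "derby.database.collation";
    static final String UCS_BASIC = "UCS_BASIC";
    // Assumed spelling of the territory based value.
    static final String TERRITORY_BASED = "TERRITORY_BASED";

    // Returns the value to persist, defaulting to UCS_BASIC when none is given.
    static String validateCollation(String requested) {
        if (requested == null) {
            return UCS_BASIC;
        }
        if (UCS_BASIC.equals(requested) || TERRITORY_BASED.equals(requested)) {
            return requested;
        }
        throw new IllegalArgumentException(
            "Unsupported COLLATION value: " + requested);
    }

    // Validates the create-time attribute and saves it as a database property.
    static void saveCollation(Properties createAttributes,
                              Properties databaseProperties) {
        // "collation" as the attribute key is an assumption for this sketch.
        String validated =
            validateCollation(createAttributes.getProperty("collation"));
        databaseProperties.setProperty(COLLATION_PROPERTY, validated);
    }
}
```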

As always, any feedback is welcome,
Mamta

On 3/20/07, Mike Matrigali <mikem_app@sbcglobal.net> wrote:
>
>
>
> Mamta Satoor wrote:
> > Mike, I am not sure whether your question, about how a DVD with the
> > correct collation type is loaded in store, was answered or not. In other
> > words, you had a question about the following piece of pseudo code from Dan
> >      if (dvd instanceof StringDataValue)
> >              dvd = dvd.getValue(dvf.getCharacterCollator(type));
> >
> > Let me attempt to answer it. It will help clear up things in my mind too
> > and make sure that I am understanding this correctly.
> >
> > Currently,
> > derby.impl.store.access.conglomerate.OpenConglomerateScratchSpace has
> > get_row_for_export, which first gets a class template row using
> > RowUtil.newClassInfoTemplate. This method in RowUtil calls
> > Monitor.classFromIdentifier to get the InstanceGetter for each of the
> > format ids identified by store. Once
> > OpenConglomerateScratchSpace.get_row_for_export has the class template
> > row, it will call RowUtil.newRowFromClassInfoTemplate. This is the
> > method Dan is proposing to modify, i.e. store should pass an additional
> > array of int to RowUtil.newRowFromClassInfoTemplate which will have the
> > collation type associated with the format ids of the template row.
> > RowUtil.newRowFromClassInfoTemplate will first get the DVD as it does
> > today using the following
> >      columns[column_index] = (DataValueDescriptor)
> >              classinfo_template[column_index].getNewInstance();
> > In addition, it will need to do something like following
> >      if (columns[column_index] instanceof StringDataValue)
> >              dvd = columns[column_index].getValue(
> >                      dvf.getCharacterCollator(
> >                              collationTypesForTemplateRows[column_index]));
>
> My opinion is that this work should be done in the datavalue factory and
> not outside.  Dan suggested at one point that some of the work of
> generating classes/instances should move from Monitor to datavalue
> factory.
>
> So I was assuming that RowUtil.newClassInfoTemplate, instead of calling
> Monitor.classFromIdentifier(format_ids[i]) to get an array of
> InstanceGetters, would call something like
> datavaluefactory.classFromIdentifier(format_ids[i], collator_ids[i]) -
> then every InstanceGetter would produce the right type with collator set
> from then on.
>
>
> Internal to dvf it can do the work of checking for instanceof if it
> needs to, but because it is inside dvf maybe it can do something smarter.
> >
> > Dan, let me know if I understood you right. This will help me answer
> > your question on the Derby wiki page
> > http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
> > I know that we don't need to get into the implementation code details in
> > the design phase, but I need to be able to picture this particular case
> > in my mind to understand where I am going.
> >
> > thanks,
> > Mamta
> >
> >
> > On 3/15/07, Mike Matrigali <mikem_app@sbcglobal.net> wrote:
> >
> >
> >
> >     Daniel John Debrunner wrote:
> >      > Mamta Satoor wrote:
> >      >
> >     ...
> >
> >      >
> >      > - At recovery time the btree uses the collation type and the data
> >      > value factory to set up its template row array correctly.
> >      > Something like
> >      >      for each dvd in row array
> >      >         if (dvd instanceof StringDataValue)
> >      >              dvd = dvd.getValue(dvf.getCharacterCollator(type));
> >
> >     Note that the store issue is not just a recovery time issue,
> >     templates are required during normal runtime.  Creation of these
> >     templates used to show up (a long time ago) in performance analysis
> >     and work was done to optimize the performance.  So I am interested
> >     in making these template creations as efficient as possible.
> >
> >     Your proposal above does not look right to me - it could just be I
> >     don't understand where the pseudo code is.  The code I expect in
> >     store would be something like below - letting the datafactory do
> >     whatever is right based on the format id and the collation.  If store
> >     is going to "own" knowing the collation of a given column then I
> >     would expect something like:
> >
> >     for each format id in row array
> >         dvd = datavaluefactory.getObject(format id, character_collator_type)
> >
> >     note this means extra overhead for every object creation in the
> >     template.
> >
> >     To me it seems unfortunate to pass in this info per column when, at
> >     least in the current 10.3 code, it is one per database.  I saw the
> >     direction as:
> >
> >     o 10.3 only needs one collation per database so hide the info in the
> >       datafactory, basically there is one DEFAULT collation per database.
> >       Thus no need for second argument to datavaluefactory.getObject()
> >
> >     o future release needs to have different collations per conglomerate,
> >       then at that time we can store a collator type per conglomerate -
> >       we have mechanism today to upgrade on the fly.  If we want to
> >       support adding a collation to an existing database I would suggest
> >       continuing the DEFAULT collation concept with some magic number
> >       representing DEFAULT db collation in the
> >       datavaluefactory.getObject() call - which would mean use db wide
> >       default rather than specify specific one.  For new databases we
> >       would not need default, we could at that time specify one per
> >       conglomerate.
> >       At this point we either change all the datavaluefactory.getObject()
> >       calls to have 2 args and support DEFAULT_VALUE as second argument,
> >       or maybe support both 1 and 2 arg calls - not sure.
> >
> >     o future future release needs to have different collations per column,
> >       then at that time we can store a collator type per column - we
> >       continue to have mechanism to upgrade on fly as long as we can come
> >       up with a default value for old tables.  Same issues as above.
> >
> >
> >
> >      >
> >      > - setting the collation property remains in the data dictionary
> >      >
> >      > - basic database sets the locale for the DataValueFactory after
> >      > it boots it, using a new method on DVF
> >      >         void setLocale(Locale locale);
> >      >
> >      > I think approaching the problem this way will lead to a cleaner
> >      > solution in the long term and be somewhat easier to implement.
> >      >
> >      > Thanks,
> >      > Dan.
> >      >
> >      >
> >      >
> >      >
> >      >
> >      >
> >
> >
>
>
