db-derby-dev mailing list archives

From "Mamta Satoor" <msat...@gmail.com>
Subject Re: how should store get an object based on format id and collation id?
Date Sat, 14 Apr 2007 22:54:02 GMT
Hi Dan,

The problem we are trying to solve is to give Store a method on DVF (say,
getInstanceGetterForFormatIDandCollationType) that takes a format id and a
collation type and returns an InstanceGetter for that combination. As Mike
mentioned in point 3) of his earlier mail (in this same thread, dated
April 12th, 2nd mail from Mike), Store will call this method once and then
call getNewInstance on that InstanceGetter many times to get the right DVD.
If we don't change the InstanceGetter as I suggested, we will end up
creating 2 DVD objects for every character DVD that goes through Store
code. The worst part is that we will do this unnecessary creation of 2 DVDs
even for databases which want default collation. The 2 DVDs I am talking
about are these: first, through the InstanceGetter, we will get, say, a
SQLChar. Then, at the time of the actual collation comparison, it will have
to call something like StringDataValue.getCollationValue(int collationType)
to get another DVD, to make sure that the comparison is performed with the
right DVD.
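
To illustrate the allocation count I am worried about, here is a minimal
sketch in Java. CountingValue and AllocationSketch are hypothetical
stand-ins written only for this mail, not Derby classes, and the sketch
only covers the case where a collator based value is needed at comparison
time.

```java
// Hypothetical stand-in for a character DVD; NOT a real Derby class.
class CountingValue {
    static int created = 0;      // counts every object we allocate
    final boolean collated;

    CountingValue(boolean collated) {
        this.collated = collated;
        created++;
    }

    // Mirrors the idea of getCollationValue(int collationType):
    // the switch to a collator based value allocates a second object.
    CountingValue getCollationValue() {
        return new CountingValue(true);
    }
}

class AllocationSketch {
    // Plain getter first, then a switch at comparison time.
    static int plainThenSwitch() {
        CountingValue.created = 0;
        CountingValue plain = new CountingValue(false);
        CountingValue forCompare = plain.getCollationValue();
        return CountingValue.created;
    }

    // A getter that already knows the collation type.
    static int collationAwareGetter() {
        CountingValue.created = 0;
        CountingValue forCompare = new CountingValue(true);
        return CountingValue.created;
    }

    public static void main(String[] args) {
        System.out.println("plain then switch: " + plainThenSwitch());
        System.out.println("collation aware:   " + collationAwareGetter());
    }
}
```

The first path allocates one object more per value than the second, which
is the overhead I would like to avoid in Store's hot paths.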

What I am suggesting does not make InstanceGetter complicated. It is a
pretty simple implementation. All I am proposing is a special
InstanceGetter class for collation sensitive DVDs. This new InstanceGetter
class will have a RuleBasedCollator (which will be set the first time this
InstanceGetter is created for the given database through the DVF) and a
collation type. The collation type will always be set to whatever collation
type getInstanceGetterForFormatIDandCollationType was called with, and it
will determine which kind of DVD to generate, i.e. one with default
collation or one with territory based collation. You mentioned in your mail
that "I got a little lost in the details". Please let me know where it was
unclear and I can try to explain it better.
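
To make the shape of this special InstanceGetter concrete, here is a
minimal sketch in Java. Everything in it is an assumption made for
illustration: StringValue, PlainChar, CollatorChar, CollationSensitiveGetter
and GetterSketch are hypothetical stand-ins, not the Derby classes, and the
two collation type constants are made-up values. Only
java.text.RuleBasedCollator is a real JDK class.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

// Hypothetical stand-in for a character DVD.
interface StringValue {
    int compare(String a, String b);
}

// Default collation: compares by code point order.
class PlainChar implements StringValue {
    public int compare(String a, String b) { return a.compareTo(b); }
}

// Collation sensitive value: compares with the collator it was given.
class CollatorChar implements StringValue {
    private final RuleBasedCollator collator;
    CollatorChar(RuleBasedCollator collator) { this.collator = collator; }
    public int compare(String a, String b) { return collator.compare(a, b); }
}

// Hypothetical getter holding a collator (set once) and a collation type.
class CollationSensitiveGetter {
    static final int UCS_BASIC = 0;        // made-up constant values
    static final int TERRITORY_BASED = 1;

    private final RuleBasedCollator collatorForDVD;
    private int collationType;

    CollationSensitiveGetter(RuleBasedCollator collator, int collationType) {
        this.collatorForDVD = collator;
        this.collationType = collationType;
    }

    void setCollationType(int collationType) {
        this.collationType = collationType;
    }

    // One allocation per call; the collation type picks the class.
    StringValue getNewInstance() {
        return collationType == UCS_BASIC
                ? new PlainChar()
                : new CollatorChar(collatorForDVD);
    }
}

class GetterSketch {
    // Tiny rule set that sorts "a" before "B", unlike code point order.
    static RuleBasedCollator sampleCollator() {
        try {
            return new RuleBasedCollator("< a < B < c");
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CollationSensitiveGetter getter = new CollationSensitiveGetter(
                sampleCollator(), CollationSensitiveGetter.TERRITORY_BASED);
        // Called many times by the caller, one object each time.
        StringValue v = getter.getNewInstance();
        System.out.println(v.compare("a", "B") < 0);
    }
}
```

The caller asks for the getter once and then only calls getNewInstance(),
so it never needs to know which of the two classes it is receiving.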

As for your question, "does it take account of the fact that the
registered format ids are system wide and there can be databases with
different default collations in the same system?": my understanding is that
there is one DVF per database and these InstanceGetters will be saved on
the DVF, so I do not foresee any problems in having multiple databases
with different collations in the same Derby system.
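
A small sketch of why per-database caching should keep databases apart,
assuming (as I understand it) one factory object per database.
DatabaseValueFactory and getterFor are hypothetical names for this mail
only, and a Comparator<String> stands in for the cached InstanceGetter.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-database factory; NOT the Derby DataValueFactory.
class DatabaseValueFactory {
    private final RuleBasedCollator collator;
    // Cache lives on the factory instance, so it is per database.
    private final Map<Integer, Comparator<String>> cache = new HashMap<>();

    DatabaseValueFactory(String collationRules) {
        try {
            this.collator = new RuleBasedCollator(collationRules);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns the cached getter stand-in, creating it on first use.
    Comparator<String> getterFor(int formatId) {
        return cache.computeIfAbsent(formatId, id -> collator::compare);
    }
}

class PerDatabaseSketch {
    public static void main(String[] args) {
        // Two databases in the same system with different orderings.
        DatabaseValueFactory dbOne = new DatabaseValueFactory("< a < B");
        DatabaseValueFactory dbTwo = new DatabaseValueFactory("< B < a");
        int formatId = 1; // same made-up format id in both databases
        System.out.println(dbOne.getterFor(formatId).compare("a", "B") < 0);
        System.out.println(dbTwo.getterFor(formatId).compare("a", "B") > 0);
    }
}
```

The format id is shared system wide, but each factory answers with its own
collator, so the two databases do not interfere with each other.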

thanks,
Mamta


On 4/14/07, Daniel John Debrunner <djd@apache.org> wrote:
>
> Mamta Satoor wrote:
> > I spent some time on points 1(using Monitor to get dvd directly) and
> > 3(optimized allocation, caching some of the work.) which requires us to
> > solve the problem of how to get the InstanceGetter to return the correct
> > DVD for character types. Let me first briefly describe how the
> > InstanceGetter works for DVDs currently.
>
> I'm a little unclear on exactly the problem this is trying to solve. I
> got a little lost in the details, but does it take account of the fact
> that the registered format ids are system wide and there can be
> databases with different default collations in the same system?
>
> Also the use of InstanceGetters seems to complicate this issue, once one
> knows one has a collation type and one is using the DataValueFactory
> then one can have methods on DVF that return DataValueDescriptors
> directly, no need to go through the indirection of InstanceGetters. They
> are a mechanism used when the type of the object is not known, here the
> type is known as a DVD.
>
> One of the points to note is that the correct DVD type for collation is
> only needed when collation is actually occurring. If a collator based
> column is read in using SQLChar then it's not a problem as long as a
> switch to the collator version occurs during comparisons. Earlier I had
> suggested methods to perform this switch on StringDataType, something
> like getCollationValue(int collationType).
>
> Dan.
>
> >
> > ***********description on InstanceGetter for DVD********
> > I think the code dealing with getting an InstanceGetter for a DVD from a
> > formatid is currently isolated in BaseMonitor.classFromIdentifier(int
> > fmtId). BaseMonitor has a class level field called rc2 which is an array
> > of same length as  StoredFormatIds.TwoByte. The elements in rc2 will be
> > InstanceGetters. Every time BaseMonitor.classFromIdentifier(int fmtId)
> > is called, the method first checks if there is already an InstanceGetter
> > in the rc2 array for the passed format id. If yes, then it simply
> > returns that cached InstanceGetter from rc2. But if this is the first
> > time this method is being called for the passed format id, then we first
> > get the name of the InstanceGetter from RegisteredFormatIds using the
> > format id passed to the method. (For DVDs, the name
> > of that InstanceGetter would be
> > org.apache.derby.iapi.types.DTSClassInfo). Using that name from
> > RegisteredFormatIds, we create a Class object(for DVDs, that Class
> > object would be DTSClassInfo) and check if that Class is of
> > type FormatableInstanceGetter. If yes, then we create an instance of
> > that Class object(for DVDs, this will return an object of type
> > DTSClassInfo) and set the format id on it. And as a last step, we cache
> > this FormatableInstanceGetter in the rc2 array for future. So, in
> > future, if BaseMonitor.classFromIdentifier(int fmtId) gets called for
> > the same fmtId, we can simply return the cached InstanceGetter from rc2.
> > ************************************************************
> >
> > This current code will work fine for non-character type DVDs in Derby
> > 10.3 but it won't work for character type DVDs. For example for the
> > format id corresponding to SQL type CHAR, we want to return DVD of type
> > either SQLChar or CollatorSQLChar, depending on the value of collation
> > type. But existing code will always return SQLChar. What we want is for
> > one format id to represent 2 DVDs and the deciding factor is the
> > collation type. In order to support this, I am proposing following
> > changes to the logic above so that we can have InstanceGetter return the
> > correct DVD, even for character types.
> >
> > **********************************changes proposed to
> > InstanceGetter******************
> > For collation sensitive format ids (those corresponding to character
> > types), I am proposing to create a new InstanceGetter class called
> > CollationSensitiveDTSClassInfo which will extend DTSClassInfo. We will
> > change RegisteredFormatIds.TwoByte for such format ids to use
> > org.apache.derby.iapi.types.CollationSensitiveDTSClassInfo. We will also
> > need to remove the code for collation sensitive format ids from
> > DTSClassInfo since they will be handled in the new InstanceGetter, which
> > is CollationSensitiveDTSClassInfo. This new InstanceGetter class will
> > have two additional fields called collatorForDVD and collationType. And
> > it will have 2 setter methods, namely, setRuleBasedCollator and
> > setCollationType. The public Object getNewInstance() method on this
> > InstanceGetter will have code like following (Note that, I will need to
> > add a new constructor on CollatorSQL.. classes to take just the
> > RuleBasedCollator.)
> >
> >                switch (fmtId) {
> >                 /* Wrappers */
> >                 case StoredFormatIds.SQL_CHAR_ID:
> >                       if (collationType == StringDataValue.UCS_BASIC)
> >                            return new SQLChar();
> >                       else
> >                            return new CollatorSQLChar(collatorForDVD);
> >                 case StoredFormatIds.SQL_VARCHAR_ID:
> >                       if (collationType == StringDataValue.UCS_BASIC )
> >                            return new SQLVarchar();
> >                       else
> >                            return new CollatorSQLVarchar(collatorForDVD);
> >                 case StoredFormatIds.SQL_LONGVARCHAR_ID:
> >                       if (collationType == StringDataValue.UCS_BASIC)
> >                            return new SQLLongvarchar();
> >                       else
> >                            return new CollatorSQLLongvarchar(collatorForDVD);
> >                 case StoredFormatIds.SQL_CLOB_ID:
> >                       if (collationType == StringDataValue.UCS_BASIC)
> >                            return new SQLClob();
> >                       else
> >                            return new CollatorSQLClob(collatorForDVD);
> >                 default: return null;
> >                }
> > The collatorForDVD will need to be set on this new InstanceGetter only
> > the first time around when it is created. If user has requested
> > territory based collation, then collatorForDVD will be set to the
> > Collator that is derived from the database's territory. If user wants
> > UCS_BASIC collation, then collatorForDVD will be set to JVM's default
> > Collator. The collationType is subject to change depending on if store
> > is looking for character types belonging to system tables (such types
> > will always have collation type of UCS_BASIC) or for character types
> > belonging to non-system tables (such types will have the collation type
> > of UCS_BASIC/TERRITORY_BASED depending on what user has requested for
> > the database). Based on this, the logic for
> > DVF.instanceGetterFromIdentifiers(fmtId, collationType) will look as
> > follows.
> >
> > DVF will have a class level field called instanceGettersForFormatIds
> > which will be an array of same length as  StoredFormatIds.TwoByte. The
> > elements in instanceGettersForFormatIds will be InstanceGetters. Every
> > time DVF.instanceGetterFromIdentifiers (int fmtId, int collationType)
> > will be called, the method will first check if there is already an
> > InstanceGetter in the instanceGettersForFormatIds array for the passed
> > format id. If yes, then it will check if the instanceGetter is of type
> > CollationSensitiveDTSClassInfo and if yes, then it will set the
> > collationType on that InstanceGetter to the collationType passed to
> > instanceGetterFromIdentifiers method and it will return that
> > InstanceGetter. If the InstanceGetter is not
> > CollationSensitiveDTSClassInfo, then it will simply return the
> > InstanceGetter obtained from the instanceGettersForFormatIds array.
> >
> > In the case, DVF.instanceGetterFromIdentifiers(int fmtId, int
> > collationType) does not find InstanceGetter cached for the passed format
> > id in instanceGettersForFormatIds array, then it will first get the name
> > of the InstanceGetter from RegisteredFormatIds using the format id
> > passed to the method. (For non-character DVDs, the name
> > of that InstanceGetter would be
> > org.apache.derby.iapi.types.DTSClassInfo. For character DVDs, the name
> > of that InstanceGetter would be
> > org.apache.derby.iapi.types.CollationSensitiveDTSClassInfo). Using that
> > name from RegisteredFormatIds, we will create a Class object(for DVDs,
> > that Class object would be
> > DTSClassInfo/CollationSensitiveDTSClassInfo) and will check if that
> > Class is of type  FormatableInstanceGetter. If yes, then we create an
> > instance of that Class object(for non-character DVDs, this will return
> > an object of type DTSClassInfo. For character DVDs, this will return an
> > object of type CollationSensitiveDTSClassInfo) and set the format id on
> > it. For non-character DVDs, as a last step, we will cache this
> > FormatableInstanceGetter in the instanceGettersForFormatIds array for
> > future. But for character DVDs, we will set the collationType and
> > RuleBasedCollator on the InstanceGetter AND then save it in
> > instanceGettersForFormatIds.
> >
> > As usual, I might have provided lot of information but hopefully it will
> > help understand the logic clearly. I will start looking at implementing
> > this but if anyone has any feedback on the logic, I will appreciate
> > that.
> >
> > thanks,
> > Mamta
> >
> > On 4/12/07, *Mike Matrigali* <mikem_app@sbcglobal.net
> > <mailto:mikem_app@sbcglobal.net>> wrote:
> >
> >
> >
> >     Mamta Satoor wrote:
> >      > Mike, the following code will be part of DataValueFactory and
> >      > hence it will be part of the interface. Please let me know if I
> >      > am not very clear with what I am proposing or if you foresee
> >      > problems with this logic.
> >      > if (dvd instanceof StringDataValue)
> >      >               dvd = dvd.getValue(dvf.getCharacterCollator(type));
> >
> >     My comment isn't really about the logic, I think we are just not
> >     talking about the same area.  I think the code above belongs hidden
> >     behind the new interfaces in the implementation logic of the data
> >     factory and data types, not an example of what callers of the
> >     datatype should be doing.
> >      >
> >      > Also, in the following line below
> >      > "I'll look at building/using DataFactory interface.  It will be
> >      > some"
> >      > you mean DataValueFactory interface, right?
> >      >
> >      > Mamta
> >
> >     Yes I meant DataValueFactory interface.  Let's work together on
> >     getting the DataValueFactory interface right.
> >
> >     So far I have uncovered the basic ways store creates "empty" objects.
> >     Note that store really only needs "empty" objects, ie. it is going
> >     to initialize the state of these objects from disk by calling each
> >     object's readExternal() method.  But we have decided to not store
> >     the collation info as state in the object so somehow we need to get
> >     that info into the empty objects.
> >
> >     The ways store currently creates these objects:
> >
> >     1) using Monitor to get dvd directly:
> >        dvd = Monitor.newInstanceFromIdentifier (format id)
> >
> >        o I think this use is best implemented as Mamta suggests, just
> >          providing a non-static interface on the DataValueFactory.
> >          something like:
> >
> >          DataValueFactory dvf = somehow cache and pass this around store;
> >          dvd = dvf.newInstance(format id, collation id);
> >
> >          at this point dvd can be used to correctly compare against
> >          other dvd's in possible collate specific ways.
> >
> >     2) using existing dvd's class to get a new "empty" dvd that matches
> >        it (which is why it does not call clone).
> >        dvd = dvd.getClass().newInstance()
> >
> >        o less sure about this one.  Seems like we need a new dvd
> >          interface that does the equivalent thing.  I believe the
> >          original code got here because the original store code did not
> >          deal with DVD's it just got objects, so could not make dvd
> >          calls.  There is a getNewNull() interface, anyone know if there
> >          is any runtime work that would be saved over this by creating a
> >          getNewEmpty() interface?
> >
> >          dvd = dvd.getNewEmpty();
> >
> >          at this point dvd can be used to correctly compare against
> >          other dvd's in possible collate specific ways.
> >
> >     3) optimized allocation, caching some of the work.  This is used
> >        where one query may generate large number of rows - for instance
> >        hash table scan and sorter calls.  Here the idea is to do some
> >        part of the work once leaving an InstanceGetter which then can
> >        repeatedly give back new objects in the most optimized way:
> >
> >        called once:
> >        InstanceGetter = Monitor.classFromIdentifier(format id)
> >
> >        called many times:
> >        dvd = InstanceGetter.getNewInstance()
> >
> >        o something like the following would be the direct conversion.
> >          Note that implementation of the Instance getter is probably
> >          more complex now.  It can't just remember a single class and
> >          call new instance on it.  It has to cache some info on what
> >          class to create and what collation to set in it.
> >
> >        called once
> >        DataValueFactory dvf = somehow cache and pass this around store;
> >        InstanceGetter =
> >              dvf.instanceGetterFromIdentifiers(format id, collation id)
> >
> >        called many times:
> >        dvd = InstanceGetter.getNewInstance()
> >
> >     again at this point dvd can be used to correctly compare against
> >     other dvd's in possible collate specific ways.
> >
> >
> >
> >     All 3 of these uses have to be replaced to allow store to create
> >     "correct" types which can be used in possible string comparisons.
> >
> >
> >
> >
>
>
>
