Mike asked "Mamta, are you close to an implementation, maybe you could post a
patch so that I could work off of that while discussion continues?"
I hope to be able to post something in a day's time if everything goes fine.
Mamta
On 4/16/07, Mike Matrigali <mikem_app@sbcglobal.net> wrote:
>
> Below as quoted by Mamta are my views on this. I was hoping that
> compares involving collation chars would not require twice the
> number of objects being created.
>
> Below describes how store uses InstanceGetter currently to optimize
> allocation of objects. I was hoping to preserve this performance
> for current non-collation datatypes and also to avoid needing to
> provide any additional collation information after the initial
> dvf.instanceGetterFromIdentifiers call.
>
> Just so I know where we are, Dan do you have a problem with the
> proposed interfaces, ie. are they in the right place and taking
> the right arguments? If so maybe we could incrementally implement
> the interfaces so that I could continue the store side while the
> implementation discussion continues. I would be ok with an initial
> interface change that only supported current collation, so that
> I could at least verify the store changes.
>
> Mamta, are you close to an implementation, maybe you could post a patch
> so that I could work off of that while discussion continues?
>
> Mamta Satoor wrote:
> > Hi Dan,
> >
> > Here are my attempts to answer your questions.
> >
> > "Why use InstanceGetter here?" Because Store wants to call the
> > InstanceGetter once and call getInstance on them multiple times. This is
> > for efficiency reasons. This is what is currently done but through
> > interfaces on Monitor rather than DVF. Mike, maybe you can share your
> > thoughts too on why Store does this.
> >
> >
> > "It doesn't have to return another DVD, it can return itself if it is of
> > the correct type, thus no additional overhead for UCS_BASIC collation.
> > Thus this switch would happen once for the first collation, not every
> > collation, and of course not happen at all if no collation is involved."
> > I agree, but with the InstanceGetter approach, it doesn't even have to
> > happen once, because we will be generating the right DVD in the first
> > place.
> >
> > "Could you show an example of how the store will be calling the code you
> > are describing? Maybe that would help me out."
> > Store would call something like the following (this is copied from what
> > Mike wrote in this same thread, dated April 12th, 2nd mail from Mike,
> > point 3). Again, Mike, if you have more to add from the Store point of
> > view, please do so.
> >
> > Store will call the following once:
> > InstanceGetter = dvf.instanceGetterFromIdentifiers(format id,
> > collation id)
> >
> > Store will call the following many times:
> > dvd = InstanceGetter.getNewInstance()
> >
> > The reason for doing it this way is explained by Mike below
> >
> > "3) optimized allocation, caching some of the work. This is used
> > where one query may generate large number of rows - for instance
> > hash table scan and sorter calls. Here the idea is to do some
> > part of the work once leaving an InstanceGetter which then can
> > repeatedly give back new objects in the most optimized way:
> >
> > again at this point dvd can be used to correctly compare against other
> > dvd's in possible collate specific ways."
> >
> > thanks,
> > Mamta
> > On 4/14/07, Daniel John Debrunner <djd@apache.org> wrote:
> >
> > Mamta Satoor wrote:
> > > Hi Dan,
> > >
> > > The problem we are trying to solve is to provide a way for Store to
> > > call a method (say it's called
> > > getInstanceGetterForFormatIDandCollationType) on DVF with format id &
> > > collation type and get an InstanceGetter for that combination.
> >
> > Why use InstanceGetter here?
> >
> > > Like Mike mentioned in his earlier mail (in this same thread, dated
> > > April 12th, 2nd mail from Mike) with point 3), Store will call this
> > > method once and call getInstance on that InstanceGetter multiple
> > > times to get the right DVD. If we don't change the InstanceGetter as
> > > I suggested, then that would mean that we will be creating 2 DVD
> > > objects for every character DVD through Store code. The worst part is
> > > we will be doing this unnecessary creation of 2 DVDs even for
> > > databases which want default collation. The 2 DVD creations I am
> > > talking about are: first, through InstanceGetter, we will get say
> > > SQLChar. Then at the time of actual collation comparison, it will
> > > have to call something like
> > > StringDataValue.getCollationValue(int collationType) to get another
> > > DVD to make sure that the collation is being performed with the right
> > > DVD.
> >
> > It doesn't have to return another DVD, it can return itself if it is of
> > the correct type, thus no additional overhead for UCS_BASIC collation.
> > Thus this switch would happen once for the first collation, not every
> > collation, and of course not happen at all if no collation is involved.
> >
> > > What I am suggesting does not make InstanceGetter complicated. It is
> > > a pretty simple implementation. All I am proposing is to have a
> > > special InstanceGetter class for collation sensitive DVDs. This new
> > > InstanceGetter class will have a RuleBasedCollator (which will be set
> > > the first time this InstanceGetter is created for the given database
> > > through the DVF) and it will have a collation type (this collation
> > > type will always be set to whatever collation type the
> > > getInstanceGetterForFormatIDandCollationType was called with. This
> > > collation type will determine which kind of DVD to generate, i.e. one
> > > with default collation or one with territory based collation). You
> > > mentioned in your mail that "I got a little lost in the details".
> > > Please let me know where it was unclear and I can try to explain it
> > > better.
> >
> > Could you show an example of how the store will be calling the code you
> > are describing? Maybe that would help me out.
> >
> > >
> > > As for your question about "does it take account of the fact that the
> > > registered format ids are system wide and there can be databases with
> > > different default collations in the same system?" My understanding is
> > > that there is one DVF per database and these InstanceGetters will be
> > > saved on DVF and hence I do not foresee any problems in having
> > > multiple databases with different collations in the same Derby
> > > system.
> >
> > Dan.
> >
> >
>
>
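
The call pattern discussed in this thread (obtain an InstanceGetter once per
format id and collation type, then call getNewInstance() for every row) can be
sketched in Java roughly as follows. This is only an illustration under stated
assumptions, not Derby code: StringValue, BasicString, CollatedString,
ValueFactory, the two collation constants and the format id used in main() are
all hypothetical stand-ins, and the real DVF and DVD interfaces may look quite
different.

```java
import java.text.Collator;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class InstanceGetterSketch {

    /** Hypothetical stand-in for a character DVD; not Derby's real interface. */
    interface StringValue {
        void setValue(String s);
        String getString();
        int compare(StringValue other);
    }

    /** Default collation: plain String.compareTo (code unit order). */
    static class BasicString implements StringValue {
        protected String value;
        public void setValue(String s) { value = s; }
        public String getString() { return value; }
        public int compare(StringValue other) {
            return value.compareTo(other.getString());
        }
    }

    /** Territory based collation: delegates comparison to a shared Collator. */
    static class CollatedString extends BasicString {
        private final Collator collator;
        CollatedString(Collator collator) { this.collator = collator; }
        @Override
        public int compare(StringValue other) {
            return collator.compare(value, other.getString());
        }
    }

    /** Created once by the factory, then asked for new objects many times. */
    interface InstanceGetter {
        StringValue getNewInstance();
    }

    // Hypothetical collation type ids, chosen only for this sketch.
    static final int UCS_BASIC = 0;
    static final int TERRITORY_BASED = 1;

    /** Hypothetical per-database factory that caches one getter per id pair. */
    static class ValueFactory {
        private final Collator collator;
        private final Map<Long, InstanceGetter> getters = new HashMap<>();

        ValueFactory(Locale territory) {
            this.collator = Collator.getInstance(territory);
        }

        InstanceGetter instanceGetterFromIdentifiers(int formatId, int collationType) {
            long key = ((long) formatId << 32) | (collationType & 0xFFFFFFFFL);
            InstanceGetter getter = getters.get(key);
            if (getter == null) {
                // The collation decision is made here, once, so every object
                // handed out later is already of the right type.
                if (collationType == TERRITORY_BASED) {
                    getter = () -> new CollatedString(collator);
                } else {
                    getter = BasicString::new;
                }
                getters.put(key, getter);
            }
            return getter;
        }
    }

    public static void main(String[] args) {
        ValueFactory dvf = new ValueFactory(Locale.ENGLISH);
        int formatId = 1; // made-up format id for the sketch

        // Done once per scan or sort.
        InstanceGetter getter = dvf.instanceGetterFromIdentifiers(formatId, TERRITORY_BASED);

        // Done once per row; no second object is needed for collation.
        StringValue a = getter.getNewInstance();
        StringValue b = getter.getNewInstance();
        a.setValue("a");
        b.setValue("B");
        System.out.println("territory based compare: " + Integer.signum(a.compare(b)));
    }
}
```

With this shape, the default-collation path allocates exactly one object per
value, and the territory based path also allocates one, since the getter
already knows which class to create.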