Message-ID: <461FE682.1000108@sbcglobal.net>
Date: Fri, 13 Apr 2007 13:22:26 -0700
From: Mike Matrigali <mikem_app@sbcglobal.net>
Reply-To: mikem_app@sbcglobal.net
User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206)
MIME-Version: 1.0
To: derby-dev@db.apache.org
Subject: Re: how should store get an object based on format id and collation id?


Mamta Satoor wrote:
> Hi Mike,
>  
> I didn't quite understand the following
> *****************Mike wrote**************
> 2) using existing dvd's class to get a new "empty" dvd that matches it
>    (which is why it does not call clone).
>    dvd = dvd.getClass().newInstance()
> 
>    o less sure about this one.  Seems like we need a new dvd interface
>      that does the equivalent thing.  I believe the original code got
>      here because the original store code did not deal with DVD's it
>      just got objects, so could not make dvd calls.  There is a
>      getNewNull() interface, anyone know if there is any runtime work
>      that would be saved over this by creating a
>      getNewEmpty() interface?
> 
>     dvd = dvd.getNewEmpty();
> ******************************************
> What I think you are saying is
> "There is someplace in the Derby code, where we do 
> dvd.getClass().newInstance(). And the reason it is done this way is that 
> the calling code does not know that it is dealing with a dvd object.
The code now knows that it is dealing with a dvd object.  The original
implementation probably did not, which is why it currently uses
getClass() rather than a call on the dvd.  I did some searching
through the code, and it looks like some sort code would also benefit
from a getNewEmpty() interface if it is faster than the current
getClone interface.  It looks like code today calls getClone there
to create new empty objects that will then be read from disk, so it pays
unnecessary overhead - especially for datatypes that may be doing
additional object allocations to maintain internal state.
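
To make the difference concrete, here is a rough sketch of what I have in
mind.  Everything in it (Value, CharValue, the collationId field) is a
made-up stand-in for illustration, not the real Derby classes, and
getNewEmpty() is only the proposed name, not an existing interface.  It
just shows that a clone copies the internal state while an "empty" object
keeps the type and collation but skips that copy:

```java
import java.util.Arrays;

// Illustrative stand-ins only; these are NOT Derby's real classes.
class GetNewEmptySketch {

    interface Value {
        Value getClone();      // copies the current value as well
        Value getNewEmpty();   // hypothetical: same type and collation, no value copied
        void setValue(String s);
        String getString();
        int getCollationId();
    }

    static final class CharValue implements Value {
        // Assumption: collation lives in the object, not in the on-disk format.
        private final int collationId;
        private char[] chars;  // internal state; null means "empty"

        CharValue(int collationId) { this.collationId = collationId; }

        @Override public Value getClone() {
            CharValue copy = new CharValue(collationId);
            if (chars != null) {
                // Extra allocation that is wasted if the caller overwrites it from disk.
                copy.chars = Arrays.copyOf(chars, chars.length);
            }
            return copy;
        }

        @Override public Value getNewEmpty() {
            // Single allocation; the value is expected to be filled in later.
            return new CharValue(collationId);
        }

        @Override public void setValue(String s) {
            chars = (s == null) ? null : s.toCharArray();
        }

        @Override public String getString() {
            return (chars == null) ? null : new String(chars);
        }

        @Override public int getCollationId() { return collationId; }
    }

    public static void main(String[] args) {
        Value template = new CharValue(1);
        template.setValue("abc");
        System.out.println(template.getClone().getString());
        System.out.println(template.getNewEmpty().getString());
        System.out.println(template.getNewEmpty().getCollationId());
    }
}
```

Whether this actually saves anything measurable over getNewNull() is the
open question above; the sketch only shows where the extra allocation in a
clone comes from.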

> Maybe that code can now check if it is dealing with a StringDataValue 
> and if so, then have the code call dvd.getNewEmpty which will be defined 
> only on StringDataValue. 
I don't think code outside of the datafactory should do this kind of
stuff.  It seems cleaner if dvd's provide a single interface for all
datatypes, so that callers do not have to check the type before making a
subsequent call.  At least in this case it is possible for store to
make the call; in the other case store will only have a format id, so
there is no real way to ask the type anything.  And I would really like to
avoid the case where we might have to do 2 object allocations to get a
correct collation type (i.e. one allocation based on the format id, then a
check of something, and then a request for another object based on the
current object).  The goal for 1, 2 and 3 should be a single object
allocation, given the information provided, that returns a correct object
with correct collation info, with little or no overhead for datatypes that
don't really care about collation info.
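
As a rough sketch of the "single allocation, no type checks in the caller"
goal: the format ids, class names and the ValueFactory below are all
invented for illustration (the real interface would be whatever we agree
on for DataValueFactory).  The caller hands over both ids and gets back a
finished object; types that do not care about collation simply ignore the
second argument:

```java
// Hypothetical sketch; ids, classes and the factory are invented for illustration.
class SingleAllocationFactorySketch {

    interface Value { int getCollationId(); }

    // A type that does not care about collation.
    static final class IntValue implements Value {
        @Override public int getCollationId() { return 0; }
    }

    // A type whose comparisons depend on collation.
    static final class CharValue implements Value {
        private final int collationId;
        CharValue(int collationId) { this.collationId = collationId; }
        @Override public int getCollationId() { return collationId; }
    }

    static final int FORMAT_INT = 1;   // made-up format ids
    static final int FORMAT_CHAR = 2;

    static final class ValueFactory {
        // One call, one allocation; the caller never checks the type itself.
        Value newInstance(int formatId, int collationId) {
            switch (formatId) {
                case FORMAT_INT:
                    return new IntValue();             // collation ignored, no extra cost
                case FORMAT_CHAR:
                    return new CharValue(collationId); // collation set at construction
                default:
                    throw new IllegalArgumentException("unknown format id: " + formatId);
            }
        }
    }

    public static void main(String[] args) {
        ValueFactory factory = new ValueFactory();
        Value v = factory.newInstance(FORMAT_CHAR, 5);
        System.out.println(v.getClass().getSimpleName() + " collation=" + v.getCollationId());
    }
}
```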


> The getNewEmpty() method will copy the
> RuleBasedCollator info to a new instance of StringDataValue and return 
> that." Did I understand you right? Also, I am just curious where is this 
> code dvd.getClass().newInstance() right now?
The code is in 
java/engine/org/apache/derby/impl/store/access/conglomerate/TemplateRow.java!newRow()

The comment says that it is more efficient to allocate new objects based
on an existing template object than to call the monitor.  This was the
observation from measurements taken a long time ago - no idea if it
still holds on the newest JVMs.
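
For anyone who has not looked at that code, the pattern is roughly the
following.  This is a simplified sketch with made-up classes, not the
actual TemplateRow source.  It also shows why the pattern is a problem
for collation: a reflective no-arg construction has nowhere to get the
collation from, so in this sketch the new object falls back to the
default.  (Class.newInstance() is deprecated since Java 9, so the sketch
uses getDeclaredConstructor().newInstance() instead.)

```java
// Simplified sketch of the template-row pattern; NOT the real TemplateRow code.
class TemplateRowSketch {

    // Made-up column type; collationId 0 stands for "default" in this sketch.
    static class CharValue {
        int collationId;
        String value;
        public CharValue() {}   // no-arg constructor used by reflection
    }

    // Builds a row of new, empty objects with the same classes as the template.
    static Object[] newRow(Object[] template) {
        Object[] row = new Object[template.length];
        try {
            for (int i = 0; i < template.length; i++) {
                // Modern equivalent of template[i].getClass().newInstance()
                row[i] = template[i].getClass().getDeclaredConstructor().newInstance();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not instantiate template column", e);
        }
        return row;
    }

    public static void main(String[] args) {
        CharValue col = new CharValue();
        col.collationId = 3;
        col.value = "abc";
        CharValue fresh = (CharValue) newRow(new Object[] { col })[0];
        // The class is preserved, but value and collation are not carried over.
        System.out.println(fresh.value + " collation=" + fresh.collationId);
    }
}
```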


>  
> thanks,
> Mamta
>  
> On 4/12/07, *Mike Matrigali* <mikem_app@sbcglobal.net 
> <mailto:mikem_app@sbcglobal.net>> wrote:
> 
> 
> 
>     Mamta Satoor wrote:
>      > Mike, the following code will be part of DataValueFactory and
>     hence it
>      > will be part of the interface. Please let me know if I am not
>     very clear
>      > with what I am proposing or if you foresee problems with this logic.
>      > if (dvd instanceof StringDataValue)
>      >               dvd = dvd.getValue(dvf.getCharacterCollator(type));
> 
>     My comment isn't really the logic, I think we are just not talking about
>     the same area.  I think the code above belongs hidden behind the new
>     interfaces in the implementation logic of the data factory and data
>     types, not an example of what callers of the datatype should be doing.
>      >
>      > Also, in the following line below
>      > "I'll look at building/using DataFactory interface.  It will be some"
>      > you mean DataValueFactory interface, right?
>      >
>      > Mamta
> 
>     Yes I meant DataValueFactory interface.  Let's work together on getting
>     the DataValueFactory interface right.
> 
>     So far I have uncovered the basic ways store creates "empty" objects.
>     Note that store really only needs "empty" objects, i.e. it is going
>     to initialize the state of these objects from disk by calling each
>     object's readExternal() method.  But we have decided to not store
>     the collation info as state in the object so somehow we need to get
>     that info into the empty objects.
> 
>     The ways store currently creates these objects:
> 
>     1) using Monitor to get dvd directly:
>        dvd = Monitor.newInstanceFromIdentifier (format id)
> 
>        o I think this use is best implemented as Mamta suggests, just
>          providing a non-static interface on the DataValueFactory.
>          something like:
> 
>          DataValueFactory dvf = somehow cache and pass this around store;
>          dvd = dvf.newInstance(format id, collation id);
> 
>          at this point dvd can be used to correctly compare against other
>          dvd's in possible collate specific ways.
> 
>     2) using existing dvd's class to get a new "empty" dvd that matches it
>        (which is why it does not call clone).
>        dvd = dvd.getClass().newInstance()
> 
>        o less sure about this one.  Seems like we need a new dvd interface
>          that does the equivalent thing.  I believe the original code got
>          here because the original store code did not deal with DVD's it
>          just got objects, so could not make dvd calls.  There is a
>          getNewNull() interface, anyone know if there is any runtime work
>          that would be saved over this by creating a
>          getNewEmpty() interface?
> 
>         dvd = dvd.getNewEmpty();
> 
>         at this point dvd can be used to correctly compare against other
>          dvd's in possible collate specific ways.
> 
>     3) optimized allocation, caching some of the work.  This is used
>        where one query may generate large number of rows - for instance
>        hash table scan and sorter calls.  Here the idea is to do some
>        part of the work once leaving an InstanceGetter which then can
>        repeatedly give back new objects in the most optimized way:
> 
>        called once:
>        InstanceGetter = Monitor.classFromIdentifier(format id)
> 
>        called many times:
>        dvd = InstanceGetter.getNewInstance()
> 
>        o something like the following would be the direct conversion.  Note
>          that implementation of the Instance getter is probably more complex
>          now.  It can't just remember a single class and call new instance
>          on it.  It has to cache some info on what class to create and what
>          collation to set in it.
> 
>        called once
>        DataValueFactory dvf = somehow cache and pass this around store;
>        InstanceGetter =
>              dvf.instanceGetterFromIdentifiers(format id, collation id)
> 
>        called many times:
>        dvd = InstanceGetter.getNewInstance()
> 
>     again at this point dvd can be used to correctly compare against other
>          dvd's in possible collate specific ways.
> 
> 
> 
>     All 3 of these uses have to be replaced to allow store to create
>     "correct" types which can be used in possible string comparisons.
> 
> 
> 
>
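
For use 3 in the quoted list, a rough sketch of an instance getter that
remembers both the class and the collation might look like this.  All of
the names are placeholders for illustration (including the format ids and
the static instanceGetterFromIdentifiers method); the point is only that
the lookup on (format id, collation id) happens once, and each later call
is a single allocation:

```java
// Hypothetical sketch; ids, classes and method names are placeholders.
class InstanceGetterSketch {

    interface Value { int getCollationId(); }

    static final class IntValue implements Value {
        @Override public int getCollationId() { return 0; }
    }

    static final class CharValue implements Value {
        private final int collationId;
        CharValue(int collationId) { this.collationId = collationId; }
        @Override public int getCollationId() { return collationId; }
    }

    interface InstanceGetter {
        Value getNewInstance();
    }

    static final int FORMAT_INT = 1;   // made-up format ids
    static final int FORMAT_CHAR = 2;

    // Called once: resolves which class to create and which collation to set.
    static InstanceGetter instanceGetterFromIdentifiers(int formatId, int collationId) {
        switch (formatId) {
            case FORMAT_INT:
                return IntValue::new;
            case FORMAT_CHAR:
                return () -> new CharValue(collationId); // collation captured once
            default:
                throw new IllegalArgumentException("unknown format id: " + formatId);
        }
    }

    public static void main(String[] args) {
        InstanceGetter getter = instanceGetterFromIdentifiers(FORMAT_CHAR, 9);
        // Called many times: no lookup, one allocation per call.
        for (int i = 0; i < 3; i++) {
            System.out.println(getter.getNewInstance().getCollationId());
        }
    }
}
```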