Message-ID: <461FE682.1000108@sbcglobal.net>
Date: Fri, 13 Apr 2007 13:22:26 -0700
From: Mike Matrigali
Reply-To: mikem_app@sbcglobal.net
To: derby-dev@db.apache.org
Subject: Re: how should store get an object based on format id and collation id?
References: <461D6E77.9000500@sbcglobal.net> <461D8825.2070204@sbcglobal.net> <461E5E79.8030605@sbcglobal.net> <461E9B3F.5060106@sbcglobal.net>

Mamta Satoor wrote:
> Hi Mike,
>
> I didn't quite understand following
> *****************Mike wrote**************
> 2) using existing dvd's class to get a new "empty" dvd that matches it
>    (which is why it does not call clone).
>        dvd = dvd.getClass().newInstance()
>
>    o less sure about this one.  Seems like we need a new dvd interface
>      that does the equivalent thing.  I believe the original code got
>      here because the original store code did not deal with DVD's it
>      just got objects, so could not make dvd calls.  There is a
>      getNewNull() interface, anyone know if there is any runtime work
>      that would be saved over this by creating a
>      getNewEmpty() interface?
>
>        dvd = dvd.getNewEmpty();
> ******************************************
> What I think you are saying is
> "There is someplace in the Derby code, where we do
> dvd.getClass().newInstance(). And the reason it is done this way is that
> the calling code does not know that it is dealing with a dvd object.

The code now knows that it is dealing with a dvd object; the original
implementation probably did not, which is why it currently uses
getClass() rather than a call on the dvd.

I did some searching through the code and it looks like some sort code
would also benefit from a getNewEmpty() interface if it is faster than
the current getClone interface.
It looks like code today is calling getClone there to create new empty
objects which will be read from disk, and so is paying unnecessary
overhead - especially for datatypes that may be doing additional object
allocations to maintain internal state.

> Maybe that code can now check if it is dealing with a StringDataValue
> and if so, then have the code call dvd.getNewEmpty which will be defined
> only on StringDataValue.

I don't think code outside of the datafactory should do this kind of
stuff.  It seems cleaner if dvd's provide a single interface for all
datatypes, and callers should not be checking type before making a
subsequent call.  At least in this case it is possible for store to make
the call; in the other case store will only have a format id, so there is
no real way to ask the type anything.  And I would really like to avoid
the case where we might have to do 2 object allocations to get a correct
collation type (ie. one allocation on the format id, and then check
something and then ask for another object based on the current object).

The goal for 1, 2 and 3 should be a single object allocation given the
information provided and returning a correct object with correct
collation info, with little or no overhead to datatypes that don't
really care about collation info.

> The getNewEmpty() method will copy the
> RuleBasedCollator info to a new instance of StringDataValue and return
> that." Did I understand you right? Also, I am just curious where is this
> code dvd.getClass().newInstance() right now?

The code is in
java/engine/org/apache/derby/impl/store/access/conglomerate/TemplateRow.java!newRow()
The comment says that it is more efficient to allocate new objects based
on an existing template object than to call the monitor.  This was the
observation from measurements a long time ago - no idea if they are still
valid in the newest JVM's.
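
To make the getNewEmpty() idea concrete, here is a minimal sketch and not
Derby code.  It assumes a hypothetical Dvd interface standing in for
DataValueDescriptor, and made-up IntValue and CollatedString types.  The
point it tries to show is that one call on the existing dvd can hand back
a single new empty object that already carries the collation info, with no
type check in the caller and no extra work for types that ignore
collation.

```java
import java.text.Collator;
import java.util.Locale;

public class GetNewEmptySketch {

    /** Hypothetical stand-in for DataValueDescriptor; not the real interface. */
    interface Dvd {
        /** Returns a new empty value of the same type, ready for readExternal(). */
        Dvd getNewEmpty();
    }

    /** A type that does not care about collation pays nothing extra. */
    static final class IntValue implements Dvd {
        public Dvd getNewEmpty() {
            return new IntValue();
        }
    }

    /** A string type whose collator is carried in memory, not stored on disk. */
    static final class CollatedString implements Dvd {
        final Collator collator;
        String value; // left empty here; store would fill it from disk

        CollatedString(Collator collator) {
            this.collator = collator;
        }

        public Dvd getNewEmpty() {
            // One allocation; the collator reference is simply handed on.
            return new CollatedString(collator);
        }
    }

    /** Roughly what a template-based newRow() could do instead of reflection. */
    static Dvd[] newRow(Dvd[] template) {
        Dvd[] row = new Dvd[template.length];
        for (int i = 0; i < template.length; i++) {
            row[i] = template[i].getNewEmpty();
        }
        return row;
    }

    public static void main(String[] args) {
        Dvd[] template = {
            new IntValue(),
            new CollatedString(Collator.getInstance(Locale.FRENCH))
        };
        Dvd[] row = newRow(template);
        CollatedString copy = (CollatedString) row[1];
        System.out.println("same collator: "
            + (copy.collator == ((CollatedString) template[1]).collator));
    }
}
```

Whether this is actually cheaper than getClass().newInstance() or getClone
would need measuring; the sketch only shows the shape of the call.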
>
> thanks,
> Mamta
>
> On 4/12/07, *Mike Matrigali* wrote:
>
> Mamta Satoor wrote:
> > Mike, the following code will be part of DataValueFactory and hence it
> > will be part of the interface. Please let me know if I am not very clear
> > with what I am proposing or if you foresee problems with this logic.
> > if (dvd instanceof StringDataValue)
> >     dvd = dvd.getValue(dvf.getCharacterCollator(type));
>
> My comment isn't really about the logic, I think we are just not talking
> about the same area.  I think the code above belongs hidden behind the new
> interfaces in the implementation logic of the data factory and data
> types, not an example of what callers of the datatype should be doing.
>
> > Also, in the following line below
> > "I'll look at building/using DataFactory interface.  It will be some"
> > you mean DataValueFactory interface, right?
> >
> > Mamta
>
> Yes I meant DataValueFactory interface.  Let's work together on getting
> the DataValueFactory interface right.
>
> So far I have uncovered three basic ways store creates "empty" objects.
> Note that store really only needs "empty" objects, ie. it is going
> to initialize the state of these objects from disk by calling each
> object's readExternal() method.  But we have decided to not store
> the collation info as state in the object so somehow we need to get
> that info into the empty objects.
>
> The ways store currently creates these objects:
>
> 1) using Monitor to get dvd directly:
>        dvd = Monitor.newInstanceFromIdentifier (format id)
>
>    o I think this use is best implemented as Mamta suggests, just
>      providing a non-static interface on the DataValueFactory.
>      something like:
>
>        DataValueFactory dvf = somehow cache and pass this around store;
>        dvd = dvf.newInstance(format id, collation id);
>
>      at this point dvd can be used to correctly compare against other
>      dvd's in possible collate specific ways.
>
> 2) using existing dvd's class to get a new "empty" dvd that matches it
>    (which is why it does not call clone).
>        dvd = dvd.getClass().newInstance()
>
>    o less sure about this one.  Seems like we need a new dvd interface
>      that does the equivalent thing.  I believe the original code got
>      here because the original store code did not deal with DVD's it
>      just got objects, so could not make dvd calls.  There is a
>      getNewNull() interface, anyone know if there is any runtime work
>      that would be saved over this by creating a
>      getNewEmpty() interface?
>
>        dvd = dvd.getNewEmpty();
>
>      at this point dvd can be used to correctly compare against other
>      dvd's in possible collate specific ways.
>
> 3) optimized allocation, caching some of the work.  This is used
>    where one query may generate large number of rows - for instance
>    hash table scan and sorter calls.  Here the idea is to do some
>    part of the work once leaving an InstanceGetter which then can
>    repeatedly give back new objects in the most optimized way:
>
>        called once:
>        InstanceGetter = Monitor.classFromIdentifier(format id)
>
>        called many times:
>        dvd = InstanceGetter.getNewInstance()
>
>    o something like the following would be the direct conversion.  Note
>      that implementation of the Instance getter is probably more complex
>      now.  It can't just remember a single class and call new instance
>      on it.  It has to cache some info on what class to create and what
>      collation to set in it.
>
>        called once
>        DataValueFactory dvf = somehow cache and pass this around store;
>        InstanceGetter =
>            dvf.instanceGetterFromIdentifiers(format id, collation id)
>
>        called many times:
>        dvd = InstanceGetter.getNewInstance()
>
>      again at this point dvd can be used to correctly compare against other
>      dvd's in possible collate specific ways.
>
> All 3 of these uses have to be replaced to allow store to create
> "correct" types which can be used in possible string comparisons.
>
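
The factory-based replacements for uses 1) and 3) quoted above could be
sketched roughly as follows.  This is an illustration under stated
assumptions, not the Derby API: the format and collation id constants, the
Dvd, IntValue and CharValue types, and the bodies of newInstance() and
instanceGetterFromIdentifiers() are all hypothetical.  Only the method
names and the InstanceGetter.getNewInstance() call shape come from the
discussion.

```java
import java.text.Collator;
import java.util.Locale;

public class CollationAwareFactorySketch {

    // Hypothetical ids, chosen only for this example.
    static final int FORMAT_INT = 1;
    static final int FORMAT_CHAR = 2;
    static final int COLLATION_BASIC = 0;
    static final int COLLATION_TERRITORY = 1;

    /** Hypothetical stand-in for DataValueDescriptor. */
    interface Dvd {}

    static final class IntValue implements Dvd {}

    static final class CharValue implements Dvd {
        final Collator collator; // null means plain, non-collated comparison
        CharValue(Collator collator) { this.collator = collator; }
    }

    /** Same call shape as the InstanceGetter mentioned in the thread. */
    interface InstanceGetter {
        Dvd getNewInstance();
    }

    /** Non-static factory that store would cache and pass around. */
    static final class DataValueFactory {
        private final Collator territoryCollator;

        DataValueFactory(Locale territory) {
            this.territoryCollator = Collator.getInstance(territory);
        }

        private Collator collatorFor(int collationId) {
            return collationId == COLLATION_TERRITORY ? territoryCollator : null;
        }

        private static Dvd create(int formatId, Collator collator) {
            switch (formatId) {
                case FORMAT_INT:
                    return new IntValue(); // ignores collation entirely
                case FORMAT_CHAR:
                    return new CharValue(collator);
                default:
                    throw new IllegalArgumentException("unknown format id " + formatId);
            }
        }

        /** Use 1: one call, one value allocation, collation already set. */
        Dvd newInstance(int formatId, int collationId) {
            return create(formatId, collatorFor(collationId));
        }

        /** Use 3: resolve the collator once, then hand out values repeatedly. */
        InstanceGetter instanceGetterFromIdentifiers(int formatId, int collationId) {
            Collator collator = collatorFor(collationId); // cached in the getter
            return () -> create(formatId, collator);
        }
    }

    public static void main(String[] args) {
        DataValueFactory dvf = new DataValueFactory(Locale.FRENCH);

        Dvd single = dvf.newInstance(FORMAT_CHAR, COLLATION_TERRITORY);

        InstanceGetter getter =
            dvf.instanceGetterFromIdentifiers(FORMAT_CHAR, COLLATION_TERRITORY);
        Dvd first = getter.getNewInstance();
        Dvd second = getter.getNewInstance();

        System.out.println(single.getClass().getSimpleName());
        System.out.println("distinct objects: " + (first != second));
    }
}
```

The getter here caches both the format id and the resolved collator, which
is the extra state the quoted text says a collation-aware InstanceGetter
would have to remember.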