db-ojb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lance Eason" <lance.ea...@whisperwire.com>
Subject RE: Some performance tuning changes
Date Tue, 13 May 2003 17:17:10 GMT
Thomas,

Just to be clear my change 1) doesn't involve switching entirely to indexes instead of names.
 I very much think OJB does the right thing already using names so that columns aren't required
to appear in a certain order within the SQL statements.  All my change does is recognize that
within a given ResultSet represented by a RsIterator the ordering of columns isn't going to
change from row to row so there's no reason to lookup columns by name on each individual row.
 Instead you can convert the names to indices at the beginning of the iteration and then on
a row by row basis use the indices which is much faster.

You bring up a good point about the dynamic nature of the metadata though and reading through
the code it wasn't entirely clear to me which things were supposed to be able to change dynamically
and which things weren't in regards to fields.  First off is simply the fact that ClassDescriptor
only allows you to add field descriptors not replace or remove existing field descriptors.
 Also within ClassDescriptor it keeps some cached arrays/lists of FieldDescriptors (primary
key fields, locking fields, etc.) that weren't being recalculated when new field descriptors
were added (this is another of the changes that I made).  Finally the method that RowReaderDefaultImpl
uses to get fields, DescriptorRepository.getFieldDescriptorsForMultiMappedTable, keeps a cached
list of field descriptors for the class that never seems to be updated as ClassDescriptors
change.  Given all this it wasn't clear to me if the addFieldDescriptor was intended solely
to be a builder type method used during construction of the FieldDescriptor or if it was really
intended to support dynamic changes.

My change right now does depend on the current behavior of getFieldDescriptorsForMultiMappedTable,
that it will always return the same list of FieldDescriptors for a given ClassDescriptor,
i.e. it never changes dynamically.  If this isn't the intended behavior then I think there's
still the opportunity for remembering indices but the implementation would have to change
slightly, basically adding the list of FieldDescriptors to that 'cookie' state that's being
handed back to the RsIterator in addition to the column indices themselves.

later,
- Lance

-----Original Message-----
From: Mahler Thomas [mailto:thomas.mahler@itellium.com]
Sent: Tuesday, May 13, 2003 4:39 AM
To: 'OJB Developers List'
Subject: RE: Some performance tuning changes


Hi all,

thanks Armin for looking at this stuff. Unfortuantely I'm currently very
busy with a real-world project and won't find the the time to review Lance's
patches.

But from reading through his mail I feel that they contain some important
improvements.

The only thing I'm having problems with is step 1).
A long time ago we dropped indexed access and started to use access by name
as we had a lot of problems with dynamic modification of class- and
field-descriptors.
If you remove a fielddescriptor and use indexed access you are lost!
So any improvement in this area must take care nit to compromise dynamic
changes to the metaddata layer!

cheers,
Thomas

> -----Original Message-----
> From: Armin Waibel [mailto:armin.waibel@code-au-lait.de]
> Sent: Monday, May 12, 2003 8:37 PM
> To: OJB Developers List
> Subject: Re: Some performance tuning changes
> 
> 
> Hi Lance,
> 
> thanks for your excellent improvements and the
> detailed description.
> Intermediate result:
> 
> [* ??? not yet done]
> 
> 1) RowReaderDefaultImpl retrieves values from the....
> * ???
> 
> 2) Three methods on ClassDescriptor...
> - OK, is done
> 
> 3) ObjectCacheDefaultImpl and ObjectCachePerBrokerImpl both call
> toString() on the Identity....
> - OK, is done
> 
> 4) LoggerFactoryImpl.getLogger concatenates a bunch of Strings ....
> - Curently the used BootLogger instance (PoorMansLoggerImpl)
> does not support isDebugEnabled/isEnabled methods, thus this
> does not increase performance. Further on the log-statements
> were called once per class, after it the logger was cached.
> 
> 5, 6 and 7) PersistenceBrokerImpl is creating iterators to 
> iterate ....
> * ???
> 
> 8) SqlHelper.cleanPath calls splitPath
> * ???
> 
> 9) PersistenceBrokerAbstractImpl manages firing off events during the
> lifecycle ...
> - OK, is done
> 
> 10) ObjectReferenceDescriptor.getItemProxyForClass uses whether the
> proxy ...
> * ???
> 
> regards,
> Armin
> 
> ----- Original Message -----
> From: "Lance Eason" <lance.eason@whisperwire.com>
> To: "OJB Dev Mailing List (E-mail)" <ojb-dev@db.apache.org>
> Sent: Monday, May 12, 2003 8:00 AM
> Subject: Some performance tuning changes
> 
> 
> I've been profiling my system with some large queries (returning
> thousands of items each).  As a result I ended up making a few code
> changes to tune OJB.  All told they improve OJB's performance by about
> 20% and reduce object creation by about the same for the 
> scenarios I've
> tested, nothing earth-shattering but every bit helps.  Only two of the
> changes required any interface changes, but one of those is also the
> single most significant change.
> 
> Could some committer review and add them?  All changes are 
> made against
> the latest from CVS as of this afternoon.  Because of mailer 
> limitations
> I'm sending the files as a follow-up to this message.
> 
> 
> The changes I made, from most significant to least significant:
> 
> 1) RowReaderDefaultImpl retrieves values from the result set 
> via column
> name instead of column index.  This quickly showed up as a hotspot in
> the profiler.  While it's definitely a lot more robust to use the name
> instead of the index there's no reason to pay the overhead of the name
> lookup (which is significant) on each and every row.  The names can be
> converted to indices once up front using 
> ResultSet.findColumn(name) and
> then values can be retrieved by index for each row.  Since the same
> RowReader may service multiple RsIterators at the same time and
> different RsIterators may have different indices for the same column,
> the indices have to be local to the RsIterator somehow.  I went with a
> sort of cookie approach where the RsIterator asks the 
> RowReader to index
> the columns and then hands that index back to the RowReader on the
> readObjectArrayFrom method.  So I added two method signatures to
> RowReader:
> 
>     public int[] index(ResultSet rs);
>     public void readObjectArrayFrom(ResultSet rs, int[] 
> index, Map row);
> 
> and changed RowReaderDefaultImpl, RsIterator and SqlBasedRsIterator
> appropriately.  The other alternative would have been to instantiate a
> RowReader for each RsIterator and keep this information inside the
> RowReader itself, but that instantiation would have had a 
> cost as well.
> I left the original readObjectArrayFrom() signature as well 
> and left the
> PersistenceBroker.getObjectByIdentity() path using it since for single
> item queries there is probably a very small performance 
> advantage to the
> original method.
> 
> Files modified:
>    RowReader - added methods index and readObjectArrayFrom w/ index
>    RowReaderDefaultImpl - implemented index and readObjectArrayFrom w/
> index
>    RsIterator - call RowReader to build index for ResultSet in
> constructor and call new readObjectArrayFrom
>    SqlBasedRsIterator - call RowReader to build index for ResultSet in
> constructor
> 
> 
> 
> 2) Three methods on ClassDescriptor, getFieldDescriptorByName,
> getObjectReferenceDescriptorByName, and getCollectionDescriptorByName
> all have the same fundamental problem.  They each try to build a
> descriptor by name map lazily.  To do this they look for the 
> name in the
> map and if they don't find it scan through all the descriptors and if
> found add it to the map.  The problem with this is that queries for
> things where the descriptor doesn't exist always end up scanning the
> entire set of descriptors in order to determine that it doesn't exist.
> In other words this strategy works fine for positive queries 
> but is very
> poor for negative queries.  It turns out that a fair number 
> of negative
> queries do occur though for things like the field descriptor for
> OJB_CONCRETE_CLASS.  I also fixed a bug where ClassDescriptor doesn't
> rebuild it's cached lists of FieldDescriptors when new field 
> descriptors
> are added and I cached the result of isAbstract.
> 
> Files modified:
>    ClassDescriptor
>       - changed map building strategy in 
> getXXXDescriptorByName methods
>       - changed setClassOfObject and isAbstract to cache value of
> isAbstract
>       - changed addFieldDescriptor to clear out cached lists of field
> descriptors and clear the name map
>       - changed addCollectionDescriptor and 
> addObjectReferenceDescriptor
> to clear their respective name maps
> 
> 
> 
> 3) ObjectCacheDefaultImpl and ObjectCachePerBrokerImpl both call
> toString() on the Identity passed in and use that as the key to their
> internal maps.  In addition Identity calls toString() on itself to
> calculate it's hashCode.  Converting the Identity objects to Strings
> turns out to be much more costly than simply comparing their contents.
> Compounding matters is the fact that a lot of Identity objects get
> constructed solely to do cache lookups and then are immediately
> discarded but they get unnecessarily converted to String in 
> the process
> at extra cost in processing and object creation.  I changed the two
> caches to use the Identity object itself as the key and changed
> Identity.hashCode() to calculate the hashCode directly rather than
> converting to a String.
> 
> Files modified:
>    Identity - changed hashCode() method
>    ObjectCacheDefaultImpl - removed all the toString()'s on Identity
> objects
>    ObjectCachePerBrokerImpl - ditto
> 
> 
> 
> 4) LoggerFactoryImpl.getLogger concatenates a bunch of 
> Strings for debug
> messages without checking to see if debug verbosity is enabled.  I
> protected all the calls with an isEnabledFor(Logger.DEBUG) check.
> 
> Files modified:
>    LoggerFactoryImpl - check isEnabledFor(Logger.DEBUG) before doing
> String concatenations
> 
> 
> 
> 5, 6 and 7) PersistenceBrokerImpl is creating iterators to iterate
> through Vectors returned from ClassDescriptor methods.  I 
> changed it to
> get elements by index instead (which is significantly faster).  Even
> better would be if the ClassDescriptor methods returned arrays of the
> appropriate type instead of Vectors (this would be more 
> consistent with
> the rest of the ClassDescriptor interface as well as performing better
> and being more type-safe), but this would've required an interface
> change to ClassDescriptor.
> 
> Also a lot of logger.debug messages were added to 
> getRsIteratorFromQuery
> that are doing String concatenations.  I protected all these 
> calls with
> logger.isDebugEnabled() checks.
> 
> When returning a collection proxy PersistenceBrokerImpl/ProxyHelper
> always looks up the constructor again each time though the 
> answer never
> changes.  I changed it to remember the constructors and use those
> instead of looking up the constructors from the classes each time.
> 
> Files modified:
>    PersistenceBrokerImpl - replaced iterators with get elements by
> index, remembered proxy collection constructors and protected
> logger.debug messages
>    ProxyHelper - remember constructors for standard proxy collections
> 
> 
> 
> 8) SqlHelper.cleanPath calls splitPath.  This results in lots of
> PathInfo objects being created needlessly and immediately discarded
> because cleanPath is only interested in the column portion.  I changed
> it to directly calculate the column portion itself and to avoid two
> indexOf operations in both cleanPath and splitPath when unnecessary.
> 
> Files modified:
>    SqlHelper - optimized cleanPath and splitPath
> 
> 
> 
> 9) PersistenceBrokerAbstractImpl manages firing off events during the
> lifecycle of a PersistenceBroker.  There are three opportunities for
> performance improvements here.  First it iterates through a map every
> time an event is fired.  Since listeners are added rarely but 
> events are
> fired frequently it's much better to construct a new array each time a
> listener is added and iterate through an array on event 
> firing.  Second,
> listeners are stored in a map with a Boolean indicating 
> whether they are
> permanent or not.  When the broker is released this map is iterated
> through and all non-permanent listeners are discarded.  It's better to
> keep two separate arrays and just discard the non-permanent 
> array rather
> than doing this iteration.  Finally, a single fireBrokerEvent 
> method is
> provided which takes a PersistenceBrokerEvent.  Internally this method
> does two instanceof operations to decide how to dispatch the event.
> Letting the compiler decide how to dispatch at compile time 
> by providing
> overloaded signatures of fireBrokerEvent is more efficient.  Similar
> things could be done for add and removeListener, but I didn't bother
> since these will be called relatively rarely.
> 
> Files modified:
>    PersistenceBrokerAbstractImpl - changed storage of listeners to
> arrays instead of Maps, added overloaded fireBrokerEvent methods
>    PersistenceBroker - added overloaded fireBrokerEvent methods
>    DelegatingPersistenceBroker - added overloaded fireBrokerEvent
> methods
>    PersistenceBrokerBean - added overloaded fireBrokerEvent methods
> 
> 
> 
> 10) ObjectReferenceDescriptor.getItemProxyForClass uses whether the
> proxy is null or not as a test for whether it's been looked up.  For
> classes that don't have a proxy though the proxy will always 
> be null so
> it will do that lookup again every time the method is called. 
>  I changed
> it to have a boolean indicating whether the proxy had been looked up
> instead.
> 
> Files modified:
>    ObjectReferenceDescriptor - changed getItemProxyForClass to use a
> boolean to indicate whether the proxy has been looked up
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: ojb-dev-unsubscribe@db.apache.org
> For additional commands, e-mail: ojb-dev-help@db.apache.org
> 
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: ojb-dev-unsubscribe@db.apache.org
> For additional commands, e-mail: ojb-dev-help@db.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: ojb-dev-unsubscribe@db.apache.org
For additional commands, e-mail: ojb-dev-help@db.apache.org


Mime
View raw message