openjpa-users mailing list archives

From Adam Hardy <adam....@cyberspaceroad.com>
Subject Re: Performance with large lists
Date Fri, 18 Jun 2010 18:27:46 GMT
Thanks for the tip. I'll give it a try. What sort of performance do you get? Do
you do thousands in one submit?



C N Davies on 18/06/10 14:17, wrote:
> Not sure if you have the same issue, but I found that breaking my commits down
> to 10 records per commit and calling em.clear() after each commit improved my
> performance and fixed a lot of detached entity issues.
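[Editor's note: the commit-every-N-records-then-clear pattern described above can be sketched generically. This is a sketch, not code from the thread: the `persist` and `flushAndClear` callbacks are stand-ins for `em.persist(record)` and for the JPA sequence `em.getTransaction().commit(); em.clear(); em.getTransaction().begin()`, and all names are illustrative.]

```java
import java.util.List;
import java.util.function.Consumer;

public class BatchCommit {
    /**
     * Persists records in batches: after every batchSize records, run
     * flushAndClear (in JPA terms: commit the transaction, call em.clear()
     * to detach everything and keep the persistence context small, then
     * begin a new transaction). Returns the number of flushes performed.
     */
    static <T> int persistInBatches(List<T> records, int batchSize,
                                    Consumer<T> persist, Runnable flushAndClear) {
        int count = 0;
        int flushes = 0;
        for (T record : records) {
            persist.accept(record);           // em.persist(record) in JPA
            if (++count % batchSize == 0) {
                flushAndClear.run();          // commit + clear + begin
                flushes++;
            }
        }
        if (count % batchSize != 0) {         // commit the final partial batch
            flushAndClear.run();
            flushes++;
        }
        return flushes;
    }
}
```

For example, 25 records with a batch size of 10 yields three flushes: two full batches and one trailing partial batch.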
> 
> 
> 
> -----Original Message-----
> From: Adam Hardy [mailto:adam.sql@cyberspaceroad.com]
> Sent: Friday, 18 June 2010 10:11 PM
> To: users@openjpa.apache.org
> Subject: Re: Performance with large lists
> 
> I have a transaction that submits around 20K records, followed by another 
> transaction in the user process which inserts the same into a related table.
> 
> The performance issues I talked about below initially caused processing times
> of 45 mins, but I tuned the Java code and tweaked the MySQL database and
> reduced it to 15 mins.
> 
> This is still a problem, though - I've been through the optimization
> guidelines in the documentation and there's nothing there that I'm not
> already doing.
> 
> The first transaction I mentioned takes 5 mins, but the second takes 15 mins
> and is inserting child records of the records created in the first
> transaction. It looks like OpenJPA is fetching all of those first records
> again. Shouldn't they already be in memory?
> 
> 
> Thanks Adam
> 
> Adam Hardy on 12/06/10 13:21, wrote:
>> I am trying to get a handle on what I should be able to achieve.
>> 
>> Can someone give me some idea of the metrics I should be able to get 
>> optimistically when persisting an object that has a child list with 20,000
>> child objects and 20,000 grandchildren? (one-to-one child -> grandchild)
>> 
>> Can I reasonably expect to get this done in under a minute?
>> 
>> I think that would work out at a rate of about 1.5 milliseconds per object.
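[Editor's note: the arithmetic behind that estimate checks out - 20,000 children plus 20,000 grandchildren is 40,000 objects, and persisting them within a one-minute budget works out to 1.5 ms per object:]

```java
public class ThroughputEstimate {
    // Time budget in milliseconds divided evenly across all objects persisted.
    static double msPerObject(int objects, double budgetMs) {
        return budgetMs / objects;
    }

    public static void main(String[] args) {
        int objects = 20_000 + 20_000;   // children + grandchildren
        double budgetMs = 60_000.0;      // one minute
        System.out.println(msPerObject(objects, budgetMs)); // prints 1.5
    }
}
```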
>> 
>> 
>> Thanks Adam
>> 
>> Adam Hardy on 11/06/10 17:34, wrote:
>>> I have performance problems with large lists of beans due to the base 
>>> class I am using for my entities.
>>> 
>>> This is slightly non-OpenJPA specific, so I hope nobody minds, but it is
>>> Friday afternoon so I'm hoping you give me a bit of slack here.
>>> 
>>> The problem arises when I start building lists with over 10,000 items on
>>> a parent class.
>>> 
>>> The trouble is in the base class for the entities, which is quite clever
>>> (but obviously not clever enough): it has non-trivial equals() and
>>> hashCode() implementations that use reflection, and this is where the
>>> slow-down occurs.
>>> 
>>> When I link the child with a parent that already has 10,000 children, the
>>> equals() method is called by ArrayList before the new child is placed in
>>> the index.
>>> 
>>> As far as I can tell I have a couple of options.
>>> 
>>> (1) ditch the reflection-based equals method and hard-code an equals 
>>> method.
>>> 
>>> (2) don't use ArrayList but find a Collection-based class that uses 
>>> hashes or similar to identify items instead of equals. This is just 
>>> speculation - perhaps there is no such thing or it wouldn't help anyway:
>>> 
>>> - would a collection using hashes cache the hashes of the items already
>>> indexed?
>>> - would such a collection be persistable?
>>> 
>>> If anyone has been in this situation before, or has an idea about it,
>>> I'd really appreciate the help.
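[Editor's note: for option (1), a minimal sketch of replacing the reflective equals()/hashCode() with hand-coded versions keyed on a business identifier. The Child class and its naturalId field are hypothetical examples, not from the thread:]

```java
import java.util.Objects;

public class Child {
    private final String naturalId; // hypothetical business key

    public Child(String naturalId) {
        this.naturalId = naturalId;
    }

    // Hard-coded equals(): compares only the business key, no reflection.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Child)) return false;
        return Objects.equals(naturalId, ((Child) o).naturalId);
    }

    // A consistent hashCode() also makes the class safe in hash-based
    // collections, which is what option (2) needs.
    @Override
    public int hashCode() {
        return Objects.hashCode(naturalId);
    }
}
```

On the two questions in option (2): a HashSet does effectively cache hashes - the backing java.util.HashMap stores each entry's computed hash, so membership checks compare stored hashes before ever calling equals(), making contains() roughly O(1) versus ArrayList's O(n) equals() scan. And JPA providers, OpenJPA included, can map java.util.Set-typed relationship fields, so a hash-based collection is generally persistable.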
> 
> 

