openjpa-users mailing list archives

From Adam Hardy <adam....@cyberspaceroad.com>
Subject Re: Performance with large lists
Date Fri, 18 Jun 2010 12:11:18 GMT
I have a transaction that submits around 20K records, followed by another 
transaction in the user process which inserts the same into a related table.

The performance issues I talked about below initially caused processing times of 
45 mins, but I reworked the Java code and tuned the MySQL database, which 
reduced it to 15 mins.

This is still a problem, though. I've been over the optimization guidelines in 
the documentation and there's nothing there that I can implement that I'm not 
already doing.

The first transaction I mentioned takes 5 mins, but the second takes 15 mins and 
is inserting child records of the records created in the first transaction. It 
looks like OpenJPA is fetching all of those first records again. Shouldn't they 
already be in memory?


Thanks
Adam

Adam Hardy on 12/06/10 13:21, wrote:
> I am trying to get a handle on what I should be able to achieve.
> 
> Can someone give me some idea of the metrics I should be able to get 
> optimistically when persisting an object that has a child list with 
> 20,000 child objects and 20,000 grandchildren? (one-to-one child -> 
> grandchild)
> 
> Can I reasonably expect to get this done in under a minute?
> 
> I think that would work out at a rate of about 1.5 milliseconds per object.
> 
> Thanks
> Adam
> 
> Adam Hardy on 11/06/10 17:34, wrote:
>> I have performance problems with large lists of beans due to the base 
>> class I am using for my entities.
>>
>> This is slightly non-OpenJPA specific, so I hope nobody minds, but it 
>> is Friday afternoon so I'm hoping you'll give me a bit of slack here.
>>
>> The problem arises when I start building lists with over 10,000 items 
>> on a parent class.
>>
>> The trouble is in the base class for the entities, which is quite 
>> clever (but obviously not clever enough) and it has non-trivial 
>> equals() and hashCode() algorithms which make use of reflection. It's 
>> here that the slow-down comes.
>>
>> When I link the child with a parent that already has 10,000 children, 
>> the equals() method is called by ArrayList before the new child is 
>> placed in the index.
>>
>> As far as I can tell I have a couple of options.
>>
>> (1) ditch the reflection-based equals method and hard-code an equals 
>> method.
>>
>> (2) don't use ArrayList but find a Collection-based class that uses 
>> hashes or similar to identify items instead of equals. This is just 
>> speculation - perhaps there is no such thing or it wouldn't help anyway:
>>
>>    - would a collection using hashes cache the hashes of the items 
>> already indexed?
>>    - would such a collection be persistable?
>>
>> If anyone has been in this situation before, or has an idea about it, 
>> I'd really appreciate the help.
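
The two options in the oldest message above can be sketched together. This is only an illustration under assumptions: the Child class, its parentId and code fields, and the idea that they form a stable business key are all hypothetical and not taken from the original entities. The sketch pairs a hand-written equals()/hashCode() (option 1) with a hash-based collection (option 2), and counts how many equals() calls one membership check costs against 10,000 existing children.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class EqualsCostSketch {

    /** Hypothetical entity; class and field names are illustrative only. */
    static final class Child {
        static long equalsCalls = 0;   // counts equals() invocations for the demo

        private final long parentId;   // assumed part of a stable business key
        private final String code;     // assumed unique within one parent

        Child(long parentId, String code) {
            this.parentId = parentId;
            this.code = code;
        }

        // Hand-written comparison: no reflection, just two field checks.
        @Override
        public boolean equals(Object other) {
            equalsCalls++;
            if (this == other) return true;
            if (!(other instanceof Child)) return false;
            Child that = (Child) other;
            return parentId == that.parentId && code.equals(that.code);
        }

        // Must agree with equals(): built from the same fields.
        @Override
        public int hashCode() {
            return Objects.hash(parentId, code);
        }
    }

    /** Returns how many equals() calls a single contains() check triggers. */
    static long equalsCallsForLookup(Collection<Child> children, Child probe) {
        Child.equalsCalls = 0;
        children.contains(probe);
        return Child.equalsCalls;
    }

    public static void main(String[] args) {
        List<Child> list = new ArrayList<>();
        Set<Child> set = new LinkedHashSet<>();   // hash-based, keeps insertion order
        for (int i = 0; i < 10_000; i++) {
            Child c = new Child(1L, "c" + i);
            list.add(c);
            set.add(c);
        }
        Child probe = new Child(1L, "new");       // not present in either collection
        System.out.println("ArrayList equals() calls:     " + equalsCallsForLookup(list, probe));
        System.out.println("LinkedHashSet equals() calls: " + equalsCallsForLookup(set, probe));
    }
}
```

A failed ArrayList.contains() compares the probe against every element, so the cost of each equals() call is multiplied by the list size. A hash-based set only calls equals() on elements whose hash matches the probe's. On the caching question: java.util.HashMap, which backs HashSet and LinkedHashSet, stores each entry's hash next to the key, so elements already in the set are not rehashed on lookup. On persistability: JPA allows java.util.Set as the type of a collection-valued relationship, though whether it suits these entities depends on the mapping, and the fields used in hashCode() must not change while the object is in the set.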

