cayenne-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrus Adamchik <and...@objectstyle.org>
Subject Re: paged query slow when fetching big lists
Date Fri, 22 Jun 2007 15:10:15 GMT
Hi Marcin,

> * fetch only Pk columns and create all ObjectIds at once, get rid  
> of the iterating process if possible
> * use already existing method resolveInterval() to fault the  
> required range of records

This strategy was discussed in the May thread with Ari (the one that  
Michael Gentry mentioned). My vote is +0, meaning that before we make  
this change, I want to confirm first that it has a visible impact on  
performance. Could you possibly make such change locally and see if  
it helps? (Look at SelectQuery.addCustomDbAttribute() to only include  
PK; if you have problems making this change, ping me via the dev list  
- I'll try my best to help).


> If the creation of ObjectId and getting the results from ResultSet  
> cannot be speed up (because it simply has to happen, and it does  
> not depend on the way it is done), the only choice will be to  
> implement some more complex solution using sql LIMIT statement.

I'd love to avoid that, as the data you get may as well be different  
the next time you resolve a page, so you may end up with duplicates  
or skipped records. If we ever go this way, we'll probably need to  
make it a user choice (use LIMIT vs. IncrementalFaultList).

Andrus



On Jun 22, 2007, at 2:35 AM, Marcin Skladaniec wrote:

> Hi
> Recently we have found that fetching a list of 100,000 records  
> using ROP with paging and no cache takes a long time, about 50  
> seconds in our case. We have profiled the cpu usage and the result  
> shows that 99% of time is spent in IncrementalFaultList, within the  
> fillIn() method.
>
> The fillIn method works (in my opinion) in a bit strange fashion:  
> it does execute query at once, stores the query result in  
> java.sql.ResultSet, and than iterates through the result either  
> creating the whole DataRow or just ObjectId. If there is a need the  
> DataRows are faulted at the end of the method.
> From our testing it came up that this bit of code :
>
> while (it.hasNextRow()) {
> 	elements.add(it.nextObjectId(entity));
> }
>
> is where all the time is spent. Each iteration in this loop takes  
> about 0.5ms, which multiplied by 100,000 takes almost 50 seconds.
> nextObjectId method consists of two parts: fetching the next result  
> from ResultSet and creating a ObjectId, but I was unable to check  
> which one takes the most time, anyway I think that this approach is  
> somewhat wrong, since always 99% of the records will be fetched as  
> ObjectId and never faulted, so my ideas to enhance this are:
> * fetch only Pk columns and create all ObjectIds at once, get rid  
> of the iterating process if possible
> * use already existing method resolveInterval() to fault the  
> required range of records
> If the creation of ObjectId and getting the results from ResultSet  
> cannot be speed up (because it simply has to happen, and it does  
> not depend on the way it is done), the only choice will be to  
> implement some more complex solution using sql LIMIT statement.
>
> I would like to mention that we are using some DataContext  
> decorators and life-cycle callbacks, but I don't believe those are  
> important factors in this case.
>
> Whatever is the solution, i think it is pretty crucial that it will  
> be implemented soon, since the usability of the ROP without fast  
> paging is rather low.
>
> With regards
> Marcin
> -------------------------->
> ish
> http://www.ish.com.au
> Level 1, 30 Wilson Street Newtown 2042 Australia
> phone +61 2 9550 5001   fax +61 2 9550 4001
>
>
>


Mime
View raw message