db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Daniel John Debrunner <...@debrunners.com>
Subject Re: derby performance and 'order by'
Date Mon, 19 Sep 2005 17:46:14 GMT
Suavi Ali Demir wrote:

> Actually, it sounds like the problem of finding top 1000 rows out of 
> 166333 rows is different than sorting 166333 rows and maybe it could be
> optimized. There is no need to sort all 166333 but the information that
> we are only looking 1000 rows would have to be passed all the way down
> to the point where Derby decides to sort. I have not thought through the
> details of an algorithm but when nRows we want is substantially smaller
> than TotalRows then i just feel there should be a better way to pick
> those nRows. For example, if nRows were 1, then all we had to do would
> be 1 single pass on 166333 rows to find the max. That is quite different
> than sorting all and this idea should be possible to generalize on
> 1<=nRows<TotalRows.

One optimization would be to pass the 1,000 down from
Statement.setMaxRows to the sorter. Then the sorter could keep the sort
set at 1000 rows, discarding any rows moved to or inserted at the end.
This would most likely enable the sorted set to remain in memory, rather
than spilling to disk. Even more likley if the application sets a max
rows to a reasonable number to be views in a single on-screen page (e.g.
20-50).

Or an even wilder idea would be to have a predicate that is modified on
the fly, to represent the maximum once the sorted set reaches 1,000.
E.g. an additional predicate of <= MAX_INT for an INT column.

Dan.


Mime
View raw message