From: Dag H. Wanvik <Dag.Wanvik@Sun.COM>
To: Derby Discussion <firstname.lastname@example.org>
Sent: Monday, May 4, 2009 5:19:15 AM
Subject: Re: The result offset and fetch first clauses
Geoff hendrey <email@example.com
> 1) can I use this feature even if I am ordering on multiple columns?
As Tiago said, the feature works on all result sets, including
> 2) what are the performance implications for users of the embedded
> driver? In particular, with the embedded driver I am hoping that
> this feature allows portions of a result set to be retrieved without
> the overhead of retrieving the entire result set. For example, if I
I am afraid that with embedded driver, you will only save a little CPU
(by avoiding some JDBC calls) since under the hood, the code siphons
off the rows till it hits the offset, so if you have a large offset,
you will still incur reading of those rows (modulo page caching). In
client/server driver context the savings are
larger, of course, in
that fewer rows are sent over the wire. For simple queries that can
use an index, the optimizer could make use of the offset information
to avoid reading the entire row when skipping rows before offset, just
counting rows in the index to get to the first qualifying row, but
this optimization is not yet implemented.
Often, this feature is used together with ORDER BY which would entail
some sorting of the result set and then all the rows would have to be
read anyway. Again, for some simple queries, sort avoidance is used by
the optimizer, so optimization is still possible for for such queries.
If you think this optimization is an important capability feel free to
file an improvement issue for it.
Any rows after the FETCH n rows are not processed for simple
queries. For complex queries they might be read to perform sorting.
> If (2) is not efficient, how does it compare to
the efficiency of the following approach:
> Get the result set. Use a loop to increment integer n by PAGE_SIZE,
> and inside the loop use ResultSet.absolute(n) combined with
> stmt.setFetchSize(1) to retrieve a "marker" row that signifies the
> begining of each "page" of the result set. I use the primary keys of
> these "markers" as page boundaries so that my web application can
> provide links to a set of pages evenly distributes throughout the
> result set.
This sounds like a resonable approach to me, when any concurrent
insert/deletes concerns are addressed, or are irrelevant.