incubator-jena-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Paolo Castagna (JIRA)" <>
Subject [jira] [Created] (JENA-89) Avoid a total sort for ORDER BY + LIMIT queries
Date Mon, 08 Aug 2011 08:38:27 GMT
Avoid a total sort for ORDER BY + LIMIT queries

                 Key: JENA-89
             Project: Jena
          Issue Type: Improvement
          Components: ARQ
            Reporter: Paolo Castagna
            Assignee: Paolo Castagna
            Priority: Minor

In case of SPARQL queries with ORDER BY + LIMIT, ARQ sorts the entire result set and then
produces the first N, according to the specified LIMIT.
As an alternative, we can use a [PriorityQueue|]
(which is based on a priority heap) to avoid a sort operation.

ARQ's algebra package contains already a [OpTopN|]
operator. The [OpExecutor|]
will need to use a new QueryIterTopN instead of QueryIterSort + QueryIterSlice. A new TransformOrderByLimit
to be used by Optimize is also necessary.

ORDER BY + LIMIT queries are typically used to construct the first page when results are paginated.
Then the following query is ORDER BY + OFFSET + LIMIT. (Often users stop at the first page).
Ideally, we could cache the ORDER BY and implement the OFFSET|LIMIT using results from the
cache. However, the improvement described by this issue is limited to the ORDER BY + LIMIT
case for which a priority heap is a good enough solution.

Hopefully, this would improve the scalability of ORDER BY + LIMIT queries in case of small
values specified on the LIMIT. 

This message is automatically generated by JIRA.
For more information on JIRA, see:


View raw message