incubator-jena-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andy Seaborne (JIRA)" <>
Subject [jira] [Updated] (JENA-89) Avoid a total sort for ORDER BY + LIMIT queries
Date Sat, 13 Aug 2011 16:58:27 GMT


Andy Seaborne updated JENA-89:

    Attachment:     (was: JENA-89-TopTests.patch)

> Avoid a total sort for ORDER BY + LIMIT queries
> -----------------------------------------------
>                 Key: JENA-89
>                 URL:
>             Project: Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Paolo Castagna
>            Assignee: Paolo Castagna
>            Priority: Minor
>              Labels: arq, optimization, scalability, sparql
>         Attachments: ARQ_JENA-89_r1156212.patch, JENA-89-Top-FixHeapCollation.patch,
> In case of SPARQL queries with ORDER BY + LIMIT, ARQ sorts the entire result set and
then produces the first N, according to the specified LIMIT.
> As an alternative, discussed on jena-dev [1], we can use a PriorityQueue [2] (which is
based on a priority heap) to avoid a sort operation.
> ARQ's algebra package contains already a OpTopN [3] operator. The OpExecutor [4] will
need to use a new QueryIterTopN instead of QueryIterSort + QueryIterSlice. A new TransformOrderByLimit
to be used by Optimize is also necessary.
> ORDER BY + LIMIT queries are typically used to construct the first page when results
are paginated. Then the following query is ORDER BY + OFFSET + LIMIT. (Often users stop at
the first page). Ideally, we could cache the ORDER BY and implement the OFFSET|LIMIT using
results from the cache. However, the improvement described by this issue is limited to the
ORDER BY + LIMIT case for which a priority heap is a good enough solution.
> Hopefully, this would improve the scalability of ORDER BY + LIMIT queries in case of
small values specified on the LIMIT. 
>   [1]
>   [2]
>   [3]
>   [4]

This message is automatically generated by JIRA.
For more information on JIRA, see:


View raw message