incubator-jena-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paolo Castagna <>
Subject SPARQL queries with ORDER BY + LIMIT
Date Fri, 05 Aug 2011 14:33:24 GMT
when someone submits a SPARQL query which contains an ORDER BY + LIMIT,
ARQ sorts the entire and then takes the first N, according to the specified

If the ORDER BY operates over a large number of bindings, these sort of
queries can become a problem. We see ORDER BY queries with small LIMITs
and often enough no OFFSET specified.

Fortunately, in the ARQ algebra package there is already a OpTopN [1]
operator, which currently get executed as a sort followed by a slice,
see OpExecutor [2]:

  protected QueryIterator execute(OpTopN opTop, QueryIterator input)
      // XXX Do better; execute - OpTopN
      QueryIterator qIter = executeOp(opTop.getSubOp(), input) ;
      qIter = new QueryIterSort(qIter, opTop.getConditions(), execCxt) ;
      qIter = new QueryIterSlice(qIter, 0, opTop.getLimit(), execCxt) ;
      return qIter ;

The comment suggests something better is possible. :-)

We could avoid the sort, perhaps using a heap [3] and, in Java, a
PriorityQueue [4,5]. We would need to add a QueryIterTopN and use it
instead of the QueryIterSort + QueryIterSlice.

 "It is a common misconception that a priority queue is a heap.
  A priority queue is an abstract concept like "a list" or "a map";
  just as a list can be implemented with a linked list or an array,
  a priority queue can be implemented with a heap or a variety of
  other methods." [4]


  A Java PriorityQueue is "an unbounded priority queue based on a
  priority heap." [5]

The other thing to do is to spot an ORDER BY + LIMIT (with a sufficiently
small LIMIT?) and use the OpTopN instead of the OpOrder and OpSlice.
This would need to be done in a new TransformOrderByLimit or TransformTopN
transformation to be used by (in which position?).

Does this seem a reasonable thing to do and a reasonable way to do it?



View raw message