jackrabbit-dev mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Re: Query Performance and Optimization
Date Fri, 02 Mar 2007 09:58:04 GMT

On 2/28/07, David Johnson <dbjohnson.e@gmail.com> wrote:
> "select * from Column where jcr:path like 'Gossip/ColumnName/Columns/%' and
> status <> 'hidden' order by publishDate desc" takes 500 ms to execute - this
> is just the execution time, I am not actually using or accessing the
> NodeIterator.

Are you using Jackrabbit 1.2.x? Jackrabbit 1.2 uses lazy loading of
query results, which should considerably reduce query execution time
by moving the effort to the resulting Node- or RowIterator.
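
To illustrate what lazy loading means here, the following is a minimal
sketch in plain Java and not Jackrabbit code. The class LazyResultSketch,
its execute method, and the loader function are hypothetical stand-ins:
"executing" only captures the matching ids, and the expensive load
happens once per call to next().

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a lazily loaded query result; not Jackrabbit code.
class LazyResultSketch {
    // Counts how many items were actually loaded, to make the deferred cost visible.
    static int loads = 0;

    // "Executing" only captures the matching ids; nothing is loaded yet.
    static Iterator<String> execute(List<Integer> matchingIds,
                                    Function<Integer, String> loader) {
        final Iterator<Integer> ids = matchingIds.iterator();
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return ids.hasNext();
            }

            @Override
            public String next() {
                // The expensive load happens here, once per item the caller consumes.
                Integer id = ids.next();
                loads++;
                return loader.apply(id);
            }
        };
    }

    public static void main(String[] args) {
        Iterator<String> it = execute(Arrays.asList(1, 2, 3), id -> "node-" + id);
        System.out.println("loaded after execute: " + loads);
        System.out.println("first item: " + it.next());
        System.out.println("loaded after one next(): " + loads);
    }
}
```

With this shape, timing only the execute call says little about the
total cost; the work shows up when the iterator is consumed.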

In general my rule of thumb so far has been to use the query feature
when you want a narrow selection of nodes from a large source set, and
to use explicit traversal with filtering when the expected result set
includes a considerable percentage of the source set. Ideally the
query feature should never be slower than traversal plus a small
constant query parsing and setup overhead. I don't think we are
there yet.
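
As a rough sketch of the "explicit traversal with filtering" alternative
for the query quoted above, here is a plain Java version over an
in-memory tree. ColumnNode and TraversalSketch are hypothetical stand-ins
and not the JCR Node API; with a real repository the same loop would walk
child nodes and read the status and publishDate properties. The sketch
also assumes every node has a status value.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory stand-in for a content node; not the JCR Node interface.
class ColumnNode {
    final String name;
    final String status;
    final long publishDate;
    final List<ColumnNode> children = new ArrayList<>();

    ColumnNode(String name, String status, long publishDate) {
        this.name = name;
        this.status = status;
        this.publishDate = publishDate;
    }
}

class TraversalSketch {
    // Depth-first walk below 'parent', keeping every descendant that is not hidden.
    static void collectVisible(ColumnNode parent, List<ColumnNode> out) {
        for (ColumnNode child : parent.children) {
            if (!"hidden".equals(child.status)) {
                out.add(child);
            }
            collectVisible(child, out);
        }
    }

    // Equivalent in spirit to: status <> 'hidden' order by publishDate desc.
    static List<ColumnNode> visibleNewestFirst(ColumnNode root) {
        List<ColumnNode> out = new ArrayList<>();
        collectVisible(root, out);
        out.sort(Comparator.comparingLong((ColumnNode n) -> n.publishDate).reversed());
        return out;
    }
}
```

The traversal touches every node under the root, so it only pays off
when most of those nodes end up in the result anyway.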

> Digging into the internals of Jackrabbit, we have noticed that there is an
> implementation of RangeQuery that essentially walks the results if the # of
> query terms is greater than what Lucene can handle.  Reading the Lucene
> documentation, it looks like Filters are the recommended method of
> implementing "large" range queries, and also seem like a natural for
> matching node types - i.e., select * from Column

I'm not familiar enough with Lucene details to comment on whether
Filters would cover everything we need. It would be great if you're
interested in pursuing such alternatives!
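
For reference, the idea behind a filter-based range match can be
sketched in plain Java as below. This is not Lucene or Jackrabbit code:
RangeFilterSketch and the TreeMap-based "index" are hypothetical
stand-ins. Instead of expanding the range into one query clause per
matching term, the filter walks the sorted terms inside the range once
and marks the matching documents in a bit set.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a range filter over a toy inverted index.
class RangeFilterSketch {
    // postings maps each term to the ids of the documents that contain it.
    static BitSet rangeBits(TreeMap<String, int[]> postings,
                            String lower, String upper, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        // subMap with inclusive bounds visits only the terms in [lower, upper],
        // so the cost grows with the number of terms, not with a clause limit.
        for (Map.Entry<String, int[]> e
                : postings.subMap(lower, true, upper, true).entrySet()) {
            for (int doc : e.getValue()) {
                bits.set(doc);
            }
        }
        return bits;
    }
}
```

The resulting bit set can be intersected with other per-document bit
sets, for example one marking all nodes of a given node type.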

> Is there any ongoing work on query optimization and performance?  We would
> be very interested in such work, including offering any help that we can.

Not apart from the recent lazy loading improvements. Any help would be
much appreciated.


Jukka Zitting
