jackrabbit-users mailing list archives

From Alexander Klimetschek <aklim...@day.com>
Subject Re: How to avoid sequential scans in queries
Date Fri, 15 Oct 2010 10:25:05 GMT
On Fri, Oct 15, 2010 at 12:18, hwellmann <harald.wellmann@multi-m.de> wrote:
> I've started with a simple example, creating a repository with 50000 nodes

Are these nodes directly under one node in a flat hierarchy? Note that
you should not have more than ~10k child nodes per parent node (search
the mailing list for "flat hierarchy" for more info).
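
One common way to stay under that limit is to spread the nodes over
intermediate bucket nodes. The following is only a sketch: the class
and method names are hypothetical, the choice of 256 buckets is an
assumption (about 195 nodes per bucket for 50000 nodes), and the parent
path is assumed to have no trailing slash.

```java
import java.util.Locale;

/** Hypothetical helper: spreads child nodes over one level of hash buckets. */
final class BucketedPaths {
    // Assumption: 256 buckets, named "00" to "ff".
    private static final int BUCKETS = 256;

    private BucketedPaths() {}

    /**
     * Returns parentPath/xx/nodeName, where xx is a two-digit hex bucket
     * derived from the node name's hash code. The same name always maps
     * to the same bucket, so the path can be recomputed for lookups.
     */
    static String pathFor(String parentPath, String nodeName) {
        // floorMod keeps the bucket index non-negative for negative hash codes.
        int bucket = Math.floorMod(nodeName.hashCode(), BUCKETS);
        return String.format(Locale.ROOT, "%s/%02x/%s", parentPath, bucket, nodeName);
    }
}
```

The bucket nodes themselves would have to be created before adding a
child under them; that part depends on the application and is left out.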

> select * from 'nt:unstructured' where myProp = 'myValue'

This is JCR-SQL from the JCR 1.0 spec, right?

> The query returns the expected results, but it is awfully slow.
>
> Stepping through the code, it seems to me that Jackrabbit builds a Lucene
> query for items with node type 'nt:unstructured' and then iterates over the
> result set with 50000 matches to filter the nodes by the property
> constraint.

This is not expected. For simple property lookups, it should use the
Lucene index.

Could you try an XPath query and compare the execution times?

//element(*, nt:unstructured)[@myProp='myValue']
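
For the comparison, a small timing helper along these lines might be
enough. It is a sketch, not a proper benchmark: the class and method
names are hypothetical, and the JCR calls appear only in a comment
because they need an open Session. Each task should iterate over all
results so that both query languages do the same amount of work.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for comparing the run time of two queries. */
final class QueryTiming {
    private QueryTiming() {}

    /**
     * Runs the task the given number of times and returns the fastest
     * wall-clock time in milliseconds. Taking the best of several runs
     * reduces the effect of cold caches on the first run.
     *
     * With an open JCR Session, a task might look like this (not compiled here):
     *   QueryManager qm = session.getWorkspace().getQueryManager();
     *   NodeIterator it = qm.createQuery(statement, Query.XPATH).execute().getNodes();
     *   long n = 0;
     *   while (it.hasNext()) { it.nextNode(); n++; }
     *   return n;
     */
    static long bestMillis(Callable<Long> task, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        long bestNanos = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            try {
                task.call();
            } catch (Exception e) {
                // JCR calls throw checked exceptions; rethrow unchecked to keep callers simple.
                throw new IllegalStateException("query task failed", e);
            }
            bestNanos = Math.min(bestNanos, System.nanoTime() - start);
        }
        return bestNanos / 1_000_000;
    }
}
```

Running it once with Query.SQL and the original statement and once with
Query.XPATH and the statement above should show whether the two
languages really behave differently.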

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com
