jackrabbit-users mailing list archives

From "sbarriba" <sbarr...@yahoo.co.uk>
Subject Is using jcr:path in a query a performance bottleneck?
Date Fri, 23 Jan 2009 16:59:30 GMT
Hi all,

I've included some background below, but my question is: "Is using
jcr:path in a query to be avoided for performance reasons?"

 

We have some Jackrabbit repositories which have grown to include circa
200,000 nodes of type acme:Story. The nodes are structured in a deep
hierarchy to comply with Jackrabbit best practices on the maximum number of
nodes per folder. An example of our JCR structure:

/library/
    news/
    entertainment/
    sport/
        tennis/
            2009/
                01/
        Football/
        ..
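
As a rough sketch of the layout above, the folder for a story could be derived from its section names and creation date. The class and method names here are hypothetical, and it assumes the year and month folders sit under a section folder such as sport/tennis. It only builds the path string and does not touch the repository.

```java
import java.time.LocalDate;
import java.util.Locale;

// Hypothetical helper (not part of Jackrabbit): derives the folder path for a
// story, assuming a /library/<section>/.../<yyyy>/<MM> layout.
final class StoryPaths {
    private StoryPaths() {}

    static String folderFor(LocalDate created, String... sections) {
        StringBuilder path = new StringBuilder("/library");
        for (String section : sections) {
            path.append('/').append(section);
        }
        path.append('/').append(created.getYear());
        // Zero-pad the month so the folders are named 01..12.
        path.append('/').append(String.format(Locale.ROOT, "%02d", created.getMonthValue()));
        return path.toString();
    }
}
```

For a story created on 23 January 2009 in sport/tennis, StoryPaths.folderFor(LocalDate.of(2009, 1, 23), "sport", "tennis") gives /library/sport/tennis/2009/01.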

 

 

When accessing the repository we typically use queries which a) use the
path as a restriction clause and b) order by a date field.

 

e.g. get the latest sport items.

 

SELECT * FROM acme:Story WHERE jcr:path LIKE '/library/sport/%' ORDER BY
acme:createdDate DESC

 

As the number of stories has increased we're seeing more and more incidents
of queries exceeding 2 seconds. Furthermore, we've seen Lucene use more and
more heap space when executing these queries. Based on
http://www.nabble.com/Explanation-and-solutions-of-some-Jackrabbit-queries-regarding-performance-td15028655.html
I understand that the above query would be much faster if we added a
tag/category type property to the node, e.g. acme:category = "Sport", so
that the query could be:

 

SELECT * FROM acme:Story WHERE acme:category LIKE 'Sport' ORDER BY
acme:createdDate DESC

 

Is that a fair assessment?
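
For comparison, the two statements above could be generated as follows. This is a minimal sketch: the class and method names are hypothetical, and doubling single quotes is assumed to be sufficient escaping for a JCR SQL string literal (the LIKE wildcards % and _ are not escaped). The resulting string would then be handed to QueryManager.createQuery(statement, Query.SQL), which is left out so the snippet needs only the JDK.

```java
// Hypothetical helper: builds the two JCR SQL statements being compared.
final class StoryQueries {
    private static final String SELECT = "SELECT * FROM acme:Story WHERE ";
    private static final String ORDER = " ORDER BY acme:createdDate DESC";

    private StoryQueries() {}

    // Path-restricted variant: matches every story below the given folder.
    static String byPath(String folder) {
        return SELECT + "jcr:path LIKE '" + escape(folder) + "/%'" + ORDER;
    }

    // Property-restricted variant: relies on an acme:category property
    // having been set on each story.
    static String byCategory(String category) {
        return SELECT + "acme:category LIKE '" + escape(category) + "'" + ORDER;
    }

    // Assumption: a single quote inside a literal is escaped by doubling it.
    static String escape(String literal) {
        return literal.replace("'", "''");
    }
}
```

StoryQueries.byPath("/library/sport") and StoryQueries.byCategory("Sport") reproduce the two queries quoted in this message, so both can be timed against the same repository.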

 

All comments appreciated.

Regards,

Shaun

 

 

 

