jackrabbit-users mailing list archives

From Ard Schrijvers <a.schrijv...@onehippo.com>
Subject Re: Searching for a property
Date Thu, 03 Dec 2009 09:04:45 GMT
Hello Dennis,

It's because you're using a child axis query twice (once from jcr:root
and once within the where clause) that the query is slow. Explaining why
is out of scope for the user list, but some time ago I wrote a few
guidelines (most of them still valid):

http://n4.nabble.com/Explanation-and-solutions-of-some-Jackrabbit-queries-regarding-performance-td516614.html#a516614

I am not sure what node type jcr:content is, but suppose it is my:contenttype.

Now, if your query were:

"//element(*,my:contenttype)[fn:lower-case(@cms:virtualPath) = '" + vpath + "']";

the query will be instant. Just take the parent node of the result and
you should be fine. Just wondering, are you building a brand new CMS
on JCR? I am not sure what @cms:virtualPath holds, but if you also
need virtual environments showing the same JCR nodes in different tree
structures, you might want to take a look here [1].
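
As a minimal sketch of how the rewritten query string could be assembled: the class name, the method names and the quote-doubling escape below are assumptions for illustration (the escape is only a stand-in for the QueryUtil.escapeForAttributeSearch helper in the original post, whose behavior is not shown), and my:contenttype is still the placeholder node type from above.

```java
import java.util.Locale;

public class VirtualPathQuery {

    // Hypothetical stand-in for QueryUtil.escapeForAttributeSearch:
    // doubles single quotes so the value cannot end the XPath
    // string literal early.
    public static String escapeLiteral(String value) {
        return value.replace("'", "''");
    }

    // Builds the type-constrained query: no /jcr:root prefix and no
    // child axis step inside the predicate. The node type is passed in
    // because the real type of the jcr:content nodes may differ.
    public static String buildQuery(String nodeType, String virtualPath) {
        String vpath = escapeLiteral(virtualPath.toLowerCase(Locale.ROOT));
        return "//element(*," + nodeType + ")"
                + "[fn:lower-case(@cms:virtualPath) = '" + vpath + "']";
    }

    public static void main(String[] args) {
        // Prints the query string for an example virtual path.
        System.out.println(buildQuery("my:contenttype", "/Docs/Report.pdf"));
    }
}
```

The resulting string can be passed to queryManager.createQuery(query, Query.XPATH) exactly as in the original snippet. Since the hit is then the jcr:content node rather than the file, calling getParent() on the returned node gives the file node.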

Regards Ard

[1] http://www.onehippo.org/cms7


On Thu, Dec 3, 2009 at 9:47 AM, Dennis van der Laan
<d.g.van.der.laan@rug.nl> wrote:
> Hi,
>
> It seems querying on a property is very slow on our system (running
> Jackrabbit 1.6.0): almost 1 second per query which would normally return
> 0 or 1 result.
>
> We use jcr:content nodes of type nt:unstructured to store the contents
> of a file (in the jcr:data property) and we store an array of Strings in
> a property cms:virtualPath on the same node. So basically, every file in
> our repository has the JCR path and zero or more virtual paths in the
> cms:virtualPath property. If we want to add a virtual path to a file, we
> have to check if the virtual path does not exist already. For this, we
> use an XPath query in the following code snippet:
>
> String vpath = QueryUtil.escapeForAttributeSearch(path.toLowerCase());
> String query = "/jcr:root//element(*,nt:hierarchyNode)"
>     + "[fn:lower-case(jcr:content/@cms:virtualPath) = '" + vpath + "']";
> Query q = queryManager.createQuery(query, Query.XPATH);
> NodeIterator ni = q.execute().getNodes();
> if (ni.getSize() == 0) {
>    throw new ItemNotFoundException(
>        "Unable to find item by virtual path: " + path);
> }
> else if (ni.getSize() > 1) {
>    throw new IllegalStateException(
>        "More than 1 item on virtual path: " + path);
> }
> else {
>    return ni.nextNode();
> }
>
> Our repository now contains around 500,000 virtual paths, more or less
> divided over 150,000 files which are evenly distributed over more than
> 1000 (nested) folders.
>
> The repository runs on an Intel Nehalem Xeon (2 x 2.5GHz) running
> Solaris 10 and the repository database (for datastore, filesystem, etc)
> runs on the same specs, on a different server, running Oracle 10g.
>
> When we try to add virtual paths in a batch (about 2000 virtual path
> properties for 1000 files) and all virtual paths already exist (so the
> above query returns 1 virtual path), we see a 100% load on our
> Tomcat application (which means 1 core fully utilized).
>
> I would expect a JCR repository to be able to handle this kind of
> query. How are these properties indexed? Is it possible to optimize
> the repository for this kind of query? Or should I use a different
> query? The alternative would be to keep a different database which keeps
> track of the virtual paths, but keeping that in sync with the JCR
> repository would be a pain, at the least.
>
> Thanks for your ideas about this issue,
> Kind regards,
> Dennis van der Laan
>
