jackrabbit-users mailing list archives

From Dennis van der Laan <d.g.van.der.l...@rug.nl>
Subject Re: Searching for a property
Date Fri, 11 Dec 2009 10:24:57 GMT
Hi Ard,

Thanks! Performance improved by a factor of 10. That is still not what I
hoped for, but I'm not sure the query itself is still the bottleneck.

A related question: could it be that when a query returns no results,
this is slower than when it does return a result? Might it have
something to do with Lucene not having an index for that particular
property value?

> Hello Dennis,
> it's slow because you are using a child axis twice (once under jcr:root
> and once within the where clause). Explaining why is out of scope for
> the user list, but quite some time ago I wrote a few guidelines (most
> of them still valid):
> http://n4.nabble.com/Explanation-and-solutions-of-some-Jackrabbit-queries-regarding-performance-td516614.html#a516614
> I am not sure what node type jcr:content is, but suppose it is
> my:contenttype. Now, if your query were:
> "//element(*,my:contenttype)[fn:lower-case(@cms:virtualPath) = '" + vpath + "']";
> the query would be instant. Just take the parent node of the result and
> you should be fine.
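
A minimal sketch of how the rewritten query string from the suggestion above could be assembled. The class name VirtualPathQuery and the quote-doubling escapeLiteral helper are assumptions made for illustration. The helper only stands in for QueryUtil.escapeForAttributeSearch, whose behaviour is not shown in this thread, and my:contenttype is the placeholder node type from the reply.

```java
import java.util.Locale;

/** Sketch only: builds the single-axis XPath query suggested in the reply. */
final class VirtualPathQuery {

    private VirtualPathQuery() {}

    /**
     * Assumption: escapes a value for a single-quoted XPath string literal
     * by doubling embedded single quotes.
     */
    static String escapeLiteral(String value) {
        return value.replace("'", "''");
    }

    /**
     * Matches the node that carries cms:virtualPath directly, so the
     * predicate contains no child axis step. nodeType is a placeholder
     * for whatever type the content node really has.
     */
    static String build(String nodeType, String virtualPath) {
        String vpath = escapeLiteral(virtualPath.toLowerCase(Locale.ROOT));
        return "//element(*," + nodeType + ")"
                + "[fn:lower-case(@cms:virtualPath) = '" + vpath + "']";
    }
}
```

The resulting string would be passed to queryManager.createQuery(query, Query.XPATH) as in the original snippet. Because the match is now the content node itself, calling getParent() on each result gives the file node.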

> Just wondering, are you building a brand new cms
> on jcr? I am not sure what the @cms:virtualPath holds, but if you also
> need virtual environments showing the same jcr nodes in different tree
> structures you might wanna take a look here [1].
We're not building a brand new CMS, we're migrating our old Oracle iFS
storage to a JCR repository. The CMS itself stays the same.

> Regards Ard
> [1] http://www.onehippo.org/cms7
> On Thu, Dec 3, 2009 at 9:47 AM, Dennis van der Laan
> <d.g.van.der.laan@rug.nl> wrote:
>> Hi,
>> It seems querying on a property is very slow on our system (running
>> Jackrabbit 1.6.0): almost 1 second per query which would normally return
>> 0 or 1 result.
>> We use jcr:content nodes of type nt:unstructured to store the contents
>> of a file (in the jcr:data property) and we store an array of Strings in
>> a property cms:virtualPath on the same node. So basically, every file in
>> our repository has the JCR path and zero or more virtual paths in the
>> cms:virtualPath property. If we want to add a virtual path to a file, we
>> have to check if the virtual path does not exist already. For this, we
>> use an XPath query in the following code snippet:
>> String vpath = QueryUtil.escapeForAttributeSearch(path.toLowerCase());
>> String query =
>>     "/jcr:root//element(*,nt:hierarchyNode)"
>>     + "[fn:lower-case(jcr:content/@cms:virtualPath) = '" + vpath + "']";
>> Query q = queryManager.createQuery(query, Query.XPATH);
>> NodeIterator ni = q.execute().getNodes();
>> if (ni.getSize() == 0) {
>>    throw new ItemNotFoundException(
>>        "Unable to find item by virtual path: " + path);
>> }
>> else if (ni.getSize() > 1) {
>>    throw new IllegalStateException(
>>        "More than 1 item on virtual path: " + path);
>> }
>> else {
>>    return ni.nextNode();
>> }
>> Our repository now contains around 500,000 virtual paths, more or less
>> divided over 150,000 files which are evenly distributed over more than
>> 1000 (nested) folders.
>> The repository runs on an Intel Nehalem Xeon (2 x 2.5GHz) running
>> Solaris 10 and the repository database (for datastore, filesystem, etc)
>> runs on the same specs, on a different server, running Oracle 10g.
>> When we try to add virtual paths in a batch (about 2000 virtual path
>> properties for 1000 files) and all virtual paths already exist (so the
>> above query returns 1 virtual path), we see a 100% load on our
>> Tomcat application (which means 1 core fully utilized).
>> I would expect a JCR repository to be able to handle these kinds of
>> queries. How are these properties indexed? Is it possible to optimize
>> the repository for these kinds of queries? Or should I use a different
>> query? The alternative would be to keep a different database which keeps
>> track of the virtual paths, but keeping that in sync with the JCR
>> repository would be a pain, at the least.
>> Thanks for your ideas about this issue,
>> Kind regards,
>> Dennis van der Laan

Dennis van der Laan
