jackrabbit-users mailing list archives

From Dennis van der Laan <d.g.van.der.l...@rug.nl>
Subject Searching for a property
Date Thu, 03 Dec 2009 08:47:11 GMT

Querying on a property seems to be very slow on our system (running
Jackrabbit 1.6.0): almost 1 second per query, even though each query
would normally return 0 or 1 results.

We use jcr:content nodes of type nt:unstructured to store the contents
of a file (in the jcr:data property) and we store an array of Strings in
a property cms:virtualPath on the same node. So basically, every file in
our repository has the JCR path and zero or more virtual paths in the
cms:virtualPath property. Before we add a virtual path to a file, we
have to check that the virtual path does not already exist. For this, we
use an XPath query, as in the following code snippet:

String vpath = QueryUtil.escapeForAttributeSearch(path.toLowerCase());
String query =
= '" + vpath + "']";
Query q = queryManager.createQuery(query, Query.XPATH);
NodeIterator ni = q.execute().getNodes();
if (ni.getSize() == 0) {
    // No node carries this virtual path
    throw new ItemNotFoundException("Unable to find item by virtual path: "
            + path);
} else if (ni.getSize() > 1) {
    // A virtual path must be unique across the repository
    throw new IllegalStateException("More than 1 item on virtual path: "
            + path);
} else {
    return ni.nextNode();
}
Our repository now contains around 500,000 virtual paths, spread over
roughly 150,000 files, which are evenly distributed over more than
1000 (nested) folders.

The repository runs on an Intel Nehalem Xeon (2 x 2.5GHz) running
Solaris 10. The repository database (for datastore, filesystem, etc.)
runs Oracle 10g on a separate server with the same specs.

When we try to add virtual paths in a batch (about 2000 virtual path
properties for 1000 files) and all virtual paths already exist (so the
above query returns 1 result each time), we see 100% load on our
Tomcat application (which means 1 core fully utilized).

I would expect a JCR repository to be able to handle this kind of
query. How are these properties indexed? Is it possible to optimize
the repository for this kind of query? Or should I use a different
query? The alternative would be to keep a separate database that keeps
track of the virtual paths, but keeping that in sync with the JCR
repository would be a pain, to say the least.
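
To make that alternative concrete, here is a minimal in-memory sketch of
such a separate lookup. It is an illustration only, not part of our
system or of Jackrabbit: the class VirtualPathIndex and its methods are
hypothetical names, it assumes Java 8 or later, it lower-cases paths with
Locale.ROOT (our snippet uses the default locale), and it ignores
persistence and the problem of staying in sync with the repository.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.NoSuchElementException;

/**
 * Hypothetical in-memory index mapping a lower-cased virtual path
 * to the JCR path of the file that owns it.
 */
class VirtualPathIndex {
    private final Map<String, String> jcrPathByVirtualPath = new HashMap<>();

    // Virtual paths are compared case-insensitively, as in the query snippet.
    private static String normalize(String virtualPath) {
        return virtualPath.toLowerCase(Locale.ROOT);
    }

    /**
     * Registers a virtual path for a file.
     * Returns false (and changes nothing) if the virtual path is already taken.
     */
    boolean add(String virtualPath, String jcrPath) {
        return jcrPathByVirtualPath.putIfAbsent(normalize(virtualPath), jcrPath) == null;
    }

    /** Returns the JCR path owning the virtual path, or throws if there is none. */
    String resolve(String virtualPath) {
        String jcrPath = jcrPathByVirtualPath.get(normalize(virtualPath));
        if (jcrPath == null) {
            throw new NoSuchElementException(
                    "Unable to find item by virtual path: " + virtualPath);
        }
        return jcrPath;
    }

    public static void main(String[] args) {
        VirtualPathIndex index = new VirtualPathIndex();
        // Example paths are made up for the demonstration.
        System.out.println(index.add("/Docs/Report", "/content/files/report.pdf"));
        System.out.println(index.add("/docs/report", "/content/files/other.pdf"));
        System.out.println(index.resolve("/DOCS/REPORT"));
    }
}
```

Because the map key is unique, the "more than 1 item" case from the
query snippet cannot occur here; a duplicate is rejected when it is
added rather than detected when it is looked up.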

Thanks for your ideas about this issue,
Kind regards,
Dennis van der Laan
