jackrabbit-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: JCR 1.0 SQL queries with JackRabbit
Date Mon, 17 Mar 2008 14:36:30 GMT
Hi Øyvind,

Øyvind Stegard wrote:
> 1) 
> I am having difficulties with some queries which are based on the
> jcr:path pseudo property:
> 
> Descendant or self (similar example found on page 305 of the JCR 1.0
> spec):
> SELECT * FROM nt:base WHERE jcr:path = '/foo' OR jcr:path LIKE '/foo/%'
> 
> Here I get the following error: Invalid combination of jcr:path clauses
> 
> The problem seems to be general, in that I am unable to combine multiple
> jcr:path clauses in a single query (with a few exceptions, see question
> 2). Even a simple expression like "jcr:path = '/foo' OR jcr:path =
> '/bar'" fails.

In JCR 1.0 only one basic path constraint per query is possible. It boils down to:
1) a node with a given path
2) nodes, which are children of a given path
3) nodes, which are descendants of a given path

> If this is a known limitation in JackRabbit 1.4, will such queries be
> possible in JCR 2.0, using the SQL2 syntax or QOM ? I would certainly
> prefer using a qom directly.

yes, this will be possible. In JCR 2.0 you can combine multiple path constraints.

> 2)
> Is there any feasible way of querying for path depth using the jcr:path
> pseudo property ? When talking about "path depth", I am only concerned
> about absolute node paths, where "/" has depth 0, "/foo" has depth 1 and
> so on. It is the same as the depth concept in
> org.apache.jackrabbit.spi.Path#getDepth(), but only for absolute paths. 
> 
> For instance, for a query that lists file/folder nodes at exactly depth
> 1, i.e. the children of the root node, one would think something like
> this works**:
> 
> SELECT * FROM nt:base WHERE (jcr:primaryType = 'nt:file' OR
> jcr:primaryType = 'nt:folder') AND jcr:path LIKE '/%' AND NOT jcr:path
> LIKE '/%/%'
> 
> **nt:file and nt:folder only used as example, in reality we have our own
> custom types representing the file/folder concept.
> 
> And in fact, this particular query works as expected (even though it
> contains multiple jcr:path clauses). It is just like the "Child" SQL
> query example on page 305 of the JCR 1.0 spec.
> 
> Generalising this approach for depth-N, however, seems to not work, for
> instance for depth 2:
> 
> SELECT * FROM nt:base WHERE (jcr:primaryType = 'nt:file' OR
> jcr:primaryType = 'nt:folder') AND jcr:path LIKE '/%/%' AND NOT jcr:path
> LIKE '/%/%/%'
> 
> This one returns a result set containing every nt:file/nt:folder-node in
> the repository with depth greater than 1 (we only want exactly 2). Is
> this kind of query possible at all when based on jcr:path ?

no, this is not possible with SQL but you can use XPath:

/jcr:root/*/*[@jcr:primaryType = 'nt:file' or @jcr:primaryType = 'nt:folder']

or you can use the common base type:

/jcr:root/*/element(*, nt:hierarchyNode)

> I could add an integer property containing the path depth for every node
> which represents a resource in our CMS, and use that property in
> queries. But that property value would be a direct function of jcr:path,
> and would need updating whenever jcr:path changes, that's something I do
> *not* want to do.
> 
> Supporting path depth clauses in queries might be considered somewhat
> eccentric, but it is something we currently support in our CMS with our
> own backend query implementation. Some general background information
> about our CMS is provided at the end of this email.
> 
> 
> 3)
> It seems that logical inversion of LIKE queries on jcr:path is ignored
> by JackRabbit.
> 
> Two examples:
> 
> SELECT * FROM nt:base WHERE jcr:path NOT LIKE '/foo/%'
> and
> SELECT * FROM nt:base WHERE NOT (jcr:path LIKE '/foo/%')
> 
> returns exactly the same result set as:
> 
> SELECT * FROM nt:base WHERE jcr:path LIKE '/foo/%'
> 
> So the "NOT" part seems to be completely ignored. I suspect this
> limitation is because of performance-concerns and the Lucene-based query
> backend ?

This is a bug in Jackrabbit and should actually throw an InvalidQueryException.

> 4)
> It seems that "NOT LIKE .." generally does not invert matching for any
> kind of JCR property ?

JCR 1.0 does not specify a <property-name> NOT LIKE <string-literal>, which 
means jackrabbit should probably also throw an InvalidQueryException in this case.

> However, I could work around that by wrapping a
> regular "LIKE" inside a NOT(...) expression, i.e. "NOT (foo:bar LIKE
> 'baz%')". Perhaps that's not very wise, performance-wise (pun
> intended) ?

yes, and it's also not exactly the same. While foo:bar not like 'baz%' returns 
all nodes with a foo:bar not starting with 'baz', not(foo:bar like 'baz%') will 
also return nodes that do not even have the property.

> Am I right in assuming that many of these limitations are present
> because of the relation to XPath, and mapping of SQL queries to an
> XPath-based query model internally in JackRabbit 1.4 ?

yes, kind of, though the actual limitations are specified in JSR 170.

> Some background information
> Our CMS uses a strictly hierarchical resource model, with
> file-resources, collection-resources and properties. It is implemented
> using the Spring framework and provides normal web access, in addition
> to access through the WebDAV protocol. At the bottom, we have our own
> repository/backend implementation which handles persistence of all
> resources and a custom Lucene-based search/query implementation. This is
> where JCR comes into the picture, we are experimenting with using JCR
> for both of these tasks (persistence and query). 
> 
> We have our own query language, parser and query object model. I am
> currently working on mapping this to equivalent JCR 1.0 SQL syntax for
> our experimental JCR-based backend. In our language, we have support for
> a few constructs which are based around URIs (which is the hierarchical
> identifier we use for all resources in our CMS). These URIs are directly
> mapped to corresponding JCR node paths, hence all the questions about
> queries on jcr:path :). Our own Lucene-based query implementation has
> special fields for optimizing the performance of queries which are based
> on the hierarchy of resources. 
> 
> What should we expect of support for these types of queries in future
> versions of JackRabbit, JCR 2.0, etc ?

The features that Jackrabbit will support in the future will be those defined in 
JSR 283. You can download the public preview here: 
http://jcp.org/aboutJava/communityprocess/pr/jsr283/

In addition there will be Jackrabbit proprietary extensions driven by the 
user/developer community. We did this already in the past when we saw a need for 
certain features that were not specified in JSR 170.

regards
  marcel

Mime
View raw message