jackrabbit-users mailing list archives

From "Charles Brooking" <public+jackrab...@charlie.brooking.id.au>
Subject Escaping/encoding of paths/names/values
Date Fri, 18 Sep 2009 06:16:08 GMT
Hi all,

In tackling the issue of escaping/encoding of paths, names, and values in
the context of a JCR-based web application, I've discovered it's not so
simple. From my searching at least, there is little information online to
help, so I thought I'd write with my understanding so far and perhaps
others can chip in (most likely to correct me).

There are utility methods for escaping/encoding in the
org.apache.jackrabbit.util.ISO9075 and org.apache.jackrabbit.util.Text
classes. Although developed under Jackrabbit, they are part of the JCR
Commons module which only depends on the JCR API.

If you're building a path from user-supplied names, you need to escape
illegal JCR characters (eg item:1 becomes item%3A1):

  String path = "/foo/" + Text.escapeIllegalJcrChars(name);

Such paths are useful for JCR methods like Session.getItem(...) etc.
(Related to this: is there a utility to escape illegal JCR characters in
paths as opposed to just names?)
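
For building a whole path from several user-supplied names, one option is
to escape each name separately and only then join with "/". The sketch
below is self-contained, so it uses a simplified stand-in (escapeName) for
Text.escapeIllegalJcrChars. The set of illegal characters, the class name
and the buildPath helper are assumptions for illustration, not the
library's API; in real code you would call the Text method instead.

```java
import java.util.Arrays;
import java.util.List;

final class JcrPathSketch {
    // ASSUMPTION: characters treated as illegal in a JCR name. The set used by
    // Text.escapeIllegalJcrChars may differ between Jackrabbit versions.
    private static final String ILLEGAL = "%/:[]*|\t\r\n";

    // Simplified stand-in for Text.escapeIllegalJcrChars(name):
    // each illegal character becomes '%' followed by two uppercase hex digits.
    static String escapeName(String name) {
        StringBuilder sb = new StringBuilder(name.length() * 2);
        for (int i = 0; i < name.length(); i++) {
            char ch = name.charAt(i);
            if (ILLEGAL.indexOf(ch) >= 0) {
                sb.append('%').append(String.format("%02X", (int) ch));
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: escape every name on its own, then join with '/'.
    // A '/' inside a name is escaped, so it cannot add an extra path segment.
    static String buildPath(List<String> names) {
        StringBuilder path = new StringBuilder();
        for (String name : names) {
            path.append('/').append(escapeName(name));
        }
        return path.length() == 0 ? "/" : path.toString();
    }

    public static void main(String[] args) {
        // prints /foo/item%3A1/a%2Fb
        System.out.println(buildPath(Arrays.asList("foo", "item:1", "a/b")));
    }
}
```

Escaping per name rather than per path keeps the "/" separators you add
yourself apart from any "/" characters that arrive in user input.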

If you want to use paths in XPath queries, though, you need to escape
according to ISO9075 rules (eg 1hr0 becomes _x0031_hr0):

  String query =
    "/jcr:root" + ISO9075.encodePath(node.getPath()) +
    "/" + ISO9075.encode(name);
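
To show the shape of the ISO9075 rule, here is a rough, self-contained
approximation. It is not the library's implementation: the class name and
the character checks are assumptions, the XML name rules are approximated
with Character.isLetter/isLetterOrDigit, and the special handling of names
that already contain "_x" sequences is left out. Use ISO9075.encode in
real code.

```java
final class Iso9075Sketch {
    // Approximation of "may start an XML name" (ASSUMPTION: simplified rule).
    static boolean isNameStart(char ch) {
        return Character.isLetter(ch) || ch == '_' || ch == ':';
    }

    // Approximation of "may appear inside an XML name" (ASSUMPTION: simplified rule).
    static boolean isNameChar(char ch) {
        return Character.isLetterOrDigit(ch) || "-._:".indexOf(ch) >= 0;
    }

    // Replace every character that is not allowed at its position with
    // _xHHHH_, where HHHH is the character's code as four lowercase hex digits.
    static String encode(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char ch = name.charAt(i);
            boolean ok = (i == 0) ? isNameStart(ch) : isNameChar(ch);
            if (ok) {
                sb.append(ch);
            } else {
                sb.append(String.format("_x%04x_", (int) ch));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("1hr0"));    // prints _x0031_hr0
        System.out.println(encode("my item")); // prints my_x0020_item
    }
}
```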

For a user-supplied string, this could lead to something like

For values inserted into the queries, you should do escaping to prevent
incorrect values and query injection. Generally, if you enclose values in
single quotes, you just need to replace any literal single quote character
with '' (two consecutive single quote characters). There is also a
Text.escapeIllegalXpathSearchChars(...) method you should use for calls to

  // q is the user-supplied search text, itemID the user-supplied ID
  String query =
    "/jcr:root/foo/element(*, foo)" +
    "[jcr:contains(@title, '" +
    Text.escapeIllegalXpathSearchChars(q).replaceAll("'", "''") + "')]" +
    "[@itemID = '" + itemID.replaceAll("'", "''") + "']";
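
The quote-doubling step can be pulled into a small helper so it is not
forgotten for any one value. This is only a sketch: the class and method
names are made up for illustration, and it covers plain string literals
only (full-text search strings would still need
Text.escapeIllegalXpathSearchChars applied first, as described above).

```java
final class XPathLiteralSketch {
    // Hypothetical helper: wrap a value in single quotes for use as an XPath
    // string literal, doubling any single quote it contains.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Example query builder using the helper (node type and property name
    // are taken from the example above).
    static String byItemId(String itemID) {
        return "/jcr:root/foo/element(*, foo)[@itemID = " + quote(itemID) + "]";
    }

    public static void main(String[] args) {
        // An injection attempt stays inside the literal:
        // prints /jcr:root/foo/element(*, foo)[@itemID = 'x'' or ''1''=''1']
        System.out.println(byItemId("x' or '1'='1"));
    }
}
```

String.replace is used here instead of replaceAll because it treats its
arguments literally rather than as a regular expression; for a lone single
quote both give the same result.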

There are further encoding/decoding methods in the Text class for dealing
with URIs in a webapp. And this is where I get really confused: the JCR
encoding scheme mimics percent-encoding used in URIs but is only said to
be "loosely modeled after URI encoding". What is the recommended approach
in converting between URI paths and their mapping to/from JCR paths?

Apologies if I've missed any existing online guides about this. Hopefully
we can make a nice page based on examples like the ones above.

