jackrabbit-users mailing list archives

From "Alexandre Martins" <alexandremart...@gmail.com>
Subject Re: How to convert string to legal node name?
Date Mon, 26 Nov 2007 20:30:55 GMT
Maybe this can help you:

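    /**
     * Creates a valid JCR path from the given label by replacing
     * a few characters that are not allowed in JCR names.
     *
     * @param label the raw label or path to clean up
     * @param appendLeadingSlash if true, prepend "/" when the label
     *        does not already start with one
     * @return the label with '*', '\'' and '"' replaced by '_', and
     *         '[' and ']' replaced by '(' and ')'
     */
    private static String makeValidJCRPath(String label,
            boolean appendLeadingSlash) {
        if (appendLeadingSlash && !label.startsWith("/")) {
            label = "/" + label;
        }
        StringBuilder ret = new StringBuilder(label.length());
        for (int i = 0; i < label.length(); i++) {
            char c = label.charAt(i);
            if (c == '*' || c == '\'' || c == '"') {
                c = '_';
            } else if (c == '[') {
                // Not quite correct: [] may be the index of a
                // previously exported item.
                c = '(';
            } else if (c == ']') {
                c = ')';
            }
            ret.append(c);
        }
        return ret.toString();
    }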

2007/11/26, Jukka Zitting <jukka.zitting@gmail.com>:
>
> Hi,
>
> On Nov 26, 2007 5:44 PM, Brian Thompson <elephantium@gmail.com> wrote:
> > In my application, I implemented a custom search/replace method to
> > filter out illegal characters.  It's pretty simple to write, so I
> > didn't spend much time looking for a library method to handle it.
> > AFAIK, the Jackrabbit API doesn't address this issue.  I could be
> > wrong, though (correct me if I'm wrong, please, Jackrabbit devs!).
>
> There are two classes for this purpose in the jackrabbit-jcr-commons
> component:
>
> org.apache.jackrabbit.util.ISO9075 [1]
>
> This class implements the ISO9075 escaping mechanism that the JCR spec
> uses in the document view serialization format. All invalid name
> characters are converted to _xNNNN_ sequences, where NNNN is the
> hexadecimal representation of the Unicode code unit (UTF-16) of the
> character in question.
>
> This escaping format can look a bit surprising if you use the document
> view export feature, as the _x prefix ends up doubly escaped when
> exported to XML.
>
> org.apache.jackrabbit.util.Text [2]
>
> This class implements (among other things) a few variations of the URI
> escaping mechanism defined in RFC 2396. All invalid (as defined by the
> escaping method you choose) characters are converted to %NN sequences
> where NN is the hexadecimal representation of the Unicode code unit
> (UTF-8) of the character in question.
>
> This escaping format can look a bit surprising if you map node names
> or paths to URIs, as the % prefix ends up doubly escaped.
>
> [1] http://jackrabbit.apache.org/api/1.3/org/apache/jackrabbit/util/ISO9075.html
> [2] http://jackrabbit.apache.org/api/1.3/org/apache/jackrabbit/util/Text.html
>
> BR,
>
> Jukka Zitting
>
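
To make the two escaping styles Jukka describes a bit more concrete, here is
a minimal, dependency-free sketch of both: _xNNNN_ per UTF-16 code unit and
%NN per UTF-8 byte. It is not the Jackrabbit implementation. The class name
NameEscapeSketch, the method names and the ILLEGAL character set are all
assumptions made for illustration, and the real ISO9075 and Text classes may
differ in details such as hex casing and which characters they treat as
invalid.

```java
import java.nio.charset.StandardCharsets;

final class NameEscapeSketch {
    // Assumed set of characters to escape (hypothetical, not taken from
    // Jackrabbit); adjust it to the JCR version you target.
    private static final String ILLEGAL = "/:[]*'\"|";

    /** Replaces each character in ILLEGAL with _xNNNN_ (hex UTF-16 code unit). */
    public static String escapeUnderscoreHex(String name) {
        StringBuilder out = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (ILLEGAL.indexOf(c) >= 0) {
                out.append(String.format("_x%04X_", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Replaces each character in ILLEGAL, and '%' itself, with %NN per UTF-8 byte. */
    public static String escapePercent(String name) {
        StringBuilder out = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '%' || ILLEGAL.indexOf(c) >= 0) {
                byte[] utf8 = String.valueOf(c).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append(String.format("%%%02X", b & 0xFF));
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeUnderscoreHex("a*b[1]")); // a_x002A_b_x005B_1_x005D_
        System.out.println(escapePercent("a*b[1]"));       // a%2Ab%5B1%5D
    }
}
```

Unlike the replace-with-underscore approach, both styles keep the original
character recoverable. Note that this sketch does not escape an _xNNNN_
sequence that already appears in the input, so the first method is only
reversible for names that do not contain one.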



-- 
Alexandre Costa Martins
CESAR - Recife Center for Advanced Studies and Systems
Software Engineer and Software Reuse Researcher
MSc Candidate at Federal University of Pernambuco
RiSE Member - http://www.rise.com.br
Sun Certified Programmer for Java 5.0 (SCPJ5.0)