jackrabbit-dev mailing list archives

From Alexander Klimetschek <aklim...@day.com>
Subject Re: Unicode, NFC,NFD and node names
Date Mon, 09 Nov 2009 10:54:21 GMT
2009/11/6 Grégory Joseph <gregory.joseph@magnolia-cms.com>:
> Map a WebDAV folder in OS X's Finder and create a node with umlauts: it will
> be created in the NFD form.
> (Use java.text.Normalizer.isNormalized() or String.getBytes() to see that.)
> Map the same folder using Linux or Windows, I'm pretty sure the files will
> be created using the NFC form.
> TBH, I still have to try that;
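
For reference, the check mentioned above can be sketched as follows. This is a minimal, standalone example, not Jackrabbit code: the class name NormalizationForms and its helpers are made up for illustration, and "ü" is just a sample umlaut. It uses only java.text.Normalizer and String.getBytes().

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical helper class for illustration; not part of Jackrabbit.
public class NormalizationForms {
    // "ü" precomposed (NFC): the single code point U+00FC
    static final String NFC_NAME = "\u00FC";
    // "ü" decomposed (NFD): "u" followed by COMBINING DIAERESIS U+0308
    static final String NFD_NAME = "u\u0308";

    // True if the string is already in NFC form.
    static boolean isNfc(String s) {
        return Normalizer.isNormalized(s, Normalizer.Form.NFC);
    }

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(isNfc(NFC_NAME));      // true
        System.out.println(isNfc(NFD_NAME));      // false
        System.out.println(utf8Length(NFC_NAME)); // 2 (C3 BC)
        System.out.println(utf8Length(NFD_NAME)); // 3 (75 CC 88)
    }
}
```

Both strings render as the same glyph, but they differ in code points and in their UTF-8 byte sequences.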

An explicit failure case would be good, as I think nobody has seen
this issue (yet) with Jackrabbit.

The only occurrence of this normalization mismatch that I know of was
with certain filenames (containing "special" characters) in an SVN
repository that was used on both Windows and Mac. But that was with the
standard C-based SVN client. I think with Java the UTF-8 support is better.

> Still, I have no control under what
> form a node is created. This could mean (to be verified) that in the case of
> a node type that does not allow same-name siblings, one could actually
> create two nodes with an "apparent" same name.

I think (feel free to correct me here) that under Java both strings
should be equal(), regardless of their normalization when serialized
and stored onto disk.

> Encoding URLs properly is probably going to solve most of my problems; I've
> been looking at patching this, but it would seem indeed pretty contrived and
> requiring quite some code on our side to just change the type of
> PathResolver to use, for instance (starting from
> org.apache.jackrabbit.core.jndi.RegistryHelper and all the way down to
> javax.jcr.Repository#login). Could this maybe be something that would find
> its place in the WorkspaceConfig?

I think that would be an advanced setting, since the JCR compliance is
based on a PathResolver working according to the spec, and people
should not be easily allowed to "break" Jackrabbit this way.

Rather, if this is really an issue, it should simply be fixed in
Jackrabbit (PathResolver or where else the String might need to be


Alexander Klimetschek
