jackrabbit-dev mailing list archives

From Grégory Joseph <gregory.jos...@magnolia-cms.com>
Subject Re: Unicode, NFC,NFD and node names
Date Wed, 04 Nov 2009 19:36:44 GMT
FWIW, the following solves the simple problem shown in my previous
example:

     private Session wrap(final SessionImpl origSession) throws RepositoryException {
         final WorkspaceImpl workspace = (WorkspaceImpl) origSession.getWorkspace();
         final RepositoryImpl rep = (RepositoryImpl) origSession.getRepository();
         return new SessionImpl(rep, origSession.getSubject(), workspace.getConfig()) {
             public Path getQPath(String path)
                     throws MalformedPathException, IllegalNameException, NamespaceException {
                 // this is the only relevant part:
                 return super.getQPath(Normalizer.normalize(path, Normalizer.Form.NFC));
             }
         };
     }

If there were a way to swap the session implementation, or the Name-
and/or PathResolver implementations that are used by default, I might
give this a spin.
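
To illustrate why normalizing the incoming path helps, here is a minimal standalone sketch. It does not use Jackrabbit at all: the HashMap merely stands in for the repository's name lookup, and the class name and the toNfc helper are made up for this example.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of Jackrabbit.
class NfcLookupDemo {

    // Hypothetical helper: bring any incoming name or path to NFC.
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String nfc = "f\u00F6\u00F6";     // "föö" with precomposed ö (NFC)
        String nfd = "fo\u0308o\u0308";   // "föö" as o + combining diaeresis (NFD)

        // Stand-in for the repository: names are stored in NFC.
        Map<String, String> store = new HashMap<String, String>();
        store.put(toNfc(nfc), "node");

        // The two forms look the same but are different char sequences.
        System.out.println("equal as-is:       " + nfc.equals(nfd));
        // A raw NFD lookup misses the NFC key.
        System.out.println("raw NFD lookup:    " + store.get(nfd));
        // Normalizing first, as in the getQPath override, finds it.
        System.out.println("normalized lookup: " + store.get(toNfc(nfd)));
    }
}
```

The same idea applies whichever form is chosen, as long as writes and lookups agree on it.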

Any opinions about the whole problem?

Cheers,

-g

On Nov 4, 2009, at 6:11 PM, Grégory Joseph wrote:

> Hi list,
>
> Given the following code,
> import java.text.Normalizer;
> ...
>
>        final Session session = ...
>
>        final Repository rep = session.getRepository();
>        System.out.println(rep.getDescriptor("jcr.repository.name") + " " + rep.getDescriptor("jcr.repository.version"));
>
>        final Node root = session.getRootNode();
>        final String name = "föö";
>        System.out.println("Normalizer.isNormalized(name, Normalizer.Form.NFC) = " + Normalizer.isNormalized(name, Normalizer.Form.NFC)); // true
>        System.out.println("Normalizer.isNormalized(name, Normalizer.Form.NFD) = " + Normalizer.isNormalized(name, Normalizer.Form.NFD)); // false
>        root.addNode(name);
>        session.save();
>
>        final Node node1 = root.getNode(name);
>        System.out.println("node1 = " + node1);
>        final Node node2 = root.getNode(Normalizer.normalize(name, Normalizer.Form.NFC));
>        System.out.println("node2 = " + node2);
>        final Node node3 = root.getNode(Normalizer.normalize(name, Normalizer.Form.NFD)); // fails
>        System.out.println("node3 = " + node3);
>
> There's a good chance fetching node3 won't work. It may depend on
> the underlying OS and database, but with OS X and Derby it fails.
> That's not really surprising, given that
> Normalizer.normalize(name, Normalizer.Form.NFC).equals(Normalizer.normalize(name, Normalizer.Form.NFD))
> is NOT true.
>
> Now, taking into account the fact that all sorts of clients use
> different normalization forms (Firefox seems to encode URL parameters
> with NFD, Safari with NFC; Linux uses NFC, the OS X Finder seems to
> favor NFD), wouldn't it be a safe bet to normalize all input at the
> repository level? Or do you consider this something client
> applications should do?
>
> ref: http://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms
>
> Thanks for any tip, pointer, idea, feedback or reaction!
>
> Cheers,
>
> -greg
>
>


