subversion-users mailing list archives

From Philip Martin <>
Subject Re: svnlook proplist & unicode characters
Date Mon, 15 Dec 2014 13:59:29 GMT
"Matthias Ludwig" <> writes:

> I try to call svnlook proplist within a svn hook on Windows.
> svnlook proplist <repo-path> <pathToFile>
> The <pathToFile> contains Unicode-only characters (Unicode combining characters).
> The Unicode characters are not passed correctly to svnlook.
> I googled around and found that one should set the code page with chcp. This changes
> the stdout encoding of svnlook for the output. But I did not succeed in changing the
> interpretation of the calling parameter.
> The caller is a Java routine. I tried Runtime.getRuntime().exec() and native calls via
> I do not exactly know where the problem is. Does the call mess up the
> Unicode characters? Or is svnlook not capable of processing Unicode
> characters in input parameters?

svnlook should handle Unicode characters in parameters.  However,
Subversion has no special support for combining characters and just uses
whatever literal UTF-8 sequence is supplied.  That means the composed
and decomposed forms are different paths in the repository: e.g. š
encoded as 's' + 'U+030C' is not the same path as š encoded as 'U+0161':

$ svnadmin create repo
$ svnmucc -mm -U file://`pwd`/repo mkdir `printf "s\u030c"` propset p v `printf "s\u030c"`
$ svnlook tree repo
$ svnlook proplist repo `printf "s\u030c"`
Properties on '/š':
$ svnlook proplist repo `printf "\u0161"`
svnlook: E160013: Path '/š' does not exist
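
A minimal sketch of why the two forms are distinct paths, assuming a POSIX
shell with od and tr available.  The UTF-8 bytes are written as octal
escapes so the result does not depend on printf's \u support or on the
locale; the variable names are purely illustrative.

```shell
#!/bin/sh
# UTF-8 bytes written as octal escapes, so this does not rely on printf \u support.
# Decomposed: 's' (0x73) followed by U+030C COMBINING CARON (0xCC 0x8C).
decomposed=$(printf 's\314\214' | od -An -tx1 | tr -d ' \n')
# Composed: U+0161 LATIN SMALL LETTER S WITH CARON (0xC5 0xA1).
composed=$(printf '\305\241' | od -An -tx1 | tr -d ' \n')

echo "decomposed: $decomposed"   # 73cc8c
echo "composed:   $composed"     # c5a1

# Subversion compares the literal byte sequences, so these name two paths.
if [ "$decomposed" != "$composed" ]; then
    echo "byte sequences differ, so Subversion sees two different paths"
fi
```

Both sequences render as the same glyph, which is why the error message
above looks as if it names the path that exists.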

All Subversion utilities convert between UTF-8 and whatever local
encoding is in use.  If your local encoding is not UTF-8 then the
conversion to UTF-8 will probably generate either the composed or the
decomposed form, and it can be difficult to generate the other form.  You
may have to switch your local encoding to UTF-8 and generate it
yourself.  I have no idea what that involves on Windows.

See also which
is about choosing a canonical representation.

Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*
