From: Philip Martin
To: "Matthias Ludwig"
Subject: Re: svnlook proplist & unicode characters
References: <013201d017d0$5a5ee7f0$0f1cb7d0$@stl-software.de>
Date: Mon, 15 Dec 2014 13:59:29 +0000
In-Reply-To: <013201d017d0$5a5ee7f0$0f1cb7d0$@stl-software.de> (Matthias Ludwig's message of "Sun, 14 Dec 2014 20:01:19 +0100")
Message-ID: <87ppblui7y.fsf@ntlworld.com>
Delivered-To: mailing list users@subversion.apache.org

"Matthias Ludwig" writes:

> I try to call svnlook proplist within a svn hook on Windows.
>
> Svnlook proplist
>
> The path contains Unicode-only characters (Unicode combining characters).
>
> The Unicode characters are not passed correctly to svnlook.
>
> I googled around and found that one should set the code page with chcp.
> This changes the stdout encoding of svnlook for the output. But I did not
> succeed in changing the interpretation of the calling parameter.
>
> The caller is a Java routine. I tried Runtime.getRuntime().exec() and
> native calls via JNA.
>
> I do not exactly know where the problem is. Does the call mess up the
> Unicode characters? Or is svnlook not capable of processing Unicode
> characters in input parameters?

svnlook should handle Unicode characters in parameters. However,
Subversion has no special support for combining characters and just uses
whatever literal UTF-8 sequence is supplied. That means the composed and
decomposed forms are different paths in the repository: e.g.

  š encoded as 's' + 'U+030C'

is not the same path as

  š encoded as 'U+0161'

  $ svnadmin create repo
  $ svnmucc -mm -U file://`pwd`/repo mkdir `printf "s\u030c"` propset p v `printf "s\u030c"`
  $ svnlook tree repo
  /
   š/
  $ svnlook proplist repo `printf "s\u030c"`
  Properties on '/š':
    p
  $ svnlook proplist repo `printf "\u0161"`
  svnlook: E160013: Path '/š' does not exist

All Subversion utilities convert between UTF-8 and whatever local
encoding is in use. If your local encoding is not UTF-8, the conversion
to UTF-8 will probably generate either the composed or the decomposed
form, and it can be difficult to generate the other form; you may have
to switch your local encoding to UTF-8 and generate it yourself. I have
no idea what that involves on Windows.

See also

  http://subversion.tigris.org/issues/show_bug.cgi?id=2464

which is about choosing a canonical representation.

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*
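
As a rough illustration of why the composed and decomposed forms count as different paths, the sketch below dumps the UTF-8 bytes of each form. It is only a sketch: the `hex` helper is a hypothetical name introduced here, and the bytes are written as octal escapes rather than `\u` escapes so that the result does not depend on the shell's printf or on the local encoding.

```shell
# Decomposed form: 's' followed by U+030C (UTF-8 bytes CC 8C).
decomposed=$(printf 's\314\214')
# Composed form: U+0161 (UTF-8 bytes C5 A1).
composed=$(printf '\305\241')

# hex (hypothetical helper): print the bytes of its argument as hex digits.
hex() { printf '%s' "$1" | od -An -tx1 | tr -d ' \n'; }

echo "decomposed: $(hex "$decomposed")"   # decomposed: 73cc8c
echo "composed:   $(hex "$composed")"     # composed:   c5a1

# The two strings render alike but are not byte-for-byte equal.
if [ "$decomposed" = "$composed" ]; then echo same; else echo different; fi
```

A caller could log the same kind of byte dump for the argument it is about to pass to svnlook, to see which of the two forms is actually being sent.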