Date: Thu, 2 Feb 2012 20:12:52 +0100
Subject: Re: check-mime-type, Windows client, non-ASCII path
From: Ignacio González (Eliop)
To: users@subversion.apache.org
Cc: stsp@elego.de
In-Reply-To: <20120202093305.GA23501@ted.stsp.name>

Hello, Stefan.

On 2 February 2012 at 10:33, Stefan Sperling wrote:
> On Wed, Feb 01, 2012 at 09:00:39AM +0100, Ignacio González (Eliop) wrote:
> > Clients: Windows-XP, Windows 7, svn 1.6.16 (Spanish)
> > Server: Linux (CentOS), svn 1.6.16 (Spanish)
> >
> > Repository created OK
> > Hundreds of revisions already checked-in OK
> > Hook "check-mime-type" (bash) added in server
> > A couple of revisions checked-in OK
> > New file added with non-ASCII characters -> Problem:
> > Path name (in Windows, client): C:\Usuarios\arenero\Inútil.TXT
> > (note the u with an acute accent: ú)
> >
> > C:\Usuarios\arenero>svn ci acentos -m "Prueba 1"
> > Adding         acentos
> > Adding         acentos\In£til.TXT
> > Transmitting file data .svn: Commit failed (details follow):
> > svn: Commit blocked by pre-commit hook (exit code 1) with output:
> > /opt/csvn/data/repositories/telecontrol/hooks/check-mime-type:
> > `/opt/csvn/bin/svnlook proplist /opt/csvn/data/repositories/arenero -t 44-1e --verbose acentos/In?\195?\186til.TXT' failed with this output:
> > svnlook: Path 'acentos/In?\195?\186til.TXT' does not exist
>
> 195 186 in hex is 0xc3ba
>
> $ echo 0xc3ba | xxd -r | ExplicateUTF8
> The sequence 0xC3     0xBA
>              11000011 10111010
> is a valid UTF-8 character encoding equivalent to UTF32 0x000000FA.
>
> (ExplicateUTF8 is part of the 'unitools' suite).
>
> Written out as UTF-8 in email, unicode code point 0xfa is the character 'ú'.

Right.

> > To help diagnose it, I tried to check out an already existing file with
> > accents in its name
> > (checked in before the Hook "check-mime-type" (bash) was added in the
> > server).
> > Check out fails.
>
> And how exactly does it fail? What's the error message?
> Does it print the same error message as you get with the hook?
>
> Whenever you write a problem report and you describe parts of the
> problem by "X fails" without showing how X fails, recipients of your
> report can only make wild guesses.

Agreed, I forgot to detail this part. And I should really have been more
careful! What I was trying to do was check out the file directly, instead
of its parent directory. So:

svn co http://localhost/svn/arenero/pru/úsame.TXT

fails, telling me that blah, blah was a file, not a directory, but

svn co http://localhost/svn/arenero/pru/

succeeds. Stupid, stupid, stupid.

> > Oh, my God.
>
> Don't panic. This is nothing that cannot be fixed.
> You'll just have to figure out where it goes wrong.
>
> You didn't specify what type of server you are running (svnserve or
> mod_dav_svn), so I'm going to guess that you're using mod_dav_svn,
> i.e. an Apache HTTPD server is serving your repositories.

I'm using httpd / mod_dav_svn; in fact, CollabNet Subversion Edge.

> In that case, issue #2487 might be the problem:
> http://subversion.tigris.org/issues/show_bug.cgi?id=2487
> Though this would not explain a failing checkout, only problems
> in the hook script. Does your hook script set any of the LANG, LC_CTYPE
> or LC_ALL environment variables to some value? (If possible, please just
> show us the entire hook script.)

Locale in this Linux server is:

[csvn@svn tmp]$ locale
LANG=es_ES.UTF-8
LC_CTYPE="es_ES.UTF-8"
LC_NUMERIC="es_ES.UTF-8"
LC_TIME="es_ES.UTF-8"
LC_COLLATE="es_ES.UTF-8"
LC_MONETARY="es_ES.UTF-8"
LC_MESSAGES="es_ES.UTF-8"
LC_PAPER="es_ES.UTF-8"
LC_NAME="es_ES.UTF-8"
LC_ADDRESS="es_ES.UTF-8"
LC_TELEPHONE="es_ES.UTF-8"
LC_MEASUREMENT="es_ES.UTF-8"
LC_IDENTIFICATION="es_ES.UTF-8"
LC_ALL=
[csvn@svn tmp]$

Here's the hook script (note that I have to comment out the line with the
check-mime-type invocation in order to check in new 'accented' files):

[csvn@svn tmp]$ cat /opt/csvn/data/repositories/arenero/hooks/pre-commit
#!/bin/sh
# pre-commit
# PRE-COMMIT HOOK

REPOS="$1"
TXN="$2"

# Make sure that the log message contains some text.
SVNLOOK=/opt/csvn/bin/svnlook
$SVNLOOK log -t "$TXN" "$REPOS" | grep "[a-zA-Z0-9]" > /dev/null
if [ $? -ne 0 ]
then
    echo "*** Debe introducir un texto para ***" > /dev/stderr
    echo "*** describir los cambios realizados ***" > /dev/stderr
    exit 1
fi

# Check that every added file has the svn:mime-type property set
# and every added file with a mime-type matching text/* also has
# svn:eol-style set
#/opt/csvn/data/repositories/telecontrol/hooks/check-mime-type "$REPOS" "$TXN" || exit 1

# All checks passed, so allow the commit.
exit 0
[csvn@svn tmp]$

And /opt/csvn/data/repositories/telecontrol/hooks/check-mime-type is:

[csvn@svn tmp]$ cat /opt/csvn/data/repositories/telecontrol/hooks/check-mime-type
#!/usr/bin/env perl

# ====================================================================
# commit-mime-type-check.pl: check that every added file has the
# svn:mime-type property set and every added file with a mime-type
# matching text/* also has svn:eol-style set. If any file fails this
# test the user is sent a verbose error message suggesting solutions and
# the commit is aborted.
#
# Usage: commit-mime-type-check.pl REPOS TXN-NAME
# ====================================================================
# Most of commit-mime-type-check.pl was taken from
# commit-access-control.pl, Revision 9986, 2004-06-14 16:29:22 -0400.
# ====================================================================
# Copyright (c) 2000-2004 CollabNet. All rights reserved.
#
# This software is licensed as described in the file COPYING, which
# you should have received as part of this distribution. The terms
# are also available at http://subversion.tigris.org/license.html.
# If newer versions of this license are posted there, you may use a
# newer version instead, at your option.
#
# This software consists of voluntary contributions made by many
# individuals. For exact contribution history, see the revision
# history and logs, available at http://subversion.tigris.org/.
# ====================================================================

# Turn on warnings the best way depending on the Perl version.
BEGIN {
  if ( $] >= 5.006_000)
    { require warnings; import warnings; }
  else
    { $^W = 1; }
}

use strict;
use Carp;

######################################################################
# Configuration section.

# Svnlook path.
my $svnlook = "/opt/csvn/bin/svnlook";

# Since the path to svnlook depends upon the local installation
# preferences, check that the required program exists to insure that
# the administrator has set up the script properly.
{
  my $ok = 1;
  foreach my $program ($svnlook)
    {
      if (-e $program)
        {
          unless (-x $program)
            {
              warn "$0: required program `$program' is not executable, ",
                   "edit $0.\n";
              $ok = 0;
            }
        }
      else
        {
          warn "$0: required program `$program' does not exist, edit $0.\n";
          $ok = 0;
        }
    }
  exit 1 unless $ok;
}

######################################################################
# Initial setup/command-line handling.

&usage unless @ARGV == 2;

my $repos = shift;
my $txn   = shift;

unless (-e $repos)
  {
    &usage("$0: repository directory `$repos' does not exist.");
  }
unless (-d $repos)
  {
    &usage("$0: repository directory `$repos' is not a directory.");
  }

# Define two constant subroutines to stand for read-only or read-write
# access to the repository.
sub ACCESS_READ_ONLY  () { 'read-only' }
sub ACCESS_READ_WRITE () { 'read-write' }

######################################################################
# Harvest data using svnlook.

# Change into /tmp so that svnlook diff can create its .svnlook
# directory.
my $tmp_dir = '/tmp';
chdir($tmp_dir)
  or die "$0: cannot chdir `$tmp_dir': $!\n";

# Figure out what files have been added using svnlook.
my @files_added;
foreach my $line (&read_from_process($svnlook, 'changed', $repos, '-t', $txn))
  {
    # Add only files that were added to @files_added
    if ($line =~ /^A.  (.*[^\/])$/)
      {
        push(@files_added, $1);
      }
  }

my @errors;
foreach my $path ( @files_added )
  {
    my $mime_type;
    my $eol_style;

    # Parse the complete list of property values of the file $path to extract
    # the mime-type and eol-style
    foreach my $prop (&read_from_process($svnlook, 'proplist', $repos, '-t',
                      $txn, '--verbose', $path))
      {
        if ($prop =~ /^\s*svn:mime-type : (\S+)/)
          {
            $mime_type = $1;
          }
        elsif ($prop =~ /^\s*svn:eol-style : (\S+)/)
          {
            $eol_style = $1;
          }
      }

    # Detect error conditions and add them to @errors
    if (not $mime_type)
      {
        push @errors, "$path : svn:mime-type is not set";
      }
    elsif ($mime_type =~ /^text\// and not $eol_style)
      {
        push @errors, "$path : svn:mime-type=$mime_type but svn:eol-style is not set";
      }
  }

# If there are any errors list the problem files and give information
# on how to avoid the problem. Hopefully people will set up auto-props
# and will not see this verbose message more than once.
if (@errors)
  {
    warn "$0:\n\n",
         join("\n", @errors), "\n\n",
         <&STDOUT")
        or die "$0: cannot dup STDOUT: $!\n";
      exec(@_)
        or die "$0: cannot exec `@_': $!\n";
    }

  my @output;

  while (<SAFE_READ>)
    {
      chomp;
      push(@output, $_);
    }
  close(SAFE_READ);

  my $result = $?;
  my $exit   = $result >> 8;
  my $signal = $result & 127;
  my $cd     = $result & 128 ? "with core dump" : "";
  if ($signal or $cd)
    {
      warn "$0: pipe from `@_' failed $cd: exit=$exit signal=$signal\n";
    }
  if (wantarray)
    {
      return ($result, @output);
    }
  else
    {
      return $result;
    }
}

sub read_from_process
{
  unless (@_)
    {
      croak "$0: read_from_process passed no arguments.\n";
    }
  my ($status, @output) = &safe_read_from_pipe(@_);
  if ($status)
    {
      if (@output)
        {
          die "$0: `@_' failed with this output:\n", join("\n", @output), "\n";
        }
      else
        {
          die "$0: `@_' failed with no output.\n";
        }
    }
  else
    {
      return @output;
    }
}
[csvn@svn tmp]$

> See the issue link for more information and some workarounds (patches,
> but also an additional apache module you could load).
> A fix has just recently been committed but it is for 1.8.
> We cannot backport it to 1.7 because it requires API changes.

I will give it a try when I understand it :-) I hope to find some free
time soon.

> The character ú is a character which has a diacritic so another
> possible explanation is a problem with NFC/NFD normalisation.
> See http://subversion.tigris.org/issues/show_bug.cgi?id=2464
> This usually happens when MacOS X clients are involved. But in theory any
> Windows or Linux client could cause the same problem depending on how
> tools used on the client machine normalise UTF-8.

Ditto, I'll give it a try.

> Can you check if either of these apply?
> If not, we'll need to dig further.

OK, I'll investigate further. Just to summarize, I have a problem and a
no-problem:

Problem: how to use the aforementioned check-mime-type with 'accented' files
checked in from Windows clients.

No-problem: how to check out 'accented' files already in the repository with
a Linux client. "Solved".