subversion-dev mailing list archives

From "Bert Huijben" <b...@qqmail.nl>
Subject RE: svn commit: r1731300 - in /subversion/trunk/subversion: include/private/svn_utf_private.h libsvn_repos/dump.c libsvn_subr/utf8proc.c svn/cl-log.h svn/log-cmd.c svn/svn.c tests/cmdline/log_tests.py tests/libsvn_subr/utf-test.c
Date Sat, 20 Feb 2016 11:09:02 GMT


> -----Original Message-----
> From: kotkov@apache.org [mailto:kotkov@apache.org]
> Sent: vrijdag 19 februari 2016 23:11
> To: commits@subversion.apache.org
> Subject: svn commit: r1731300 - in /subversion/trunk/subversion:
> include/private/svn_utf_private.h libsvn_repos/dump.c
> libsvn_subr/utf8proc.c svn/cl-log.h svn/log-cmd.c svn/svn.c
> tests/cmdline/log_tests.py tests/libsvn_subr/utf-test.c
> 
> Author: kotkov
> Date: Fri Feb 19 22:11:11 2016
> New Revision: 1731300
> 
> URL: http://svn.apache.org/viewvc?rev=1731300&view=rev
> Log:
> Make svn log --search case-insensitive.
> 
> Use utf8proc to do the normalization and locale-independent case folding
> (UTF8PROC_CASEFOLD) for both the search pattern and the input strings.
> 
> Related discussion is in http://svn.haxx.se/dev/archive-2013-04/0374.shtml
> (Subject: "log --search test failures on trunk and 1.8.x").
> 
> * subversion/include/private/svn_utf_private.h
>   (svn_utf__normalize): Add new boolean argument to perform case folding.

Usually it is far more efficient to perform the comparison on the unnormalized strings using
the APIs than to normalize first and perform the operation afterwards. I'm not sure whether
utf8proc supports this, though.
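
For reference, the normalize-then-fold approach described in the commit log can be sketched
roughly as follows. This is an illustration only: it uses Python's unicodedata module and
str.casefold() as stand-ins for utf8proc, the function names are hypothetical and not part of
Subversion, and it does a plain substring match where the real 'svn log --search' matches
glob patterns.

```python
import unicodedata


def fold_for_search(text: str) -> str:
    """Hypothetical helper: normalize and case-fold a string for matching.

    Follows the Unicode "canonical caseless matching" recipe,
    NFD(toCasefold(NFD(X))). This is not the svn_utf__normalize API.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFD", decomposed.casefold())


def log_message_matches(pattern: str, message: str) -> bool:
    """Hypothetical helper: case-insensitive substring search.

    Both the pattern and the message go through the same folding,
    which is the idea the commit log describes.
    """
    return fold_for_search(pattern) in fold_for_search(message)


# A precomposed upper-case pattern against a decomposed lower-case message.
found = log_message_matches("CAF\u00c9", "Fix typo in cafe\u0301 menu")
```

Every message is rewritten before it is compared here, which is the extra work the remark
above is about.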

But I'm wondering why you added this feature to an existing function?

I don't think it is recommended practice to perform the normalization this way, and adding
a boolean to an existing function makes it easier to do things in a way that is not
recommended.



Locale-independent case folding is not that well defined... Take the Turkish 'i', which
doesn't fold the usual way, so any decision on that makes it locale dependent. (In this case
probably by choosing not Turkish, but that doesn't make it 'locale independent'.)

Just folding the western European characters is much easier to explain/document.
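
To make the Turkish 'i' point concrete, here is a small sketch using Python's str.casefold(),
which applies the default Unicode case folding. It stands in for UTF8PROC_CASEFOLD purely for
illustration; utf8proc's exact output for these characters is not checked here.

```python
# Turkish has four distinct letters: I / ı (dotless) and İ / i (dotted).
DOTTED_CAPITAL_I = "\u0130"   # İ
DOTLESS_SMALL_I = "\u0131"    # ı

# Default folding pairs 'I' with 'i', which is the non-Turkish rule.
ascii_pair_matches = "I".casefold() == "i"

# The dotted capital folds to 'i' plus a combining dot above, not to plain 'i'.
dotted_folds_to_plain_i = DOTTED_CAPITAL_I.casefold() == "i"

# The dotless small letter is left unchanged, so it no longer matches its
# own upper-case form 'I' once both sides are folded.
dotless_matches_own_upper = (
    DOTLESS_SMALL_I.casefold() == DOTLESS_SMALL_I.upper().casefold()
)
```

Under these default rules a Turkish reader searching for 'ı' will not find 'I', so the
folding is only "independent" in the sense that one convention was picked for everyone.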

	Bert 

