Subject: svn commit: r1731300 - in /subversion/trunk/subversion: include/private/svn_utf_private.h libsvn_repos/dump.c libsvn_subr/utf8proc.c svn/cl-log.h svn/log-cmd.c svn/svn.c tests/cmdline/log_tests.py tests/libsvn_subr/utf-test.c
Date: Fri, 19 Feb 2016 22:11:11 -0000
To: commits@subversion.apache.org
From: kotkov@apache.org

Author: kotkov
Date: Fri Feb 19 22:11:11 2016
New Revision: 1731300

URL: http://svn.apache.org/viewvc?rev=1731300&view=rev
Log:
Make svn log --search case-insensitive.

Use utf8proc to do the normalization and locale-independent case folding
(UTF8PROC_CASEFOLD) for both the search pattern and the input strings.

Related discussion is in http://svn.haxx.se/dev/archive-2013-04/0374.shtml
(Subject: "log --search test failures on trunk and 1.8.x").

* subversion/include/private/svn_utf_private.h
  (svn_utf__normalize): Add new boolean argument to perform case folding.

* subversion/libsvn_subr/utf8proc.c
  (normalize_cstring): Add new boolean argument to perform case folding.
   In case it is non-zero, set the flags to UTF8PROC_CASEFOLD when doing
   the Unicode decomposition.
  (svn_utf__normalize): Pass new argument to normalize_cstring().
  (svn_utf__is_normalized): Adjust call to normalize_cstring().

* subversion/libsvn_repos/dump.c
  (extract_mergeinfo_paths, verify_mergeinfo_normalization,
   check_name_collision): Update callers of svn_utf__normalize().

* subversion/svn/cl-log.h
  (): Include svn_string_private for svn_membuf_t.
  (svn_cl__log_receiver_baton): Add an svn_membuf_t for the case folding
   and normalization purposes.

* subversion/svn/log-cmd.c
  (): Include svn_utf_private.h.
  (match): New helper that normalizes, folds case of the input, and
   matches it against the specified pattern.
  (match_search_pattern): Now accepts an svn_membuf_t. Call the new helper
   function to perform the pattern matching.
  (svn_cl__log_entry_receiver,
   svn_cl__log_entry_receiver_xml): Pass the svn_membuf_t from the baton
   when calling match_search_pattern().
  (svn_cl__log): Initialize the svn_membuf_t in the log receiver baton.

* subversion/svn/svn.c
  (): Include svn_utf_private.h.
  (sub_main): Normalize and fold case of --search and --search-and
   arguments.

* subversion/tests/cmdline/log_tests.py
  (log_search): Adjust expectations, since --search is now
   case-insensitive.

* subversion/tests/libsvn_subr/utf-test.c
  (test_utf_normalize): New test for svn_utf__normalize().
  (test_funcs): Add new test.

Modified:
    subversion/trunk/subversion/include/private/svn_utf_private.h
    subversion/trunk/subversion/libsvn_repos/dump.c
    subversion/trunk/subversion/libsvn_subr/utf8proc.c
    subversion/trunk/subversion/svn/cl-log.h
    subversion/trunk/subversion/svn/log-cmd.c
    subversion/trunk/subversion/svn/svn.c
    subversion/trunk/subversion/tests/cmdline/log_tests.py
    subversion/trunk/subversion/tests/libsvn_subr/utf-test.c

Modified: subversion/trunk/subversion/include/private/svn_utf_private.h
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/include/private/svn_utf_private.h?rev=1731300&r1=1731299&r2=1731300&view=diff
==============================================================================
--- subversion/trunk/subversion/include/private/svn_utf_private.h (original)
+++ subversion/trunk/subversion/include/private/svn_utf_private.h Fri Feb 19 22:11:11 2016
@@ -139,6 +139,9 @@ svn_utf__normcmp(int *result,
  * null-terminated; otherwise, consider the string only up to the
  * given length.
  *
+ * If CASEFOLD is non-zero, perform Unicode case folding, e.g., for
+ * case-insensitive string comparison.
+ *
  * Return the normalized string in *RESULT, which shares storage with
  * BUF and is valid only until the next time BUF is modified.
  *
@@ -148,6 +151,7 @@ svn_utf__normcmp(int *result,
 svn_error_t*
 svn_utf__normalize(const char **result,
                    const char *str, apr_size_t len,
+                   svn_boolean_t casefold,
                    svn_membuf_t *buf);

 /* Check if STRING is a valid, NFC-normalized UTF-8 string. Note that

Modified: subversion/trunk/subversion/libsvn_repos/dump.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_repos/dump.c?rev=1731300&r1=1731299&r2=1731300&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_repos/dump.c (original)
+++ subversion/trunk/subversion/libsvn_repos/dump.c Fri Feb 19 22:11:11 2016
@@ -897,7 +897,7 @@ extract_mergeinfo_paths(void *baton, con
   if (xb->normalize)
     {
       const char *normkey;
-      SVN_ERR(svn_utf__normalize(&normkey, key, klen, &xb->buffer));
+      SVN_ERR(svn_utf__normalize(&normkey, key, klen, FALSE, &xb->buffer));
       svn_hash_sets(xb->result,
                     apr_pstrdup(xb->buffer.pool, normkey),
                     normalized_unique);
@@ -951,7 +951,7 @@ verify_mergeinfo_normalization(void *bat
       const char *normpath;
       const char *found;

-      SVN_ERR(svn_utf__normalize(&normpath, path, klen, &vb->buffer));
+      SVN_ERR(svn_utf__normalize(&normpath, path, klen, FALSE, &vb->buffer));
       found = svn_hash_gets(vb->normalized_paths, normpath);
       if (!found)
           svn_hash_sets(vb->normalized_paths,
@@ -2233,7 +2233,7 @@ check_name_collision(void *baton, const
   const char *name;
   const char *found;

-  SVN_ERR(svn_utf__normalize(&name, key, klen, &cb->buffer));
+  SVN_ERR(svn_utf__normalize(&name, key, klen, FALSE, &cb->buffer));

   found = svn_hash_gets(cb->normalized, name);
   if (!found)
@@ -2252,7 +2252,7 @@ check_name_collision(void *baton, const
       SVN_ERR(svn_utf__normalize(
                   &normpath,
                   svn_relpath_join(db->path, name, iterpool),
-                  SVN_UTF__UNKNOWN_LENGTH, &cb->buffer));
+                  SVN_UTF__UNKNOWN_LENGTH, FALSE, &cb->buffer));
       notify_warning(iterpool, eb->notify_func, eb->notify_baton,
                      svn_repos_notify_warning_name_collision,
                      _("Duplicate representation of path '%s'"), normpath);

Modified: subversion/trunk/subversion/libsvn_subr/utf8proc.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_subr/utf8proc.c?rev=1731300&r1=1731299&r2=1731300&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_subr/utf8proc.c (original)
+++ subversion/trunk/subversion/libsvn_subr/utf8proc.c Fri Feb 19 22:11:11 2016
@@ -126,15 +126,20 @@ decompose_normalized(apr_size_t *result_
  * STRING. Upon return, BUFFER->data points at a NUL-terminated string
  * of UTF-8 characters.
  *
+ * If CASEFOLD is non-zero, perform Unicode case folding, e.g., for
+ * case-insensitive string comparison.
+ *
  * A returned error may indicate that STRING contains invalid UTF-8 or
  * invalid Unicode codepoints. Any error message comes from utf8proc.
  */
 static svn_error_t *
 normalize_cstring(apr_size_t *result_length,
                   const char *string, apr_size_t length,
+                  svn_boolean_t casefold,
                   svn_membuf_t *buffer)
 {
-  ssize_t result = unicode_decomposition(0, string, length, buffer);
+  ssize_t result = unicode_decomposition(casefold ? UTF8PROC_CASEFOLD : 0,
+                                         string, length, buffer);
   if (result >= 0)
     {
       svn_membuf__resize(buffer, result * sizeof(apr_int32_t) + 1);
@@ -199,10 +204,11 @@ svn_utf__normcmp(int *result,
 svn_error_t*
 svn_utf__normalize(const char **result,
                    const char *str, apr_size_t len,
+                   svn_boolean_t casefold,
                    svn_membuf_t *buf)
 {
   apr_size_t result_length;
-  SVN_ERR(normalize_cstring(&result_length, str, len, buf));
+  SVN_ERR(normalize_cstring(&result_length, str, len, casefold, buf));
   *result = (const char*)(buf->data);
   return SVN_NO_ERROR;
 }
@@ -359,7 +365,7 @@ svn_utf__is_normalized(const char *strin
   apr_size_t result_length;
   const apr_size_t length = strlen(string);
   svn_membuf__create(&buffer, length * sizeof(apr_int32_t), scratch_pool);
-  err = normalize_cstring(&result_length, string, length, &buffer);
+  err = normalize_cstring(&result_length, string, length, FALSE, &buffer);
   if (err)
     {
       svn_error_clear(err);

Modified: subversion/trunk/subversion/svn/cl-log.h
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/cl-log.h?rev=1731300&r1=1731299&r2=1731300&view=diff
==============================================================================
--- subversion/trunk/subversion/svn/cl-log.h (original)
+++ subversion/trunk/subversion/svn/cl-log.h Fri Feb 19 22:11:11 2016
@@ -31,6 +31,8 @@

 #include "svn_types.h"

+#include "private/svn_string_private.h"
+
 #ifdef __cplusplus
 extern "C" {
 #endif /* __cplusplus */
@@ -70,6 +72,9 @@ typedef struct svn_cl__log_receiver_bato
    * the log message, or a changed path matches one of these patterns. */
   apr_array_header_t *search_patterns;

+  /* Buffer for Unicode normalization and case folding. */
+  svn_membuf_t buffer;
+
   /* Pool for persistent allocations. */
   apr_pool_t *pool;
 } svn_cl__log_receiver_baton;

Modified: subversion/trunk/subversion/svn/log-cmd.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/log-cmd.c?rev=1731300&r1=1731299&r2=1731300&view=diff
==============================================================================
--- subversion/trunk/subversion/svn/log-cmd.c (original)
+++ subversion/trunk/subversion/svn/log-cmd.c Fri Feb 19 22:11:11 2016
@@ -38,6 +38,7 @@

 #include "private/svn_cmdline_private.h"
 #include "private/svn_sorts_private.h"
+#include "private/svn_utf_private.h"
 #include "cl.h"
 #include "cl-log.h"

@@ -110,6 +111,24 @@ display_diff(const svn_log_entry_t *log_
   return SVN_NO_ERROR;
 }

+/* Return TRUE if STR matches PATTERN. Else, return FALSE. Assumes that
+ * PATTERN is a UTF-8 string normalized to form C with case folding
+ * applied. Use BUF for temporary allocations. */
+static svn_boolean_t
+match(const char *pattern, const char *str, svn_membuf_t *buf)
+{
+  svn_error_t *err;
+
+  err = svn_utf__normalize(&str, str, strlen(str), TRUE, buf);
+  if (err)
+    {
+      /* Can't match invalid data. */
+      svn_error_clear(err);
+      return FALSE;
+    }
+
+  return apr_fnmatch(pattern, str, 0) == APR_SUCCESS;
+}

 /* Return TRUE if SEARCH_PATTERN matches the AUTHOR, DATE, LOG_MESSAGE,
  * or a path in the set of keys of the CHANGED_PATHS hash. Else, return FALSE.
@@ -120,22 +139,22 @@ match_search_pattern(const char *search_
                      const char *date,
                      const char *log_message,
                      apr_hash_t *changed_paths,
+                     svn_membuf_t *buf,
                      apr_pool_t *pool)
 {
   /* Match any substring containing the pattern, like UNIX 'grep' does. */
   const char *pattern = apr_psprintf(pool, "*%s*", search_pattern);
-  int flags = 0;

   /* Does the author match the search pattern? */
-  if (author && apr_fnmatch(pattern, author, flags) == APR_SUCCESS)
+  if (author && match(pattern, author, buf))
     return TRUE;

   /* Does the date the search pattern? */
-  if (date && apr_fnmatch(pattern, date, flags) == APR_SUCCESS)
+  if (date && match(pattern, date, buf))
     return TRUE;

   /* Does the log message the search pattern? */
-  if (log_message && apr_fnmatch(pattern, log_message, flags) == APR_SUCCESS)
+  if (log_message && match(pattern, log_message, buf))
     return TRUE;

   if (changed_paths)
@@ -150,15 +169,14 @@ match_search_pattern(const char *search_
           const char *path = apr_hash_this_key(hi);
           svn_log_changed_path2_t *log_item;

-          if (apr_fnmatch(pattern, path, flags) == APR_SUCCESS)
+          if (match(pattern, path, buf))
             return TRUE;

           /* Match copy-from paths, too. */
           log_item = apr_hash_this_val(hi);
           if (log_item->copyfrom_path
               && SVN_IS_VALID_REVNUM(log_item->copyfrom_rev)
-              && apr_fnmatch(pattern,
-                             log_item->copyfrom_path, flags) == APR_SUCCESS)
+              && match(pattern, log_item->copyfrom_path, buf))
             return TRUE;
         }
     }
@@ -168,13 +186,14 @@ match_search_pattern(const char *search_

 /* Match all search patterns in SEARCH_PATTERNS against AUTHOR, DATE, MESSAGE,
  * and CHANGED_PATHS. Return TRUE if any pattern matches, else FALSE.
- * SCRACH_POOL is used for temporary allocations. */
+ * BUF and SCRATCH_POOL are used for temporary allocations. */
 static svn_boolean_t
 match_search_patterns(apr_array_header_t *search_patterns,
                       const char *author,
                       const char *date,
                       const char *message,
                       apr_hash_t *changed_paths,
+                      svn_membuf_t *buf,
                       apr_pool_t *scratch_pool)
 {
   int i;
@@ -197,7 +216,7 @@ match_search_patterns(apr_array_header_t

           pattern = APR_ARRAY_IDX(pattern_group, j, const char *);
           match = match_search_pattern(pattern, author, date, message,
-                                       changed_paths, iterpool);
+                                       changed_paths, buf, iterpool);
           if (!match)
             break;
         }
@@ -331,7 +350,7 @@ svn_cl__log_entry_receiver(void *baton,

   if (lb->search_patterns &&
       ! match_search_patterns(lb->search_patterns, author, date, message,
-                              log_entry->changed_paths2, pool))
+                              log_entry->changed_paths2, &lb->buffer, pool))
     {
       if (log_entry->has_children)
         {
@@ -535,7 +554,7 @@ svn_cl__log_entry_receiver_xml(void *bat
   /* Match search pattern before XML-escaping. */
   if (lb->search_patterns &&
       ! match_search_patterns(lb->search_patterns, author, date, message,
-                              log_entry->changed_paths2, pool))
+                              log_entry->changed_paths2, &lb->buffer, pool))
     {
       if (log_entry->has_children)
         {
@@ -795,6 +814,7 @@ svn_cl__log(apr_getopt_t *os,
   lb.diff_extensions = opt_state->extensions;
   lb.merge_stack = NULL;
   lb.search_patterns = opt_state->search_patterns;
+  svn_membuf__create(&lb.buffer, 0, pool);
   lb.pool = pool;

   if (opt_state->xml)

Modified: subversion/trunk/subversion/svn/svn.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/svn.c?rev=1731300&r1=1731299&r2=1731300&view=diff
==============================================================================
--- subversion/trunk/subversion/svn/svn.c (original)
+++ subversion/trunk/subversion/svn/svn.c Fri Feb 19 22:11:11 2016
@@ -56,6 +56,7 @@
 #include "private/svn_opt_private.h"
 #include "private/svn_cmdline_private.h"
 #include "private/svn_subr_private.h"
+#include "private/svn_utf_private.h"

 #include "svn_private_config.h"

@@ -1868,6 +1869,7 @@ sub_main(int *exit_code, int argc, const
   svn_boolean_t reading_file_from_stdin = FALSE;
   apr_hash_t *changelists;
   apr_hash_t *cfg_hash;
+  svn_membuf_t buf;

   received_opts = apr_array_make(pool, SVN_OPT_MAX_OPTIONS, sizeof(int));

@@ -1888,6 +1890,9 @@ sub_main(int *exit_code, int argc, const
   /* Init our changelists hash. */
   changelists = apr_hash_make(pool);

+  /* Init the temporary buffer. */
+  svn_membuf__create(&buf, 0, pool);
+
   /* Begin processing arguments. */
   opt_state.start_revision.kind = svn_opt_revision_unspecified;
   opt_state.end_revision.kind = svn_opt_revision_unspecified;
@@ -2392,11 +2397,19 @@ sub_main(int *exit_code, int argc, const
         break;
       case opt_search:
         SVN_ERR(svn_utf_cstring_to_utf8(&utf8_opt_arg, opt_arg, pool));
-        add_search_pattern_group(&opt_state, utf8_opt_arg, pool);
+        SVN_ERR(svn_utf__normalize(&utf8_opt_arg, utf8_opt_arg,
+                                   strlen(utf8_opt_arg), TRUE, &buf));
+        add_search_pattern_group(&opt_state,
+                                 apr_pstrdup(pool, utf8_opt_arg),
+                                 pool);
         break;
       case opt_search_and:
         SVN_ERR(svn_utf_cstring_to_utf8(&utf8_opt_arg, opt_arg, pool));
-        add_search_pattern_to_latest_group(&opt_state, utf8_opt_arg, pool);
+        SVN_ERR(svn_utf__normalize(&utf8_opt_arg, utf8_opt_arg,
+                                   strlen(utf8_opt_arg), TRUE, &buf));
+        add_search_pattern_to_latest_group(&opt_state,
+                                           apr_pstrdup(pool, utf8_opt_arg),
+                                           pool);
       case opt_remove_unversioned:
         opt_state.remove_unversioned = TRUE;
         break;

Modified: subversion/trunk/subversion/tests/cmdline/log_tests.py
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/tests/cmdline/log_tests.py?rev=1731300&r1=1731299&r2=1731300&view=diff
==============================================================================
--- subversion/trunk/subversion/tests/cmdline/log_tests.py (original)
+++ subversion/trunk/subversion/tests/cmdline/log_tests.py Fri Feb 19 22:11:11 2016
@@ -2288,13 +2288,13 @@ def log_search(sbox):
   log_chain = parse_log_output(output)
   check_log_chain(log_chain, [7, 6, 3])

-  # search is case-sensitive
+  # search is case-insensitive
   exit_code, output, err = svntest.actions.run_and_verify_svn(
                              None, [], 'log', '--search',
                              'FOR REVISION [367]')

   log_chain = parse_log_output(output)
-  check_log_chain(log_chain, [])
+  check_log_chain(log_chain, [7, 6, 3])

   # multi-pattern search
   exit_code, output, err = svntest.actions.run_and_verify_svn(

Modified: subversion/trunk/subversion/tests/libsvn_subr/utf-test.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/tests/libsvn_subr/utf-test.c?rev=1731300&r1=1731299&r2=1731300&view=diff
==============================================================================
--- subversion/trunk/subversion/tests/libsvn_subr/utf-test.c (original)
+++ subversion/trunk/subversion/tests/libsvn_subr/utf-test.c Fri Feb 19 22:11:11 2016
@@ -823,6 +823,97 @@ test_utf_conversions(apr_pool_t *pool)
 }


+static svn_error_t *
+test_utf_normalize(apr_pool_t *pool)
+{
+  /* Normalized: NFC */
+  static const char nfc[] =
+    "\xe1\xb9\xa8"              /* S with dot above and below */
+    "\xc5\xaf"                  /* u with ring */
+    "\xe1\xb8\x87"              /* b with macron below */
+    "\xe1\xb9\xbd"              /* v with tilde */
+    "\xe1\xb8\x9d"              /* e with breve and cedilla */
+    "\xc8\x91"                  /* r with double grave */
+    "\xc5\xa1"                  /* s with caron */
+    "\xe1\xb8\xaf"              /* i with diaeresis and acute */
+    "\xe1\xbb\x9d"              /* o with grave and hook */
+    "\xe1\xb9\x8b";             /* n with circumflex below */
+
+  /* Normalized: NFC, case folded */
+  static const char nfc_casefold[] =
+    "\xe1\xb9\xa9"              /* s with dot above and below */
+    "\xc5\xaf"                  /* u with ring */
+    "\xe1\xb8\x87"              /* b with macron below */
+    "\xe1\xb9\xbd"              /* v with tilde */
+    "\xe1\xb8\x9d"              /* e with breve and cedilla */
+    "\xc8\x91"                  /* r with double grave */
+    "\xc5\xa1"                  /* s with caron */
+    "\xe1\xb8\xaf"              /* i with diaeresis and acute */
+    "\xe1\xbb\x9d"              /* o with grave and hook */
+    "\xe1\xb9\x8b";             /* n with circumflex below */
+
+  /* Normalized: NFD */
+  static const char nfd[] =
+    "S\xcc\xa3\xcc\x87"         /* S with dot above and below */
+    "u\xcc\x8a"                 /* u with ring */
+    "b\xcc\xb1"                 /* b with macron below */
+    "v\xcc\x83"                 /* v with tilde */
+    "e\xcc\xa7\xcc\x86"         /* e with breve and cedilla */
+    "r\xcc\x8f"                 /* r with double grave */
+    "s\xcc\x8c"                 /* s with caron */
+    "i\xcc\x88\xcc\x81"         /* i with diaeresis and acute */
+    "o\xcc\x9b\xcc\x80"         /* o with grave and hook */
+    "n\xcc\xad";                /* n with circumflex below */
+
+  /* Mixed, denormalized */
+  static const char mixup[] =
+    "S\xcc\x87\xcc\xa3"         /* S with dot above and below */
+    "\xc5\xaf"                  /* u with ring */
+    "b\xcc\xb1"                 /* b with macron below */
+    "\xe1\xb9\xbd"              /* v with tilde */
+    "e\xcc\xa7\xcc\x86"         /* e with breve and cedilla */
+    "\xc8\x91"                  /* r with double grave */
+    "s\xcc\x8c"                 /* s with caron */
+    "\xe1\xb8\xaf"              /* i with diaeresis and acute */
+    "o\xcc\x80\xcc\x9b"         /* o with grave and hook */
+    "\xe1\xb9\x8b";             /* n with circumflex below */
+
+  /* Invalid UTF-8 */
+  static const char invalid[] =
+    "\xe1\xb9\xa8"              /* S with dot above and below */
+    "\xc5\xaf"                  /* u with ring */
+    "\xe1\xb8\x87"              /* b with macron below */
+    "\xe1\xb9\xbd"              /* v with tilde */
+    "\xe1\xb8\x9d"              /* e with breve and cedilla */
+    "\xc8\x91"                  /* r with double grave */
+    "\xc5\xa1"                  /* s with caron */
+    "\xe1\xb8\xaf"              /* i with diaeresis and acute */
+    "\xe6"                      /* Invalid byte */
+    "\xe1\xb9\x8b";             /* n with circumflex below */
+
+  const char *result;
+  svn_membuf_t buf;
+
+  svn_membuf__create(&buf, 0, pool);
+  SVN_ERR(svn_utf__normalize(&result, nfd, strlen(nfd), FALSE, &buf));
+  SVN_TEST_STRING_ASSERT(result, nfc);
+  SVN_ERR(svn_utf__normalize(&result, nfd, strlen(nfd), TRUE, &buf));
+  SVN_TEST_STRING_ASSERT(result, nfc_casefold);
+  SVN_ERR(svn_utf__normalize(&result, mixup, strlen(mixup), FALSE, &buf));
+  SVN_TEST_STRING_ASSERT(result, nfc);
+  SVN_ERR(svn_utf__normalize(&result, mixup, strlen(mixup), TRUE, &buf));
+  SVN_TEST_STRING_ASSERT(result, nfc_casefold);
+
+  SVN_TEST_ASSERT_ERROR(svn_utf__normalize(&result, invalid, strlen(invalid),
+                                           FALSE, &buf),
+                        SVN_ERR_UTF8PROC_ERROR);
+  SVN_TEST_ASSERT_ERROR(svn_utf__normalize(&result, invalid, strlen(invalid),
+                                           TRUE, &buf),
+                        SVN_ERR_UTF8PROC_ERROR);
+
+  return SVN_NO_ERROR;
+}
+

 /* The test table. */

@@ -849,6 +940,8 @@ static struct svn_test_descriptor_t test
                    "test svn_utf__is_normalized"),
     SVN_TEST_PASS2(test_utf_conversions,
                    "test svn_utf__utf{16,32}_to_utf8"),
+    SVN_TEST_PASS2(test_utf_normalize,
+                   "test svn_utf__normalize"),
     SVN_TEST_NULL
   };
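
For readers without a Subversion build at hand, the matching scheme introduced by this change can be approximated in a few lines of Python. This is only a sketch under stated assumptions: Python's `unicodedata` and `str.casefold()` stand in for utf8proc, `fnmatch.fnmatchcase()` stands in for `apr_fnmatch()`, and the helper names `fold` and `matches` are hypothetical, not part of Subversion.

```python
import fnmatch
import unicodedata


def fold(text):
    # Hypothetical stand-in for svn_utf__normalize() with CASEFOLD set:
    # locale-independent Unicode case folding, then normalization to NFC.
    return unicodedata.normalize("NFC", text.casefold())


def matches(search_pattern, text):
    # As in the commit, the pattern is folded up front and wrapped in '*'
    # so it matches any substring; the candidate string is folded before
    # the comparison, which itself stays case-sensitive.
    pattern = "*" + fold(search_pattern) + "*"
    return fnmatch.fnmatchcase(fold(text), pattern)


# Mirrors the adjusted expectation in log_tests.py: an upper-case pattern
# now finds lower-case log messages.
print(matches("FOR REVISION [367]", "Log message for revision 3"))
print(matches("FOR REVISION [367]", "Log message for revision 5"))
```

Folding both sides and keeping the glob match case-sensitive avoids any dependence on the C locale, which appears to be the point of routing both the pattern and the input through the same normalization function.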