apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jeff Trawick <traw...@gmail.com>
Subject Re: apr_fnmatch deltas
Date Tue, 03 May 2011 17:12:15 GMT
On Tue, May 3, 2011 at 1:20 AM, William A. Rowe Jr. <wrowe@rowe-clan.net> wrote:
> On 5/2/2011 6:24 PM, Jeff Trawick wrote:
>> BTW, checked performance against previous version?
> For 1:1 testing of the patterns that exist in test/testfnmatch.c,
> 100,000 iterations here on my box, 8626494 usec for the new vs.
> 3674210 usec for the previous.

that proportion is in the ballpark of what I got with a different set
of testcases

but they were contrived, the before/after builds weren't optimized, etc.

maybe I'll have time to play more after 1.4.3

> This seems consistent with the retests of null, /, \/ in pattern
> which are repeated all to frequently; this should be able to be
> simplified with the use of a couple cautiously placed gotos and
> some code coverage analysis (necessary to ensure we hit the entire
> pattern space in our test patterns).  Also the fact that there is
> so little recursion in the test cases, and so few calls to match
> very common "*.txt" style patterns, which the new code should beat
> the old hands down as trailing length is worked out, and we are
> not redundantly testing failed pattern spaces.
> The code is designed to move fairly painlessly to MBCS encodings.
> This incurs a design penalty in speed, although not nearly as bad
> as it will be once mbr*() calls are used in place of bytewise
> comparisons to advance characters, particularly for case insensitive
> comparison.
> Two things would radically affect the speed, which I am unsure
> offhand what msvc did 10 yrs ago, inline expansion of **pattern
> to &*pattern (it is an alias reference which should not be double
> de-referenced), and aliasing all of the setup (slash in fnmatch_ch
> is slash in fnmatch).  If the compiler is smart, there should be
> no penalty for inlining the code as a function; if it is not, we
> may want to consider duplicating [at least some of] the code (it
> is invoked in two places, one for counting the trailing pattern
> following a wildcard, one for performing the match).
> I'm rather more curious how it compares to real world BSD and Linux
> implementations today using SBCS "C" locale... as committed, as well
> as integrating fnmatch_ch() into the mainline code.  I will pull out
> the test patterns and test them against the BSD port (posted in
> p.a.o/~wrowe/ space) when some free time returns, I have some work-
> work I have to complete first.
> If anyone wants to spend cycles, the best place to start is to make
> a test suite list that corresponds to the real world, and clocks it
> in cpu spin instead of world clock time.  This also needs various
> flavors of range tests and false ranges (for which the old code
> disagreed with the modern fnmatch implementations).
> In any case, this closes a real world flaw, so I'd have no issue
> going with it for now as a correctness release once reviewers are
> satisfied, with more optimization work in the near term for the next
> release.


View raw message