apr-dev mailing list archives

From "William A. Rowe Jr." <wr...@rowe-clan.net>
Subject Re: apr_fnmatch deltas
Date Tue, 03 May 2011 19:29:28 GMT
On 5/3/2011 12:12 PM, Jeff Trawick wrote:
> On Tue, May 3, 2011 at 1:20 AM, William A. Rowe Jr. <wrowe@rowe-clan.net> wrote:
>> On 5/2/2011 6:24 PM, Jeff Trawick wrote:
>>> BTW, checked performance against previous version?
>>
>> For 1:1 testing of the patterns that exist in test/testfnmatch.c,
>> 100,000 iterations here on my box, 8626494 usec for the new vs.
>> 3674210 usec for the previous.
> 
> that proportion is in the ballpark of what I got with a different set
> of testcases
> 
> but they were contrived, the before/after builds weren't optimized, etc.

I was at Visual Studio 6, /O2 level, clean stack handling (not optimized).
I'll create a nice test utility timing fnmatch against old apr_fnmatch or
system fnmatch, for people to experiment with, once I'm caught up a bit.
It's amusing that your unoptimized build is nearly identical; I'm looking
forward to measuring against a modern compiler :)
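
A rough sketch of what such a timing utility could look like, for anyone
who wants to experiment in the meantime. This is not the promised tool:
the time_matcher() helper and the match_fn typedef are hypothetical names,
and it assumes a POSIX system that provides fnmatch(3). apr_fnmatch() takes
the same three arguments, but depending on the platform's calling
convention it may need a thin wrapper to fit the function pointer type.
It measures CPU time with clock(), which is coarse, so use plenty of
iterations.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <time.h>

/* Function pointer type matching fnmatch(3): (pattern, string, flags). */
typedef int (*match_fn)(const char *pattern, const char *string, int flags);

/*
 * Hypothetical helper: run every pattern against every string,
 * `iterations` times over, and return the elapsed CPU time in
 * microseconds. The total number of successful matches is stored in
 * *matches_out (if non-NULL) so two matchers can be checked for
 * agreement and so the compiler cannot discard the calls.
 */
long long time_matcher(match_fn fn,
                       const char *const *patterns, size_t npatterns,
                       const char *const *strings, size_t nstrings,
                       int flags, int iterations, long *matches_out)
{
    long matches = 0;
    clock_t start, end;
    size_t p, s;
    int i;

    assert(fn != NULL && iterations > 0);

    start = clock();
    for (i = 0; i < iterations; i++) {
        for (p = 0; p < npatterns; p++) {
            for (s = 0; s < nstrings; s++) {
                /* fnmatch-style matchers return 0 on a match */
                if (fn(patterns[p], strings[s], flags) == 0)
                    matches++;
            }
        }
    }
    end = clock();

    if (matches_out != NULL)
        *matches_out = matches;
    return (long long)(end - start) * 1000000LL / CLOCKS_PER_SEC;
}
```

Calling it once with fnmatch and once with the matcher under test, on the
same pattern and string arrays, gives two usec figures to compare; the two
match counts should agree if the matchers behave the same.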

>> This seems consistent with the retests of null, /, \/ in pattern
>> which are repeated all too frequently; this should be able to be
>> simplified with the use of a couple cautiously placed gotos and
>> some code coverage analysis (necessary to ensure we hit the entire
>> pattern space in our test patterns).  Also the fact that there is
>> so little recursion in the test cases, and so few calls to match
>> very common "*.txt" style patterns, which the new code should beat
>> the old hands down as trailing length is worked out, and we are
>> not redundantly testing failed pattern spaces.
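
To illustrate the trailing-length idea for "*.txt" style patterns, here is
a minimal sketch. It is not the apr_fnmatch code: match_star_suffix() is a
hypothetical helper that only handles a single leading '*' followed by a
literal tail, and it ignores the FNM_PATHNAME and FNM_PERIOD style flags
entirely. Once the length of the tail is known, only one position in the
string can possibly match, so nothing is retried.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not part of APR: match patterns of the form
 * "*<literal>" by comparing only the tail of the string.
 *
 * Returns 1 on a match, 0 on no match, and -1 if the pattern is not of
 * that simple form (the caller would fall back to the general matcher).
 */
int match_star_suffix(const char *pattern, const char *string)
{
    const char *tail;
    size_t tail_len, str_len;

    assert(pattern != NULL && string != NULL);

    if (pattern[0] != '*')
        return -1;
    tail = pattern + 1;

    /* Any further metacharacter or escape means this shortcut is unsafe. */
    if (strpbrk(tail, "*?[\\") != NULL)
        return -1;

    tail_len = strlen(tail);
    str_len = strlen(string);

    /* The string is too short to end with the tail at all. */
    if (tail_len > str_len)
        return 0;

    /* Exactly one candidate position: the last tail_len bytes. */
    return memcmp(string + (str_len - tail_len), tail, tail_len) == 0;
}
```

A matcher that instead tries the tail at every offset after the wildcard
repeats work on each failed position, which is the redundant testing
described above.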

My test code accepts pattern, string and flag arguments... I'm thinking the util
should just consume @patternfile @stringfile n-n range to allow
the user to abuse and clock some real patterns (find / > biglist).

>> Two things would radically affect the speed, though I am unsure
>> offhand what msvc did 10 yrs ago: inline expansion of **pattern
>> to &*pattern (it is an alias reference which should not be double
>> de-referenced), and aliasing all of the setup (slash in fnmatch_ch
>> is slash in fnmatch).  If the compiler is smart, there should be
>> no penalty for inlining the code as a function; if it is not, we
>> may want to consider duplicating [at least some of] the code (it
>> is invoked in two places, one for counting the trailing pattern
>> following a wildcard, one for performing the match).

I detest the idea of duplicating; we open up potential de-synchronization
of the pattern walking logic from the pattern matching logic once we've
duplicated the code :(  I expect we will need to do this based on your
comments above.  We might also be able to collapse one level of looping
(and might not) based on judicious use of goto.  The depth of this logic
is an illustration of why "gotos considered harmful" can sometimes be
a destructive philosophy :)

