httpd-dev mailing list archives

From Christophe JAILLET
Subject Re: apr_token_* conclusions (was: Better casecmpstr[n]?)
Date Wed, 25 Nov 2015 22:07:43 GMT
On 25/11/2015 22:02, Jim Jagielski wrote:
> In general, strcmp() is not implemented via strcmp.c
> (although if you do a source code search for strcmp, that's
> what you'll get). Most of the time it's implemented in
> assembly (strcmp.s) or simply leverages memcmp() where
> you aren't doing a byte by byte comparison but are doing
> a native memory word (32 or 64bit) comparison. This
> makes them super fast.
> Once we need to worry about case insensitivity, then
> we see a whole gamut of implementations; some use
> a mapped array as I did; some go char by char and call
> tolower() on each one; some do other things such as
> testing if isupper() before calling tolower() if needed.
> The word-based optimizations seem less viable, as seen
> in test results that I ran and Yann also verified (afaict).
> In my tests, my impl was faster on OSX and CentOS5 and 6.
> It's a very common function we use and with my test results
> it seemed to make sense to provide our own impl, esp if
> we decided that what we were really concerned about was
> comparing for equality, and so would be able to avoid
> the !strcasecmp logic leaping.
> If we decide that all this was moot, that's fine.
> That's what these types of investigations and discussions
> are for.

Personally, my testing shows that which version is faster is not that
self-evident. On my machine, it depends on the length of the string.
With shorter strings (less than ~10 chars), Yann's proposal seems to be
the best with the test program. What happens if the const char table is
not in L1 cache? Do we still get the same speedup?
When strings are longer, the standard strncasecmp always wins.
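
For readers who have not followed the thread, here is a minimal sketch of the table-based ("mapped array") approach being compared against strncasecmp. It is not the code posted by Jim or Yann: the names sketch_casecmpn, fold_table and ascii_lower are hypothetical, and the table is built lazily at run time (not thread-safe) to keep the example short, where a real implementation would more likely use a static const table.

```c
#include <stddef.h>

/* Hypothetical helper: ASCII-only, locale-independent lowercase fold. */
static unsigned char ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* 256-entry lookup table: one memory read per byte compared.
 * Assumption: filled on first use; this lazy init is NOT thread-safe. */
static unsigned char fold_table[256];
static int fold_table_ready = 0;

static void fold_table_init(void)
{
    int i;
    for (i = 0; i < 256; i++)
        fold_table[i] = ascii_lower((unsigned char)i);
    fold_table_ready = 1;
}

/* Hypothetical name. Compares at most n bytes, ignoring ASCII case.
 * Returns <0, 0 or >0, like strncasecmp. */
int sketch_casecmpn(const char *s1, const char *s2, size_t n)
{
    const unsigned char *a = (const unsigned char *)s1;
    const unsigned char *b = (const unsigned char *)s2;

    if (!fold_table_ready)
        fold_table_init();

    while (n-- > 0) {
        int d = (int)fold_table[*a] - (int)fold_table[*b];
        if (d != 0)
            return d;          /* first differing byte decides */
        if (*a == '\0')
            return 0;          /* both strings ended together */
        a++;
        b++;
    }
    return 0;                  /* first n bytes are equal */
}
```

Each iteration costs two table loads, which is why the L1 cache question matters: with a cold table, a short comparison may spend most of its time on cache misses.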

Short strings are our use case, so I would say: why not use this
implementation, after all?

My personal reservations would be:
    - it adds complexity to the code (one more function that looks
really similar to existing ones)
    - the speed increase is 'only' 15%, if I remember correctly the
latest numbers given by Yann
    - the speed increase is potentially platform/compiler/C library
dependent
    - it does not remove (IMO) the need for a 'switch' to get to the
right test even faster
    - many of the tests against ASCII strings are hidden in apr
functions (apr_table_get...)
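
To illustrate the 'switch' point, here is a minimal sketch of one way to read it: dispatch on the first character so that only a few candidates need a full case-insensitive comparison. The enum, the function name sketch_classify_header and the choice of header names are hypothetical, and this is not taken from the httpd or apr sources.

```c
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical result type for the example. */
enum sketch_header {
    HDR_UNKNOWN = 0,
    HDR_CONTENT_LENGTH,
    HDR_CONTENT_TYPE,
    HDR_HOST
};

/* Hypothetical name. Looks at the first byte to skip comparisons
 * that cannot match, then falls back to strcasecmp. */
enum sketch_header sketch_classify_header(const char *name)
{
    switch (name[0]) {
    case 'c':
    case 'C':
        if (strcasecmp(name, "Content-Length") == 0)
            return HDR_CONTENT_LENGTH;
        if (strcasecmp(name, "Content-Type") == 0)
            return HDR_CONTENT_TYPE;
        break;
    case 'h':
    case 'H':
        if (strcasecmp(name, "Host") == 0)
            return HDR_HOST;
        break;
    default:
        break;   /* includes the empty string, where name[0] is '\0' */
    }
    return HDR_UNKNOWN;
}
```

With this shape, a faster comparison function only speeds up the few calls left inside each case; the dispatch itself is untouched.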
Do we have an idea of the overall time spent in these str[n]casecmp
functions when processing a request?  15% of that time should be, IMO,
quite low.
Is it worth the added complexity? For me, the answer is: not sure.

