httpd-dev mailing list archives

From: dean gaudet <>
Subject: Re: [PATCH 2] Re: [PATCH] performance patch for mod_log_config
Date: Mon, 10 Sep 2001 22:09:24 GMT
On Mon, 10 Sep 2001, Brian Pane wrote:

> dean gaudet wrote:
> > why is there a need for 15 entries?  if it's a multiprocess server then
> > there's only a need for 1 or 2 entries.  if it's a multithreaded server
> > then you need to lock the cache (which you're not doing :)
> The whole point of the design is to do the multithreaded cache
> without locks.

oh i see.  i think you should carefully comment this then, 'cause this
is really subtle... the fact that you copy into the cache from a distinct
apr_exploded_time_t * is what makes it safe.  (i.e. take a look at the
win32 time.c code and notice there are a few ++ operations which would
not be multithread-safe.)

but still, does this really work?  'cause the time used is
r->request_time: if a request takes longer than 14 seconds or so to
generate/serve, then the logger is going to be asking for timestamps
which are dangerously close to your cache rollover.  and since you
can't update a cache entry atomically, it's possible they'll see a
partially updated entry and get bogus results.

> > isn't the real win in eliminating both the divisions required by the
> > explode time functions and the divisions required by the printf?
> That's what I used to think, but the profile data showed that localtime()
> (which does most of the work in apr_explode_localtime()) is an order of
> magnitude more expensive than the apr_snprintf() that formats the resulting
> data.  Caching or otherwise optimizing the printf in the future wouldn't
> hurt, but it's a more marginal win than caching the exploded time struct.
> > also -- you could eliminate the seconds = t / APR_USECS_PER_SEC
> > calculation (death to all divisions!  especially the 64-bit ones)...
> > because this division already happens inside the explode function:
> >
> > when you miss the cache and do a full explode, then subtract the
> > resulting tm_usec off the input time and you've got an apr_time_t
> > which lands exactly on a second boundary.  use that as your cache
> > key, and you can then do ranged comparisons (t >= cache->t &&
> > t < cache->t + APR_USECS_PER_SEC).
> The one problem I can see with this is that removing the division means
> that you have to scan through the table to find out whether you have a
> cache hit or not--which probably negates the performance benefit of
> removing the division.

hehe, try this:

% cat >t.c <<EOF
long long r(long long t)
{
	return t / 1000000LL;
}

int main(int argc, char **argv)
{
	return 0;
}
EOF
% gcc -O2 -g t.c
% gdb a.out
(gdb) disass r
(gdb) disass __divdi3

(__divdi3 is what you should see a call to in r ... assuming you're on
a 32-bit box.)  64-bit division is a really big out of line subroutine :)

that's why strength reductions like i was suggesting are a good idea.

i hadn't looked at your code closely enough to see that you didn't
already scan the cache ... cache scanning isn't SMP safe either...

btw, TIME_CACHE_SIZE should be a power of 2 so that the cache hash can
be strength reduced to a binary-and :)

