apr-dev mailing list archives

From "Bert Huijben" <b...@qqmail.nl>
Subject [Patch] Reader Writer lock performance on Windows
Date Mon, 30 Jun 2014 17:34:02 GMT
	Hi,

I was profiling a Subversion operation which actively used a recently added
memory cache that uses APR rwlocks. Somehow these locks alone used more than
2.5% of the total processing time (which is mostly I/O bound).
(For future reference: 'svn log file:///<RUBY>/trunk/ChangeLog' against a
packed local format 6 fsfs repository)

This made me look at the current implementation of rwlocks on Windows: a
mutex, combined with an event. Both are quite heavy synchronization
primitives.

Since Windows Vista, Microsoft provides a 'Slim Reader Writer lock'
implementation, which could just be used by apr instead of this old
implementation on all common Windows platforms. 
See
http://msdn.microsoft.com/en-us/library/windows/desktop/aa904937(v=vs.85).aspx

I wrote an initial implementation which might need some further cleanup (see
patch). The results of testlockperf.exe with the new code are quite
spectacular on my test VM: from roughly 10 times faster with a single thread
to more than 100 times faster with 6 threads.

1 thread: 516999 usec vs 40000 usec (> 10*)
2 threads: 8932818 usec vs 78998 usec
3 threads: 16307486 usec vs 121002 usec
4 threads: 22326492 usec vs 159000 usec
5 threads: 27488411 usec vs 196000 usec
6 threads: 33191969 usec vs 237000 usec (> 140*)
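
As a quick check of the ratios quoted above, the following small sketch
recomputes the speedup for each row (old time divided by new time). The
struct and function names are made up for this illustration and are not
part of testlockperf or the patch.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the timings quoted in this message (microseconds). */
typedef struct {
    int threads;
    long old_usec;   /* mutex + event implementation */
    long new_usec;   /* slim reader writer lock implementation */
} lockperf_row_t;

static const lockperf_row_t rows[] = {
    {1,   516999,  40000},
    {2,  8932818,  78998},
    {3, 16307486, 121002},
    {4, 22326492, 159000},
    {5, 27488411, 196000},
    {6, 33191969, 237000},
};

/* Speedup factor: how many times faster the new code ran. */
double speedup(long old_usec, long new_usec)
{
    assert(new_usec > 0);
    return (double)old_usec / (double)new_usec;
}

/* Print one line per thread count with the computed factor. */
void print_speedups(void)
{
    size_t i;
    for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
        printf("%d thread(s): %.1fx faster\n", rows[i].threads,
               speedup(rows[i].old_usec, rows[i].new_usec));
    }
}
```

Calling print_speedups() from a main() lists the factor per thread count;
the single-thread row comes out at roughly 13x and the six-thread row at
roughly 140x.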

One important difference between the legacy implementation and the new one
is that the new one mostly behaves like a spin lock for the waiting thread,
while the old one makes the thread wait on the mutex, which mostly amounts
to suspending it.

So there might be some theoretical cases with very long locking times where
the old code would be preferred. But the caching logic where this code is
generally used would really benefit from switching.
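
To make the spin lock comparison a bit more concrete, here is a minimal
sketch of a reader writer lock that keeps all of its state in one atomic
word and busy-waits instead of sleeping. This is NOT how Windows implements
slim reader writer locks and it is not code from the patch; all names are
invented for illustration, and it has no fairness guarantees for writers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* State word: 0 = free, >0 = number of readers, -1 = one writer. */
#define SPIN_WRITER (-1)

typedef struct {
    atomic_int state;
} spin_rwlock_t;

void spin_rwlock_init(spin_rwlock_t *l)
{
    atomic_init(&l->state, 0);
}

/* Try to add one reader; fails only while a writer holds the lock. */
bool spin_rwlock_tryrdlock(spin_rwlock_t *l)
{
    int s = atomic_load_explicit(&l->state, memory_order_relaxed);
    while (s >= 0) {
        /* On failure the CAS reloads s with the current value. */
        if (atomic_compare_exchange_weak_explicit(&l->state, &s, s + 1,
                memory_order_acquire, memory_order_relaxed))
            return true;
    }
    return false;
}

/* Try to become the single writer; succeeds only if the lock is free. */
bool spin_rwlock_trywrlock(spin_rwlock_t *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(&l->state, &expected,
            SPIN_WRITER, memory_order_acquire, memory_order_relaxed);
}

/* Blocking variants simply retry: the waiter burns CPU, it never sleeps. */
void spin_rwlock_rdlock(spin_rwlock_t *l)
{
    while (!spin_rwlock_tryrdlock(l)) { /* busy-wait */ }
}

void spin_rwlock_wrlock(spin_rwlock_t *l)
{
    while (!spin_rwlock_trywrlock(l)) { /* busy-wait */ }
}

/* The state word tells us whether a writer or a reader is unlocking. */
void spin_rwlock_unlock(spin_rwlock_t *l)
{
    if (atomic_load_explicit(&l->state, memory_order_relaxed) == SPIN_WRITER)
        atomic_store_explicit(&l->state, 0, memory_order_release);
    else
        atomic_fetch_sub_explicit(&l->state, 1, memory_order_release);
}
```

With short critical sections, such as a cache lookup, the retry loop usually
succeeds after a few iterations. With very long hold times every waiter
keeps a CPU busy, which is the case where a lock that puts the waiter to
sleep may be preferable.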

[[
* include/arch/win32/apr_arch_misc.h
  (APR_DECLARE_LATE_DLL_FUNC_VOID): Declare APR_DECLARE_LATE_DLL_FUNC
   variant with void return.

  (_RTL_SRWLOCK,
   RTL_SRWLOCK,
   PRTL_SRWLOCK,
   RTL_SRWLOCK_INIT,
   SRWLOCK,
   PSRWLOCK,
   SRWLOCK_INIT): Define like Windows, for platforms that don't predefine.
  (InitializeSRWLock,
   AcquireSRWLockExclusive,
   AcquireSRWLockShared,
   ReleaseSRWLockExclusive,
   ReleaseSRWLockShared,
   TryAcquireSRWLockExclusive,
   TryAcquireSRWLockShared): Define when not defined by Windows.

* include/arch/win32/apr_arch_thread_rwlock.h
  (apr_thread_rwlock_t): Add union for slim reader writer value.

* locks/win32/thread_rwlock.c
  (HAVE_NATIVE_SRW): Define when slim reader writer locks are (always)
   available.
  (apr_thread_rwlock_create,
   apr_thread_rwlock_rdlock,
   apr_thread_rwlock_tryrdlock,
   apr_thread_rwlock_wrlock,
   apr_thread_rwlock_trywrlock,
   apr_thread_rwlock_unlock,
   apr_thread_rwlock_destroy): Add slim reader writer lock implementation.
]]
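
One detail worth illustrating: the APR API has a single
apr_thread_rwlock_unlock, while the Windows API has separate
ReleaseSRWLockShared and ReleaseSRWLockExclusive calls. The sketch below
shows one possible way to bridge that, by remembering whether the lock is
held exclusively. It is an assumption about how this could be done, not the
code from the patch; the demo_* and backend_* names are hypothetical. On
non-Windows builds a simple counting stand-in replaces the SRW calls so the
sketch still compiles, but that stand-in does no real locking and is only
meant for single-threaded experiments.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef _WIN32
#include <windows.h>
/* Real slim reader writer lock calls (Windows Vista and later). */
typedef SRWLOCK backend_lock_t;
static void backend_init(backend_lock_t *l)          { InitializeSRWLock(l); }
static void backend_lock_shared(backend_lock_t *l)   { AcquireSRWLockShared(l); }
static void backend_unlock_shared(backend_lock_t *l) { ReleaseSRWLockShared(l); }
static void backend_lock_excl(backend_lock_t *l)     { AcquireSRWLockExclusive(l); }
static void backend_unlock_excl(backend_lock_t *l)   { ReleaseSRWLockExclusive(l); }
#else
/* Stand-in for other platforms: counts holders, does NOT block.
 * Single-threaded use only. */
typedef struct { int readers; int writers; } backend_lock_t;
static void backend_init(backend_lock_t *l)          { l->readers = 0; l->writers = 0; }
static void backend_lock_shared(backend_lock_t *l)   { assert(l->writers == 0); l->readers++; }
static void backend_unlock_shared(backend_lock_t *l) { assert(l->readers > 0); l->readers--; }
static void backend_lock_excl(backend_lock_t *l)     { assert(l->readers == 0 && l->writers == 0); l->writers = 1; }
static void backend_unlock_excl(backend_lock_t *l)   { assert(l->writers == 1); l->writers = 0; }
#endif

/* Hypothetical wrapper type; not the real apr_thread_rwlock_t layout. */
typedef struct {
    backend_lock_t lock;
    bool held_exclusive;   /* only written while the exclusive lock is held */
} demo_rwlock_t;

void demo_rwlock_init(demo_rwlock_t *rw)
{
    backend_init(&rw->lock);
    rw->held_exclusive = false;
}

void demo_rwlock_rdlock(demo_rwlock_t *rw)
{
    backend_lock_shared(&rw->lock);
}

void demo_rwlock_wrlock(demo_rwlock_t *rw)
{
    backend_lock_excl(&rw->lock);
    rw->held_exclusive = true;     /* safe: no other holder can exist now */
}

/* Single unlock entry point: pick the matching release call. */
void demo_rwlock_unlock(demo_rwlock_t *rw)
{
    if (rw->held_exclusive) {
        rw->held_exclusive = false;   /* clear before giving up the lock */
        backend_unlock_excl(&rw->lock);
    } else {
        backend_unlock_shared(&rw->lock);
    }
}
```

The flag is only touched by the thread that currently owns the lock
exclusively, so readers that hold the lock in shared mode always see it as
false and release in shared mode.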

Subversion's FSFS in-memory cache greatly benefits from this patch, so I
would like to see this fix backported to future APR 1.5/1.6 versions.

	Bert

