httpd-dev mailing list archives

From Aaron Bannert <aa...@ebuilt.com>
Subject lock benchmarks on Solaris 8/sparc uniprocessor
Date Mon, 30 Jul 2001 20:42:07 GMT
Here are some benchmarks I ran on a uniprocessor UltraSPARC machine
running Solaris 8. The benchmarking code is the same code that W. Richard
Stevens used in his UNIX Network Programming: Interprocess Communication,
Vol 2, Second Edition (see Appendix A, pp. 463-466). I invite everyone
to run these tests on their own platforms in various configurations.
(I *really* want to run these tests on a big 8-way Sun box :)

Note that *nothing* in these tests will run faster in parallel, so
the single concurrency case will be optimal. This is good because it
means these tests maximally reflect the performance of the underlying
synchronization mechanisms, and are minimally skewed by the ability of
the machine to do the basic operation that we are serializing.

The numbers are all in seconds. Each test was run 3 times and the
results were averaged. The tests themselves consist of a number of
concurrent workers (threads or processes), each of which contends for
a mutex. Once the mutex is acquired, the active worker simply increments
a counter and unlocks. When the counter reaches 1 million, the process
prints the time delta and exits.
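
For anyone who wants to try this without the book at hand, here is a
minimal sketch of the multithreaded pthread_mutex case as described
above. It is not the code from Stevens. The names run_mutex_bench,
struct bench and MAX_WORKERS are made up for illustration, and it
returns the elapsed time to the caller instead of printing it.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

#define MAX_WORKERS 16          /* hypothetical upper bound for this sketch */

/* Shared state: one mutex guarding one counter. */
struct bench {
    pthread_mutex_t lock;
    long counter;
    long target;
};

/* Each worker takes the mutex, bumps the counter, and releases it,
 * until the counter has reached the target. */
static void *worker(void *arg)
{
    struct bench *b = arg;

    for (;;) {
        pthread_mutex_lock(&b->lock);
        if (b->counter >= b->target) {
            pthread_mutex_unlock(&b->lock);
            return NULL;
        }
        b->counter++;
        pthread_mutex_unlock(&b->lock);
    }
}

/* Runs nworkers threads until the counter reaches target.
 * Returns elapsed wall-clock seconds, or -1.0 on error.
 * The final counter value is stored in *final_count if non-NULL. */
double run_mutex_bench(int nworkers, long target, long *final_count)
{
    pthread_t tids[MAX_WORKERS];
    struct bench b;
    struct timeval start, end;
    int i, started = 0;

    if (nworkers < 1 || nworkers > MAX_WORKERS)
        return -1.0;
    if (pthread_mutex_init(&b.lock, NULL) != 0)
        return -1.0;
    b.counter = 0;
    b.target = target;

    gettimeofday(&start, NULL);
    for (i = 0; i < nworkers; i++) {
        if (pthread_create(&tids[i], NULL, worker, &b) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    gettimeofday(&end, NULL);

    pthread_mutex_destroy(&b.lock);
    if (started != nworkers)
        return -1.0;
    if (final_count != NULL)
        *final_count = b.counter;
    return (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6;
}
```

Calling run_mutex_bench(n, 1000000, NULL) for n = 1 through 5 would
correspond to the concurrency levels in the tables below. The other
mechanisms would follow the same shape with the lock and unlock calls
swapped out.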

Multithreaded Results (aka PROCESS_PRIVATE)
-------------------------------------------------------------------------
Lock Mechanism            Concurrency      Total time (sec)
==============            ===========      ================
pthread_mutex             1                0.4
pthread_mutex             2                0.7
pthread_mutex             3                1.1
pthread_mutex             4                1.5
pthread_mutex             5                1.8

pthread_rwlock            1                0.9
pthread_rwlock            2                1.9
pthread_rwlock            3                3.1
pthread_rwlock            4                4.5
pthread_rwlock            5                8.4

posix memory-based sem.   1                2.7
posix memory-based sem.   2                5.4
posix memory-based sem.   3                8.1
posix memory-based sem.   4                10.8
posix memory-based sem.   5                13.5

posix named sem.          1                7.5
posix named sem.          2                15.1
posix named sem.          3                22.7
posix named sem.          4                30.6
posix named sem.          5                38.5

SysV sem.                 1                4.0
SysV sem.                 2                8.6
SysV sem.                 3                12.5
SysV sem.                 4                16.5
SysV sem.                 5                21.0

SysV sem. w/ UNDO         1                4.7
SysV sem. w/ UNDO         2                9.5
SysV sem. w/ UNDO         3                14.5
SysV sem. w/ UNDO         4                19.1
SysV sem. w/ UNDO         5                23.8

fcntl()                   1                15.4
[thread concurrency greater than 1 on Solaris is not possible, since fcntl()
 can only lock between processes, not between threads in the same process.
 See below for the multiprocess fcntl() results.]
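
To illustrate the note above, here is a rough sketch of an fcntl()-based
lock and unlock pair. The helper names file_lock_op, file_mutex_lock
and file_mutex_unlock are invented for this example and are not taken
from the benchmark code. Record locks set with fcntl() are owned by the
process, so a second lock request from another thread in the same
process simply succeeds instead of blocking, which is why a threaded
run with concurrency greater than 1 would not be serialized at all.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Apply one whole-file record lock operation to fd.
 * type is F_WRLCK to acquire or F_UNLCK to release.
 * F_SETLKW blocks until the lock can be granted. */
static int file_lock_op(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Acquire an exclusive lock on the whole file; returns 0 on success. */
int file_mutex_lock(int fd)
{
    return file_lock_op(fd, F_WRLCK);
}

/* Release the lock; returns 0 on success. */
int file_mutex_unlock(int fd)
{
    return file_lock_op(fd, F_UNLCK);
}
```

A real implementation would probably also want to retry when fcntl()
fails with EINTR. That is left out here to keep the sketch short.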


Multiprocess Results (aka PROCESS_SHARED)
-------------------------------------------------------------------------
Lock Mechanism            Concurrency      Total time (sec)
==============            ===========      ================
pthread_mutex             1                0.4
pthread_mutex             2                0.8
pthread_mutex             3                1.1
pthread_mutex             4                1.4
pthread_mutex             5                1.8

pthread_rwlock            1                0.8
pthread_rwlock            2                1.5
pthread_rwlock            3                2.6
pthread_rwlock            4                4.3
pthread_rwlock            5                6.2

posix memory-based sem.   1                7.4
posix memory-based sem.   2                14.9
posix memory-based sem.   3                22.6
posix memory-based sem.   4                29.6
posix memory-based sem.   5                37.2

posix named sem.          1                7.7
posix named sem.          2                14.9
posix named sem.          3                22.4
posix named sem.          4                29.9
posix named sem.          5                37.4

SysV sem.                 1                4.1
SysV sem.                 2                8.4
SysV sem.                 3                12.0
SysV sem.                 4                16.1
SysV sem.                 5                20.3

SysV sem. w/ UNDO         1                5.0
SysV sem. w/ UNDO         2                9.8
SysV sem. w/ UNDO         3                14.4
SysV sem. w/ UNDO         4                19.3
SysV sem. w/ UNDO         5                23.7

fcntl()                   1                15.4
fcntl()                   2                40.6
fcntl()                   3                61.2
fcntl()                   4                89.0
fcntl()                   5                118.8
[Note: the lock file used here was in the /tmp directory. Lock files
 on a non-RAM based filesystem were significantly slower, and lock
 files on an NFS partition were slower still.]


Commentary:
---------------------
From the perspective of APR, choosing the correct underlying lock
mechanism can be very difficult. Trying to match a general-use
mutual exclusion mechanism to a particular platform with a particular
configuration may involve too many variables to deal with at build-time
(or even run-time). I'm not making any assertions here about which
locking mechanisms we should or should not be using, but I think we
should gather some more data and revisit this problem.

When we look at this merely from the perspective of solving the
accept() mutex problem in httpd, we have fewer variables to deal with
(CROSS_PROCESS vs. LOCKALL), but the essence of the problem still
remains. The above results don't reflect other versions of Solaris,
nor do they reflect what happens on a multiprocessor machine. My
hope is that this will give us something to chew on for a while.

-aaron

