tomcat-users mailing list archives

From Rainer Jung
Subject Re: Deadlock situation detected/avoided with jk_log_lock
Date Sat, 07 Feb 2009 17:08:53 GMT
On 06.02.2009 20:40, fredk2 wrote:
> Do I understand you correctly that when Mr. Orton said to never use pthread
> nor posixsem mutex ( that
> is now obsolete news and that Solaris perfected pthread mutex support since.

Joe Orton is always very careful with his statements, precise and 
correct. My personal experience with pthread mutexes on Solaris has 
been fine, but I must confess that I didn't run specialized tests to 
determine their behaviour in crash situations.

I have now done some searching, and it turns out that the 
implementation of pthread mutexes for Solaris 10 has changed quite a 
bit very recently. So all speculation from recent years about improved 
pthread mutex behaviour (especially for "robust" mutexes) might have 
become obsolete.

The new implementation is contained in Solaris kernel patch 137137-09 
and most likely also in Solaris 10 Update 6 (10/08). I didn't check 
whether that update simply contains the kernel patch or includes the 
fix independently.

Some detail is logged in Sunsolve under the bug IDs
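
To illustrate what a "robust" mutex is meant to provide in a crash 
situation, here is a minimal sketch. It is not taken from httpd, APR or 
mod_jk, and it has not been checked against the Solaris patch level 
mentioned above. It assumes the POSIX.1-2008 names 
pthread_mutexattr_setrobust() and pthread_mutex_consistent() (some older 
systems expose these only under _np names) and MAP_ANONYMOUS for the 
shared memory. The function name lock_after_owner_death is made up for 
this example.

```c
#define _GNU_SOURCE /* assumption: glibc-style feature macro for MAP_ANONYMOUS */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child process takes a process-shared robust mutex
 * and exits without unlocking it. Returns what the parent's subsequent
 * pthread_mutex_lock() reported, or -1 if the setup failed. */
int lock_after_owner_death(void)
{
    /* The mutex must live in memory that both processes can see. */
    pthread_mutex_t *m = mmap(NULL, sizeof(*m), PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    int rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) {
        munmap(m, sizeof(*m));
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        pthread_mutex_destroy(m);
        munmap(m, sizeof(*m));
        return -1;
    }
    if (pid == 0) {
        pthread_mutex_lock(m);
        _exit(0); /* simulate a crash while holding the lock */
    }
    waitpid(pid, NULL, 0);

    /* With a robust mutex the next locker is told that the owner died,
     * instead of blocking forever. */
    rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* The protected data may be inconsistent. A real program would
         * repair it here before marking the mutex usable again. */
        int crc = pthread_mutex_consistent(m);
        assert(crc == 0);
    }
    if (rc == 0 || rc == EOWNERDEAD)
        pthread_mutex_unlock(m);

    pthread_mutex_destroy(m);
    munmap(m, sizeof(*m));
    return rc;
}
```

On a platform with working robust mutex support the parent should see 
EOWNERDEAD. Without the robust attribute the same lock call would 
typically just block, which is the crash behaviour people worry about 
with process-shared pthread mutexes.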


> You mention that mod_jk uses pthread is that the same as the httpd itself?

mod_jk uses a global mutex provided by the APR library for access to 
the log file. It gets a default mutex, i.e. it lets APR decide which 
type of mutex to use (APR_LOCK_DEFAULT, which on Solaris should be 
fcntl). You can't configure the type, as you can for httpd's accept or 
SSL mutex.
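
For readers unfamiliar with what an fcntl-type mutex amounts to, here is 
a minimal sketch using plain POSIX calls. It is not the APR or mod_jk 
implementation. The helper names whole_file_lock and locked_log_append 
are made up, and the use of a separate lock file is an assumption of 
the sketch.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: take (F_WRLCK) or release (F_UNLCK) an fcntl lock
 * on the whole file behind fd, blocking until it can be granted. */
static int whole_file_lock(int fd, short type)
{
    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0; /* 0 means "up to the end of the file" */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical helper: append one line to the log at path while holding
 * the lock on lock_fd. Returns 0 on success, -1 on failure. If taking the
 * lock fails, errno holds the reason reported by fcntl, e.g. EDEADLK. */
int locked_log_append(int lock_fd, const char *path, const char *line)
{
    if (whole_file_lock(lock_fd, F_WRLCK) == -1)
        return -1;

    int rc = -1;
    int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd != -1) {
        size_t len = strlen(line);
        if (write(fd, line, len) == (ssize_t)len)
            rc = 0;
        close(fd);
    }

    int urc = whole_file_lock(lock_fd, F_UNLCK);
    assert(urc == 0);
    return rc;
}
```

The lock lives on a separate file because fcntl locks belong to the 
process: closing any descriptor for the locked file drops the lock. For 
the same reason an fcntl lock does not serialize threads inside one 
process; it only serializes processes.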

mod_jk uses a few more locks, none of which are provided by APR; they 
are coded directly against pthreads. All of those are thread mutexes 
only, so they are used locally in each process and are not shared 
between processes. A crashing process won't cause a problem for them.

They are:

- one mutex for each AJP worker, synchronizing access to the connection 
pool, which exists per process

- one mutex for each lb worker

- a mutex for access to the shared memory when changing or reading 
configuration parameters. That might be a little unsafe, because it 
actually should be a global mutex, not a process-local one. But those 
config changes only happen through interaction with the status worker, 
so there's very little chance of unwanted concurrency here. All dynamic 
runtime data are already marked as volatile.

- a mutex used during dynamic update of to 
prevent concurrent updates. Updates are done per process.

- a mutex to prevent concurrent execution of the process local internal 
maintenance task
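
A process-local thread mutex of the kind listed above can be sketched 
as follows. This is not mod_jk source code. The type conn_pool_t and 
the pool_* functions are hypothetical names, and the "pool" is reduced 
to a counter of connections in use.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical per-process connection pool of a single worker. */
typedef struct {
    pthread_mutex_t lock; /* thread mutex, private to this process */
    int capacity;         /* maximum number of connections */
    int in_use;           /* connections currently handed out */
} conn_pool_t;

int pool_init(conn_pool_t *p, int capacity)
{
    p->capacity = capacity;
    p->in_use = 0;
    /* NULL attributes give a default mutex, which is not process-shared. */
    return pthread_mutex_init(&p->lock, NULL);
}

/* Returns 1 if a connection slot was taken, 0 if the pool is exhausted. */
int pool_acquire(conn_pool_t *p)
{
    int got = 0;
    pthread_mutex_lock(&p->lock);
    if (p->in_use < p->capacity) {
        p->in_use++;
        got = 1;
    }
    pthread_mutex_unlock(&p->lock);
    return got;
}

void pool_release(conn_pool_t *p)
{
    pthread_mutex_lock(&p->lock);
    assert(p->in_use > 0);
    p->in_use--;
    pthread_mutex_unlock(&p->lock);
}

/* Thread body for the stress helper: take and return a slot repeatedly. */
static void *pool_worker(void *arg)
{
    conn_pool_t *p = arg;
    for (int i = 0; i < 10000; i++) {
        if (pool_acquire(p))
            pool_release(p);
    }
    return NULL;
}

/* Runs up to 16 threads against the pool and returns the final in_use
 * count, which should equal the count before the call if every acquire
 * was paired with a release. */
int pool_stress(conn_pool_t *p, int nthreads)
{
    pthread_t tid[16];
    int started = 0;
    if (nthreads > 16)
        nthreads = 16;
    for (; started < nthreads; started++) {
        if (pthread_create(&tid[started], NULL, pool_worker, p) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return p->in_use;
}
```

Because the mutex lives in ordinary process memory, a crash takes the 
mutex down together with the data it protects, so no other process can 
be left waiting on it.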

> Some fellow at Covalent back in the early Apache 2.0 days, posted a white
> paper about his various mutex testing, but it does not appear to be
> available anymore. Would be interesting to know how it was tested and how it
> would playout today.

Lots of the Covalent people are still around in various projects, like 
William (Bill) A. Rowe and Jim Jagielski. You could post at apr-dev, 
because Apache httpd uses the mutex implementations coming from APR.

> Rainer Jung-3 wrote:
>> On 06.02.2009 18:13, fredk2 wrote:
>>> I was doing some stress tests (with apache ab, 100 users, 100K requests)
>>> to compare an Apache prefork and worker mpm.  The test url is a simple
>>> hello servlet on Tomcat 6.0.x via mod_jk. On my Sparc Solaris 10 server
>>> with only the Apache set to worker mpm I see the following error
>>> messages in my jk log:
>>> Apache/2.2.11 (Unix) with mod_jk/1.2.26 on Solaris 10.
>>> . . .
>>> [Thu Jan 08 11:42:28 2009] [error] (45)Deadlock situation detected/avoided: apr_global_mutex_lock(jk_log_lock) failed
>>> . . .
>>> [Thu Jan 08 11:42:29 2009] [emerg] (45)Deadlock situation detected/avoided: apr_proc_mutex_lock failed. Attempting to shutdown process gracefully.
>>> [Thu Jan 08 11:42:29 2009] [error] (45)Deadlock situation detected/avoided: apr_global_mutex_lock(jk_log_lock) failed
>>> . . .
>>> These errors do not appear to impact the test results and the jk log
>>> file seems complete.
>>> I can suppress the errors by choosing another mutex in the Apache
>>> directive AcceptMutex, such as sysvsem or pthread.  For Solaris 10 the
>>> default mutex for worker MPM is fcntl.  Setting the mutex to sysvsem
>>> (also the default on Linux) marginally improves the request time.
>>> Can someone explain what exactly these errors mean? When do they occur?
>>> I would have almost expected a "detected/avoided" to be a [warn] instead
>>> of an [error].
>>> I have seen the trail but
>>> I'd like to hear updated experiences that people have with sysvsem
>>> mutexes on Solaris 10 - what is the better mutex?  sysvsem, posixsem,
>>> pthread **?
>>> Any comment will be appreciated.
>> I experienced this too a couple of times and once wrote a small C
>> program to reproduce the problem. On Solaris the algorithm to detect a
>> possible deadlock is very careful and returns EDEADLK (errno 45) even
>> in situations where you can mathematically prove that a deadlock is
>> not possible. This happens in a multi-threaded environment when more
>> than one mutex is used.
>> Apache httpd and mod_jk use such a mutex and SSL also (so you can
>> observe the same warnings without mod_jk only using SSL with httpd and
>> doing stress tests).
>> In older JK versions this could lead to a hang, but we worked around
>> that a couple of versions ago. I generally recommend the pthread mutex
>> for Solaris which doesn't have the problem and seems to be robust
>> despite warnings about pthread mutexes in very old versions of Solaris.
>> We even once had a discussion about changing the default httpd mutex on
>> Solaris, but I think that discussion didn't come to a conclusion.
>> Regards,
>> Rainer
