httpd-dev mailing list archives

From Rainer Jung <rainer.j...@kippdata.de>
Subject Re: 2.3.12-beta....
Date Tue, 10 May 2011 20:57:58 GMT
On 10.05.2011 22:36, Rainer Jung wrote:
> On 10.05.2011 22:03, Rainer Jung wrote:
>> On 10.05.2011 20:57, Jim Jagielski wrote:
>>>
>>> On May 10, 2011, at 2:46 PM, Rainer Jung wrote:
>>>
>>>> On 10.05.2011 14:30, Jim Jagielski wrote:
>>>>> Once Jeff applies his hook-probes patch, I'll be doing the
>>>>> T&R within the next few hours.
>>>>>
>>>>> On May 9, 2011, at 3:18 PM, Jim Jagielski wrote:
>>>>>
>>>>>> I plan on doing a T&R tomorrow...
>>>>
>>>> I notice strange trunk failures on my Solaris 10 system. The failures
>>>> were already happening before the probe changes. The Perl script
>>>> RewriteMap process crashes shortly after the fork. In truss I can see
>>>> it closing file descriptors after the fork and then it crashes before
>>>> calling exec or similar. So something around apr_proc_create() seems
>>>> to go wrong, or possibly the apr_procattr are not right.
>>>>
>>>> It doesn't happen on Solaris 8, so it is possible my system is
>>>> borked. It also doesn't happen for 2.2.x.
>>>>
>>>> I'll try to investigate further, but if there is no immediate idea
>>>> about that I'm fine with rolling the beta, because it is not clear
>>>> whether I have enough time available right now to debug.
>>>
>>> Do the APR tests run cleanly?
>>
>> Unfortunately yes, at least most of the time. The proc tests never
>> failed. I added debug output to apr_proc_create(), the crash happens in
>>
>> apr_pool_cleanup_for_exec();
>>
>> Digging further shows that the crash happens while running the child
>> cleanups for the pconf pool (in the 9th cleanup). Maybe it is related to the
>> testreslist failures, because some of them happen in
>> apr_pool_cleanup_kill. Just a wild speculation.
>>
>> I will try to stop the process before the crash and investigate with the
>> debugger. Unfortunately the core, if written, doesn't seem usable.
>
> child_cleanup_fn is NULL in a cleanup that has plain_cleanup_fn equal
> to apr_ldap_pool_cleanup_set_null. Getting closer. At least it is not
> implausible, because my builds for Solaris 8 and 10 differ by the exact
> LDAP behavior.
>
> Maybe related to log line
>
> [Tue May 10 22:33:08.626119 2011] [ldap:info] [pid 25137] LDAP: SSL
> support unavailable: LDAP: ldapssl_client_init() failed.
>
> maybe not ...
>
> Investigating further.

At least one reason is in apr-util. The file ldap/apr_ldap_rebind.c contains:

/* APR utility routine used to create the xref_lock. */
APU_DECLARE_LDAP(apr_status_t) apr_ldap_rebind_init(apr_pool_t *pool)
{
     apr_status_t retcode = APR_SUCCESS;

#ifdef NETWARE
     get_apd
#endif

     /* run after apr_thread_mutex_create cleanup */
     apr_pool_cleanup_register(pool, &apr_ldap_xref_lock,
                               apr_ldap_pool_cleanup_set_null, NULL);

#if APR_HAS_THREADS
     if (apr_ldap_xref_lock == NULL) {
         retcode = apr_thread_mutex_create(&apr_ldap_xref_lock,
                                           APR_THREAD_MUTEX_DEFAULT, pool);
     }
#endif

     return(retcode);
}


The call

apr_pool_cleanup_register(pool, &apr_ldap_xref_lock,
                          apr_ldap_pool_cleanup_set_null, NULL);

registers NULL as the child cleanup function, which will always crash
once the child cleanups are run (as in apr_pool_cleanup_for_exec()),
because the functions are called unconditionally in apr_pool.
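
To illustrate why a NULL child cleanup is fatal, here is a minimal,
self-contained sketch. It is a simplified model and not APR's actual
implementation: struct cleanup, cleanup_noop, cleanup_set_null and
run_child_cleanups are hypothetical names made up for this example.
cleanup_noop plays the role that apr_pool_cleanup_null() has in APR.
Passing such a no-op instead of NULL as the last argument would be one
plausible way to avoid the crash, though I have not verified that yet.

```c
#include <stddef.h>

/* Hypothetical, simplified model of one registered pool cleanup.
 * This is NOT APR's real data structure. */
typedef int (*cleanup_fn)(void *data);

struct cleanup {
    void *data;
    cleanup_fn plain_fn;  /* run when the pool is cleared or destroyed */
    cleanup_fn child_fn;  /* run in the forked child before exec */
};

/* No-op cleanup; stands in for apr_pool_cleanup_null(). */
static int cleanup_noop(void *data)
{
    (void)data;
    return 0;
}

/* Plain cleanup that resets the pointer that data points to,
 * similar in spirit to apr_ldap_pool_cleanup_set_null. */
static int cleanup_set_null(void *data)
{
    *(void **)data = NULL;
    return 0;
}

/* Run every child cleanup without a NULL check, mirroring the
 * unconditional calls described above. An entry with
 * child_fn == NULL would dereference a null function pointer here.
 * Returns the number of cleanups that were called. */
static int run_child_cleanups(struct cleanup *list, size_t n)
{
    int calls = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        list[i].child_fn(list[i].data);
        calls++;
    }
    return calls;
}
```

With an entry such as { &lock, cleanup_set_null, cleanup_noop } the
child pass is harmless and leaves lock untouched, while the plain
cleanup still resets it. With NULL in the third slot the loop above
would crash on that entry.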

I will check all apr_pool_cleanup_register() calls in apr, apr-util and
httpd for similar occurrences...

Regards,

Rainer
