httpd-users mailing list archives

From Tim Newman <Tim.New...@Calabrio.com>
Subject [users@httpd] Apache child process crashing frequently
Date Wed, 24 Jan 2018 14:37:48 GMT
We have four Windows Servers running Apache 2.4.27 acting as load balancers for our application
server cluster, which is running Tomcat. Recently, we have started to experience a high number
of crashes with the web servers. Within the Apache error logs we see the following:

[Mon Jan 15 15:12:08.271099 2018] [mpm_winnt:notice] [pid 1696:tid 432] AH00428: Parent: child
process 38240 exited with status 3221225477 -- Restarting.
[Mon Jan 15 15:12:08.944108 2018] [mpm_winnt:notice] [pid 1696:tid 432] AH00455: Apache/2.4.27
(Win64) OpenSSL/1.0.2l configured -- resuming normal operations
[Mon Jan 15 15:12:08.944108 2018] [mpm_winnt:notice] [pid 1696:tid 432] AH00456: Apache Lounge
VC11 Server built: Jul 10 2017 14:15:02
[Mon Jan 15 15:12:08.957110 2018] [mpm_winnt:notice] [pid 1696:tid 432] AH00418: Parent: Created
child process 43540
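
For reference, the exit status in the AH00428 line is a decimal Windows NTSTATUS value. The sketch below is not part of our setup, just a quick illustration of converting it to hex, where it reads as 0xC0000005 (access violation). The helper name `describe_exit_status` and the small lookup table are made up for this example. The second table entry is the stack overflow code, included only because thread stack size is a commonly suggested cause.

```python
# Illustrative only: decode the decimal exit status from an AH00428 line
# into its hexadecimal NTSTATUS form. The table lists just two well-known
# codes; anything else is reported as "unknown".
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exit_status(status: int) -> str:
    """Return the status as 8-digit hex plus a name if we recognise it."""
    code = status & 0xFFFFFFFF  # keep only the low 32 bits
    name = KNOWN_STATUS.get(code, "unknown")
    return f"0x{code:08X} ({name})"


print(describe_exit_status(3221225477))  # 0xC0000005 (STATUS_ACCESS_VIOLATION)
```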

Between the four web servers, we often see over a dozen such crashes a day - sometimes more,
sometimes less. In some cases Apache crashes again only 5 minutes after the child process
was restarted. The number of crashes drops significantly at night and on weekends,
but it still happens. As far as we can tell, we have not made any major changes to the configuration
recently and have only started to experience this in the past few weeks.
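
To put numbers on the time-of-day pattern, something like the following could tally the AH00428 restarts per weekday and hour. This is a rough sketch, not a tool we actually run. It assumes each log entry sits on a single line in the error log (the excerpts above are wrapped by the mail client), and the names `CRASH_RE` and `tally_crashes` are invented for the example.

```python
import re
from collections import Counter

# Matches an error-log line such as:
# [Mon Jan 15 15:12:08.271099 2018] [mpm_winnt:notice] [pid ...] AH00428: ...
# and captures the weekday, the hour and the reported exit status.
CRASH_RE = re.compile(
    r"^\[(?P<dow>\w{3}) \w{3}\s+\d{1,2} (?P<hour>\d{2}):\d{2}:\d{2}\.\d+ \d{4}\]"
    r".*AH00428: .*exited with status (?P<status>\d+)"
)


def tally_crashes(lines):
    """Count child-process crashes per (weekday, hour) and per exit status."""
    by_hour = Counter()
    by_status = Counter()
    for line in lines:
        m = CRASH_RE.search(line)
        if m:
            by_hour[(m["dow"], int(m["hour"]))] += 1
            by_status[int(m["status"])] += 1
    return by_hour, by_status


sample = [
    "[Mon Jan 15 15:12:08.271099 2018] [mpm_winnt:notice] [pid 1696:tid 432] "
    "AH00428: Parent: child process 38240 exited with status 3221225477 -- Restarting.",
    "[Mon Jan 15 15:12:08.957110 2018] [mpm_winnt:notice] [pid 1696:tid 432] "
    "AH00418: Parent: Created child process 43540",
]
by_hour, by_status = tally_crashes(sample)
print(by_hour)
print(by_status)
```

Against a real log, the same function could be fed an open file object instead of the sample list, since it only iterates over lines.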

We were able to get a core dump from one of the web servers as it was crashing. The following
are some pieces extracted from it:
FAULTING_IP:
libaprutil_1!apr_brigade_writev+37a
00000000`6f8f21da 488908          mov     qword ptr [rax],rcx

EXCEPTION_RECORD:  (.exr -1)
ExceptionAddress: 000000006f8f21da (libaprutil_1!apr_brigade_writev+0x000000000000037a)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000001
   Parameter[1]: 0000000000000000
Attempt to write to address 0000000000000000

STACK_TEXT:
libaprutil_1!apr_brigade_writev+0x37a
libapr_1!apr_pool_destroy+0x6e
libaprutil_1!apr_brigade_cleanup+0x43
mod_ssl!ssl_run_init_server+0x2ddf
mod_ssl!ssl_run_init_server+0x1cf5
libhttpd!ap_process_request_after_handler+0x5c
libhttpd!ap_process_request+0x17
libhttpd!ap_sys_privileges_handlers+0x3953
libhttpd!ap_run_process_connection+0x35
libhttpd!ap_process_connection+0x45
libhttpd!ap_regkey_value_set+0x21f3
kernel32!BaseThreadInitThunk+0x22
ntdll!RtlUserThreadStart+0x34

Looking at the Windows Event Viewer, we see modules "libaprutil-1" and "libapr-1" as the faulting
modules when the crashes occur. On some rarer occasions, we will see "ntdll" and "libhttpd"
as the faulting modules.

We have tried increasing the thread stack size (based on similar reports online) but that
has not helped. We've enabled forensic logging, trying to determine if there was some sort
of rogue request that could be knocking us over, but nothing seemed really out of place.

Is there anything we can do to determine what the root cause is?

Thanks
-Tim
