subversion-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Julian Foad <julianf...@btopenworld.com>
Subject Test suite doesn't detect httpd crashes
Date Wed, 17 Dec 2014 12:56:06 GMT
I have found that Apache 2.4.7 with the 'event' MPM sometimes crashes while running our test
suite. I discussed with Philip and Ben yesterday and they let me know that this a known issue
with several 2.4.x versions, and is fixed in later versions (>= 2.4.10 ?). The Subversion
I'm currently testing with is trunk@1646184.

The problem I want to highlight here is that our test suite doesn't report these crashes.
On some runs it reports that some individual tests failed, and overall failure, but on other
runs it reports no individual tests failed, and overall success, even though Apache crashed.

Do we want to make our test suite detect these crashes, and report overall failure even if
each of the individual test scenarios completed successfully?

It seems to me we should. What do you think?


DETAILS

After a run of externals_tests.py --parallel that reported success on all tests, the error_log
first contains this:

[mpm_event:error] ... AH00484: server reached MaxRequestWorkers setting, consider raising
the MaxRequestWorkers setting

which doesn't seem to be a problem in itself, as I always get this even if it doesn't go on
to crash, and also get a similar message when using 'worker' MPM and that doesn't go on to
crash either.

Then it may contain one or more messages like this:

[core:notice] ... AH00051: child pid 23555 exit signal Segmentation fault (11)...

At the moment I don't know if any error indication is being returned to some part of the test
runner and then not reported as test failure, or if searching the error_log is the only way
to detect these crashes.

Here is how I run the tests:

$ (cd obj-dir/ && rm -rf subversion/tests/cmdline/httpd-20141217-* && APACHE_MPM=event
$SVN_SRC/subversion/tests/cmdline/davautocheck.sh externals --cleanup --parallel; echo "Test
suite returned: $?"; grep -i "Segmentation fault\|mpm_.*error" subversion/tests/cmdline/httpd-20141217-*/error_log)

Here is the entire error_log for one run, after stripping out 'authz_' messages:

[[[
[Wed Dec 17 12:11:39.605935 2014] [mpm_event:debug] [pid 9284:tid 140516876347136] event.c(1949):
AH02471: start_threads: Using epoll
[Wed Dec 17 12:11:39.606003 2014] [mpm_event:notice] [pid 9283:tid 140516954560384] AH00489:
Apache/2.4.7 (Ubuntu) SVN/1.9.0-dev configured -- resuming normal operations
[Wed Dec 17 12:11:39.606057 2014] [mpm_event:info] [pid 9283:tid 140516954560384] AH00490:
Server built: Jul 22 2014 14:36:38
[Wed Dec 17 12:11:39.606065 2014] [core:notice] [pid 9283:tid 140516954560384] AH00094: Command
line: '/usr/sbin/apache2 -f /.../subversion/tests/cmdline/httpd-20141217-121139/cfg'
[Wed Dec 17 12:11:39.606080 2014] [mpm_event:debug] [pid 9285:tid 140516876347136] event.c(1949):
AH02471: start_threads: Using epoll
[Wed Dec 17 12:11:39.606877 2014] [mpm_event:debug] [pid 9287:tid 140516876347136] event.c(1949):
AH02471: start_threads: Using epoll
[Wed Dec 17 12:11:39.607031 2014] [mpm_event:debug] [pid 9289:tid 140516876347136] event.c(1949):
AH02471: start_threads: Using epoll
[Wed Dec 17 12:11:40.607116 2014] [mpm_event:error] [pid 9283:tid 140516954560384] AH00484:
server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[Wed Dec 17 12:11:43.610414 2014] [core:notice] [pid 9283:tid 140516954560384] AH00051: child
pid 9287 exit signal Segmentation fault (11), possible coredump in /.../subversion/tests/cmdline/httpd-20141217-121139
[Wed Dec 17 12:11:44.611355 2014] [core:notice] [pid 9283:tid 140516954560384] AH00051: child
pid 9289 exit signal Segmentation fault (11), possible coredump in /.../subversion/tests/cmdline/httpd-20141217-121139
[Wed Dec 17 12:11:44.611517 2014] [mpm_event:debug] [pid 13516:tid 140516876347136] event.c(1949):
AH02471: start_threads: Using epoll
[Wed Dec 17 12:11:45.614336 2014] [mpm_event:debug] [pid 15082:tid 140516876347136] event.c(1949):
AH02471: start_threads: Using epoll
]]]

Occasionally the operating system's crash report dialogue has popped up on one of these runs,
but in most runs it doesn't. I don't know why.

FWIW, when I enabled core dumping (ulimit -c 100000) and loaded a core dump into GDB I saw:

(gdb) bt
 #0  process_socket (my_thread_num=0, ...) at event.c:1064
 #1  worker_thread (thd=<optimised out>, dummy=<optimised out>) at event.c:1815
 #2  0x00007fb76889f182 in start_thread (arg=0x7fb75ffff700) at pthread_create.c:312
 #3  0x00007fb7685cbfbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

That was not the same test run that I reported above, and I haven't checked if the back-trace
is similar every time.

- Julian

Mime
View raw message