httpd-bugs mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 29709] New: - Error in pool management on multiprocessor environment
Date Mon, 21 Jun 2004 09:11:30 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=29709>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=29709

Error in pool management on multiprocessor environment

           Summary: Error in pool management on multiprocessor environment
           Product: APR
           Version: 0.9.0
          Platform: PC
        OS/Version: Windows NT/2K
            Status: NEW
          Severity: Critical
          Priority: Other
         Component: APR
        AssignedTo: bugs@httpd.apache.org
        ReportedBy: kalkuhl@boehme-weihs.de


Hello!
There seems to be a bug in Apache's memory management within a 
multiprocessor environment.
First, my environment:
I use Apache 2.0.47 in its default configuration (built with the workspace 
shipped by apache.org).
The bug was reproduced on Windows NT4 with dual processors, W2K with dual 
processors, and W2K with a Hyperthreading processor.
To reproduce the bug, I modified mod_example as follows:
In x_handler() I inserted a Sleep(1000); to simulate a long operation during 
a request, like this:
--------------------------------------
ap_rprintf(r, "  Apache HTTP Server version: \"%s\"\n", ap_get_server_version());
ap_rputs("  <BR>\n", r);

/* Simulating a long operation */
Sleep(1000);

ap_rprintf(r, "  Server built: \"%s\"\n", ap_get_server_built());
--------------------------------------
Then I sent a heavy load of requests to normal URLs and to mod_example simultaneously.

After a while Apache fails at random in one of two ways. The first 
is an access violation. The log reads as follows:

[Thu Jun 17 16:45:47 2004] [notice] Parent: child process exited with status 
3221225477 -- Restarting.
[Thu Jun 17 16:45:50 2004] [notice] Parent: Created child process 3464
[Thu Jun 17 16:45:51 2004] [debug] mpm_winnt.c(505): Parent: Sent the 
scoreboard to the child
[Thu Jun 17 16:45:53 2004] [notice] Child 3464: Child process is running
[Thu Jun 17 16:45:53 2004] [info] Parent: Duplicating socket 404 and sending it 
to child process 3464
[Thu Jun 17 16:45:53 2004] [debug] mpm_winnt.c(426): Child 3464: Retrieved our 
scoreboard from the parent.
[Thu Jun 17 16:45:53 2004] [debug] mpm_winnt.c(623): Parent: Sent 1 listeners 
to child 3464
[Thu Jun 17 16:45:53 2004] [debug] mpm_winnt.c(582): Child 3464: retrieved 1 
listeners from parent
[Thu Jun 17 16:45:53 2004] [notice] Child 3464: Acquired the start mutex.
[Thu Jun 17 16:45:54 2004] [notice] Child 3464: Starting 25 worker threads.

Here the status 3221225477 is 0xC0000005, which means Access Violation.
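
As a quick check of that conversion, here is a minimal C sketch. The helper 
names status_to_hex() and is_access_violation() are hypothetical and exist 
only for this illustration; they are not part of Apache or APR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format an exit status as 0x followed by 8 hex digits.
 * The buffer must hold at least 11 bytes ("0x" + 8 digits + NUL). */
const char *status_to_hex(unsigned long status, char *buf, size_t len)
{
    assert(buf != NULL && len >= 11);
    snprintf(buf, len, "0x%08lX", status);
    return buf;
}

/* Hypothetical helper: 0xC0000005 is the Windows status code for an
 * access violation. */
int is_access_violation(unsigned long status)
{
    return status == 0xC0000005UL;
}
```

Calling status_to_hex() with 3221225477 and a 16-byte buffer fills the 
buffer with the string 0xC0000005, and is_access_violation() returns 1 for 
the same value.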

The access violation happens in apr_pool_walk_tree() while accessing
child = pool->child;

At the same time another thread tries to free the pool memory.

I have no stack backtrace for this at the moment, but I can try to produce 
one if there is interest.

The other failure is that Apache stops responding and stays at a processor 
load of about 50%.
Breaking into the code shows that the pool is damaged.
One thread is stuck in allocator_free() trying to free pool memory, but the
next pointer of the current node points to the node itself, so the loop
never terminates.
This happens at this point:

do {
        next = node->next;
        index = node->index;

Where the node has the following content:

next = 0x007dadd8
node->index = 1
node->next = 0x007dadd8
node->ref = 0x007dadd8
node->free_index = 3452816845
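
To illustrate why a node whose next pointer refers to itself keeps such a 
loop from ending, here is a minimal C sketch. The struct memnode type and 
the functions push_free() and list_has_cycle() are hypothetical stand-ins, 
not APR's real apr_memnode_t or allocator code. Pushing the same node twice 
is only one conceivable way to reach this state; nothing in the dump above 
establishes that this is what happened.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; only mirrors the next/index fields shown above. */
struct memnode {
    struct memnode *next;
    unsigned int index;
};

/* Push a node onto the front of a singly linked free list and return the
 * new head. Pushing a node that is already the head makes it point to
 * itself. */
struct memnode *push_free(struct memnode *head, struct memnode *node)
{
    assert(node != NULL);
    node->next = head;
    return node;
}

/* Floyd cycle detection: returns 1 if following next pointers from head
 * never reaches NULL, 0 otherwise. */
int list_has_cycle(const struct memnode *head)
{
    const struct memnode *slow = head;
    const struct memnode *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) {
            return 1;
        }
    }
    return 0;
}
```

With a node n, the call push_free(NULL, &n) yields a well-formed list of 
one element. A second push_free(head, &n) leaves n.next == &n, which is the 
same shape as the node shown above, and any loop that walks next until NULL 
will then spin forever. Separately, the free_index value 3452816845 is 
0xCDCDCDCD in hexadecimal, a repeating byte pattern that would be consistent 
with memory never initialized in a debug build, though that is only a guess.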

The Thread backtrace looks like this:

allocator_free(apr_allocator_t * 0x00773dd8, apr_memnode_t * 0x007dadd8) line 
362 + 6 bytes
apr_pool_destroy(apr_pool_t * 0x007d8db8) line 797 + 13 bytes
trace_add(server_rec * 0x0077c290, request_rec * 0x00000000, x_cfg * 
0x007b7c18, const char * 0x10014908 `string') line 408 + 15 bytes
x_insert_filter(request_rec * 0x007d2d48) line 997 + 23 bytes
ap_run_insert_filter(request_rec * 0x007d2d48) line 121 + 31 bytes
ap_invoke_handler(request_rec * 0x6ff09466) line 374
ap_process_http_connection(conn_rec * 0x6ff03f8f) line 293 + 6 bytes
ap_run_process_connection(conn_rec * 0x007f51c8) line 85 + 31 bytes
ap_process_connection(conn_rec * 0x007f51c8, void * 0x007f5100) line 211 + 6 
bytes
worker_main(long 2013300156) line 731
MSVCRT! 780085bc()

Another thread is in this stack state:

NTDLL! 77894091()
NTDLL! 778922f8()
allocator_alloc(apr_allocator_t * 0x00773dd8, unsigned int 8192) line 242
apr_pool_create_ex(apr_pool_t * * 0x007d51dc, apr_pool_t * 0x007d4d48, int (int)
* 0x00000000, apr_allocator_t * 0x00773dd8) line 829 + 14 bytes
core_output_filter(ap_filter_t * 0x6ff182c1, apr_bucket_brigade * 0x007d5198) 
line 4108
ap_pass_brigade(ap_filter_t * 0x007d5198, apr_bucket_brigade * 0x007ed5a0) line 
550 + 7 bytes
ap_http_header_filter(ap_filter_t * 0x6ff182c1, apr_bucket_brigade * 
0x007ab1f0) line 1695
ap_pass_brigade(ap_filter_t * 0x007ab1f0, apr_bucket_brigade * 0x007ed408) line 
550 + 7 bytes
ap_content_length_filter(ap_filter_t * 0x6ff182c1, apr_bucket_brigade * 
0x007ab1d8) line 1252 + 20 bytes
ap_pass_brigade(ap_filter_t * 0x007ab1d8, apr_bucket_brigade * 0x007ed408) line 
550 + 7 bytes
ap_byterange_filter(ap_filter_t * 0x6ff182c1, apr_bucket_brigade * 0x007ab1c0) 
line 3036 + 5 bytes
ap_pass_brigade(ap_filter_t * 0x007ab1c0, apr_bucket_brigade * 0x007ed408) line 
550 + 7 bytes
ap_old_write_filter(ap_filter_t * 0x007ed3f0, apr_bucket_brigade * 0x007ed528) 
line 1321 + 10 bytes
end_output_stream(request_rec * 0x007aa550) line 1039 + 29 bytes
ap_finalize_request_protocol(request_rec * 0x007aa550) line 1061 + 6 bytes
ap_send_error_response(request_rec * 0x6ff0d26f, int 404) line 2423 + 6 bytes
ap_die(int 1878053609, request_rec * 0x00000000) line 232 + 11 bytes
ap_process_request(request_rec * 0x007aa550) line 311
ap_process_http_connection(conn_rec * 0x6ff03f8f) line 293 + 6 bytes
ap_run_process_connection(conn_rec * 0x007d4e48) line 85 + 31 bytes
ap_process_connection(conn_rec * 0x007d4e48, void * 0x007d4d80) line 211 + 6 
bytes
worker_main(long 2013300156) line 731
MSV

And another thread looks like this:

allocator_alloc(apr_allocator_t * 0x00773dd8, unsigned int 8192) line 242
apr_pool_create_ex(apr_pool_t * * 0x0156ff28, apr_pool_t * 0x007d6d80, int (int)
* 0x00000000, apr_allocator_t * 0x00773dd8) line 829 + 14 bytes
ap_read_request(conn_rec * 0x6ff09431) line 848
ap_process_http_connection(conn_rec * 0x6ff03f8f) line 286 + 6 bytes
ap_run_process_connection(conn_rec * 0x007d6e80) line 85 + 31 bytes
ap_process_connection(conn_rec * 0x007d6e80, void * 0x007d6db8) line 211 + 6 
bytes
worker_main(long 2013300156) line 731
MSVCRT! 780085bc()

All other threads are in winnt_get_connection().

So in my opinion, there is a race condition where one thread tries to free the 
pool memory while another thread tries to access this memory at the same time.
It seems that this behavior can only be reproduced on a multiprocessor or 
hyperthreading machine. I don't know whether it affects operating systems 
other than Windows.
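
To make the suspected race concrete, here is a minimal C sketch of a free 
list whose every access is serialized by a mutex. Everything in it is an 
assumption made for illustration: the types struct memnode and struct 
allocator, the functions allocator_init(), allocator_put() and 
allocator_get(), and the use of POSIX threads instead of the Windows 
primitives. It is not APR's allocator and not a proposed fix. It only shows 
the kind of locking that, if missing on one code path, would let two threads 
corrupt a shared list.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical node and allocator types; not APR's real structures. */
struct memnode {
    struct memnode *next;
};

struct allocator {
    pthread_mutex_t lock;      /* serializes every free-list access */
    struct memnode *free_list; /* singly linked list of free nodes */
};

void allocator_init(struct allocator *a)
{
    assert(a != NULL);
    pthread_mutex_init(&a->lock, NULL);
    a->free_list = NULL;
}

/* Return a node to the free list while holding the lock. */
void allocator_put(struct allocator *a, struct memnode *node)
{
    assert(a != NULL && node != NULL);
    pthread_mutex_lock(&a->lock);
    node->next = a->free_list;
    a->free_list = node;
    pthread_mutex_unlock(&a->lock);
}

/* Take a node from the free list while holding the lock; NULL if empty. */
struct memnode *allocator_get(struct allocator *a)
{
    struct memnode *node;

    assert(a != NULL);
    pthread_mutex_lock(&a->lock);
    node = a->free_list;
    if (node != NULL) {
        a->free_list = node->next;
        node->next = NULL;
    }
    pthread_mutex_unlock(&a->lock);
    return node;
}
```

Without the lock, two threads running allocator_put() and allocator_get() 
at the same moment can each read free_list before the other writes it, 
which loses or duplicates nodes. On a single processor the window is narrow, 
which would fit the observation that the failure shows up only on 
multiprocessor or hyperthreading machines.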

Greetings
Gabriel Kalkuhl

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org

