httpd-users mailing list archives

From "Frode E. Moe" <fr...@CoreTrek.no>
Subject [users@httpd] Apache fails to detect client closing connection with windows firewall on client
Date Tue, 14 Nov 2006 08:43:33 GMT
(Sorry for a rather long email; here's an "executive summary": the Windows firewall
doesn't reply with RST to TCP retransmissions on a client-closed connection,
which causes Apache workers to get stuck for 5 minutes.)

Hello list,
lately I've been trying to track down sporadic apparent freezes in an 
application running on Apache + PHP. In short, it seems that Apache (or 
the kernel?) in some cases fails to detect when a client closes a 
connection mid-download, and leaves the worker stuck for several minutes
trying to write the full HTTP response. The problem is amplified by the 
fact that PHP keeps its session file flock()'ed while this happens,
which means any further requests from the client never get answered
(at least until the stuck worker times out). 

Please note that although this post focuses on PHP, I'm pretty sure this
problem is not specific to that scripting language.

Interestingly this occurs much more frequently if the client is running
the Windows XP SP2 built-in firewall. I'll get back to that shortly.

Here is a small test case:

index.php contains:
  <?php session_start(); ?>
  <html>
    <body>
       Test case for hang
       <?php
       // Emit 100 <img> tags that each request noimg.php
       for ($i=0; $i<100; $i++) {
         echo $i.'<img src="noimg.php?i='.$i.'"><br>';
       }
       ?>
    </body>
  </html>

noimg.php contains:
  <?php session_start(); ?>
  <html>
  <body>
  <?php
    // Emit enough text/html output that the client aborts mid-download
    for ($i=0;$i<1000;$i++) {
      echo "$i: asdf asdf asdf asdf asdfasdf asdf asdf asdf asdf asdf asdf asdf asdf\n";
    }
  ?>
  </body>
  </html>

(I know it's silly to return a text/html page to be loaded in an <img src>, but
that's necessary to cause the client to abort the connection mid-download.
This actually happened in real life due to an erroneous <img src=""> tag 
which caused the browser to load the *current URL* as an image)

My test setup consists of:
  * Server: Apache HTTPd 2.2.3 compiled from source, running on 
    Debian GNU/Linux stable with kernel 2.4.33.3
  * Client: Firefox 2.0 on Windows XP SP2, with the Windows firewall
    enabled.

Server and client are placed on the same LAN.

When Firefox is pointed at the index.php given above, it starts to
load the various <img> tags, but aborts the 
connection for each image mid-download, probably because it detects
the MIME type text/html. 

The problem is that after a couple of requests, Apache fails to detect 
that the client has closed the connection. Firefox then tries to load the
next image, but the previous "image" (PHP script) is still running
and keeping the session locked, so any further requests from the client
just "hang".
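
The session-lock effect described above can be illustrated outside PHP. This is a minimal sketch in Python, not PHP's actual session handler: it assumes only that the session file is protected by an exclusive flock(), as stated earlier. The function names and the "sess_demo_" temp-file prefix are hypothetical, and two separate open() calls stand in for two Apache workers.

```python
import fcntl
import os
import tempfile


def second_lock_is_blocked(path):
    """Return True if a second exclusive flock() on `path` would block.

    Two separate open() calls stand in for two workers handling requests
    that share one session file.
    """
    first = open(path, "w")    # stands in for the stuck worker
    second = open(path, "w")   # stands in for the worker serving the next request
    try:
        fcntl.flock(first, fcntl.LOCK_EX)
        try:
            # LOCK_NB makes flock() fail immediately instead of waiting.
            fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        return False
    finally:
        second.close()
        first.close()


def demo():
    """Run the check against a throwaway file and clean up afterwards."""
    fd, path = tempfile.mkstemp(prefix="sess_demo_")
    os.close(fd)
    try:
        return second_lock_is_blocked(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("second request blocked:", demo())
```

Without LOCK_NB the second flock() would simply wait until the first holder releases the lock, which matches the "hang" seen by the browser.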

I did a wireshark capture on the client while executing the test case,
and here is an excerpt (I could probably sanitize out passwords etc
and provide a full .pcap file, should that be necessary):

(10.0.0.43 is the server, 10.0.0.138 is the client, '>' marks the most 
interesting packets)

 596   9.729025   10.0.0.138 -> 10.0.0.43    TCP 2623 80 2623 > 80 [SYN] Seq=0 Len=0 MSS=1460
 600   9.751507    10.0.0.43 -> 10.0.0.138   TCP 80 2623 80 > 2623 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1460
 601   9.751546   10.0.0.138 -> 10.0.0.43    TCP 2623 80 2623 > 80 [ACK] Seq=1 Ack=1 Win=64512 Len=0
 602   9.760570   10.0.0.138 -> 10.0.0.43    HTTP 2623 80 GET /noimg.php?i=45 HTTP/1.1
 603   9.762774    10.0.0.43 -> 10.0.0.138   TCP 80 2623 80 > 2623 [ACK] Seq=1 Ack=464 Win=6432 Len=0
 604   9.803347    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP segment of a reassembled PDU]
 605   9.810978    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP segment of a reassembled PDU]
 606   9.811029   10.0.0.138 -> 10.0.0.43    TCP 2623 80 2623 > 80 [ACK] Seq=464 Ack=2921 Win=64512 Len=0
>607   9.813999   10.0.0.138 -> 10.0.0.43    TCP 2623 80 2623 > 80 [FIN, ACK] Seq=464 Ack=2921 Win=64512 Len=0
 608   9.814405   10.0.0.138 -> 10.0.0.43    TCP 2624 80 2624 > 80 [SYN] Seq=0 Len=0 MSS=1460
 609   9.819676    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP segment of a reassembled PDU]
>610   9.819725   10.0.0.138 -> 10.0.0.43    TCP 2623 80 2623 > 80 [RST, ACK] Seq=465 Ack=4381 Win=0 Len=0
 611   9.826157    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP segment of a reassembled PDU]
 612   9.859980    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Previous segment lost] 80 > 2623 [ACK] Seq=7301 Ack=465 Win=6432 Len=0
>613  10.072413    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>614  10.568147    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>620  11.558401    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
 623  12.785586   10.0.0.138 -> 10.0.0.43    TCP 2624 80 2624 > 80 [SYN] Seq=0 Len=0 MSS=1460
 624  12.786740    10.0.0.43 -> 10.0.0.138   TCP 80 2624 80 > 2624 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1460
 625  12.786768   10.0.0.138 -> 10.0.0.43    TCP 2624 80 2624 > 80 [ACK] Seq=1 Ack=1 Win=64512 Len=0
 626  12.789319   10.0.0.138 -> 10.0.0.43    HTTP 2624 80 GET /noimg.php?i=46 HTTP/1.1
 627  12.790579    10.0.0.43 -> 10.0.0.138   TCP 80 2624 80 > 2624 [ACK] Seq=1 Ack=464 Win=6432 Len=0
>628  13.569077    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>637  17.570187    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>638  25.575101    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>639  41.563376    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>650 137.533500    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>651 257.507283    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]
>652 377.499848    10.0.0.43 -> 10.0.0.138   TCP 80 2623 [TCP Retransmission] [TCP segment of a reassembled PDU]

(then things "unlock" and proceed as normal for a while)

The interesting part to note here is that the client sends a FIN, ACK, then 
an RST,ACK, and then stays completely silent on port 2623.
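
The spacing of the marked retransmissions can be checked with a few lines of Python. This is only a sketch: the timestamps are copied from packets 613 through 652 in the excerpt, and the names RETRANSMIT_TIMES and gaps are made up for illustration.

```python
# Capture timestamps (seconds) of the retransmissions marked '>' in the
# excerpt, packets 613 through 652.
RETRANSMIT_TIMES = [
    10.072413, 10.568147, 11.558401, 13.569077, 17.570187,
    25.575101, 41.563376, 137.533500, 257.507283, 377.499848,
]


def gaps(times):
    """Return the interval between each pair of consecutive timestamps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in gaps(RETRANSMIT_TIMES):
        print(f"{gap:8.3f} s")
```

The first six gaps roughly double each time, from about 0.5 seconds to about 16 seconds, which is the usual exponential backoff of the TCP retransmission timer. The last two gaps are about 120 seconds each, consistent with the 120-second upper bound that Linux places on the retransmission timeout. The gap of about 96 seconds in between would fit a 32-second and a 64-second interval if one retransmission fell among the packets not shown in the excerpt; that is a guess, not something the capture confirms.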

netstat -n on the client shows:
  TCP   10.0.0.138:2624    10.0.0.43:80     ESTABLISHED

netstat -np on the server shows:
tcp        0      0 10.0.0.43:80            10.0.0.138:2624         ESTABLISHED  6080/httpd
tcp        1  10220 10.0.0.43:80            10.0.0.138:2623         CLOSE_WAIT   6075/httpd

In other words, the client has completely "forgotten" the port-2623 connection,
but the server still knows about it. 
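
To spot such connections on a busy server, the netstat output can be filtered for CLOSE_WAIT sockets that still have data queued for sending. The sketch below assumes the Linux "netstat -np" column order (Proto, Recv-Q, Send-Q, local address, foreign address, state, PID/program); the function name stuck_sockets is hypothetical, and SAMPLE is the server output quoted above.

```python
def stuck_sockets(netstat_output):
    """Return (foreign_address, send_q, program) for CLOSE_WAIT sockets
    that still have unacknowledged bytes in the send queue.

    Assumes Linux `netstat -np` column order:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program
    """
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines and non-TCP entries.
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        send_q, foreign = int(fields[2]), fields[4]
        state, program = fields[5], fields[6]
        if state == "CLOSE_WAIT" and send_q > 0:
            found.append((foreign, send_q, program))
    return found


SAMPLE = """\
tcp        0      0 10.0.0.43:80            10.0.0.138:2624         ESTABLISHED  6080/httpd
tcp        1  10220 10.0.0.43:80            10.0.0.138:2623         CLOSE_WAIT   6075/httpd
"""

if __name__ == "__main__":
    for foreign, send_q, program in stuck_sockets(SAMPLE):
        print(foreign, send_q, program)
```

On the sample, only the port-2623 connection held by pid 6075 matches: it is in CLOSE_WAIT (the client's FIN was received) while 10220 bytes are still waiting to be acknowledged.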

Attaching to pid 6075 with gdb and running a stacktrace shows:

#0  0x401fba18 in poll () from /lib/libc.so.6
#1  0x40093c78 in apr_wait_for_io_or_timeout (f=0x0, s=0x817f7c0, for_read=0) at support/unix/waitio.c:51
#2  0x4008efef in apr_socket_sendv (sock=0x817f7c0, vec=0xbfffbbf8, nvec=3, len=0xbfffbab8) at network_io/unix/sendrecv.c:208
#3  0x08073642 in writev_it_all (s=0x817f7c0, vec=0xbfffbbf0, nvec=4, len=8074, nbytes=0xbfffbb48) at core_filters.c:321
#4  0x08073fea in ap_core_output_filter (f=0x817fdd0, b=0x8185dd8) at core_filters.c:868
#5  0x0807ef11 in ap_pass_brigade (next=0x7531, bb=0x1) at util_filter.c:526
#6  0x0808f91a in ap_http_chunk_filter (f=0x8185f88, b=0x8185dd8) at chunk_filter.c:187
#7  0x0807ef11 in ap_pass_brigade (next=0x7531, bb=0x1) at util_filter.c:526
#8  0x0807ef11 in ap_pass_brigade (next=0x7531, bb=0x1) at util_filter.c:526
#9  0x080693ae in ap_content_length_filter (f=0x81927a8, b=0x8185dd8) at protocol.c:1338
#10 0x0807ef11 in ap_pass_brigade (next=0x7531, bb=0x1) at util_filter.c:526
#11 0x40048294 in apr_brigade_write (b=0x8185dd8, flush=0x807f050 <ap_filter_flush>, ctx=0xfffffffc, str=0x410e38a8 "218: asdf asdf asdf asdf asdfasdf asdf asdf asdf asdf asdf asdf asdf asdf\n", nbyte=74) at buckets/apr_brigade.c:400
#12 0x080697cd in buffer_output (r=0x8191a20, str=0x410e38a8 "218: asdf asdf asdf asdf asdfasdf asdf asdf asdf asdf asdf asdf asdf asdf\n", len=74) at protocol.c:1455
#13 0x080698db in ap_rwrite (buf=0xfffffffc, nbyte=74, r=0x7531) at protocol.c:1490
#14 0x40635137 in php_apache_sapi_ub_write (str=0xfffffffc <Address 0xfffffffc out of bounds>, str_length=74) at /devel2/x2www/src/php-5.2.0/sapi/apache2handler/sapi_apache2.c:78

(I won't bore the list with the output of "bt full", but you can get that 
at http://corehacker.com/~frode/apache-user/pollhang-bt-full.txt)

So, apache is stuck in a poll() apparently waiting for the client to suck 
down whatever apache wants to write, but the client is long gone
and we have to wait for the poll() to completely time out before the
worker is freed.
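
The wait in frame #1 of the backtrace can be imitated with a local socket pair. This is a rough analogy under stated assumptions, not Apache's code: a peer that never reads stands in for the vanished client, select() stands in for the poll() call, and the function name writer_times_out and the 0.2-second timeout are invented for the example.

```python
import select
import socket


def writer_times_out(wait_seconds=0.2):
    """Fill a socket's send buffer while the peer never reads, then wait
    for writability. Returns True if the wait timed out."""
    writer, silent_peer = socket.socketpair()
    try:
        writer.setblocking(False)
        chunk = b"x" * 65536
        try:
            while True:
                writer.send(chunk)   # succeeds until the buffers are full
        except BlockingIOError:
            pass                     # nothing more can be queued
        # Analogous to the poll() in the backtrace: wait until the socket
        # is writable again, or until the timeout expires.
        _, writable, _ = select.select([], [writer], [], wait_seconds)
        return not writable
    finally:
        writer.close()
        silent_peer.close()


if __name__ == "__main__":
    print("write wait timed out:", writer_times_out())
```

As long as nothing drains the buffer and no error is reported on the socket, the writer can only wait for its timeout. In httpd that timeout is presumably the one set by the Timeout directive, which would match the roughly five-minute hang if the 2.2 default is in effect.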

Interestingly enough, if the windows firewall is disabled on 
the client, there is no such long hang, because the client sends RST packets
for each "[TCP Retransmission]" packet, so the socket closes down almost
immediately on the server as well.

I've reproduced exactly the same effect when running httpd 2.2.2 on FreeBSD 6.1,
and also on httpd 1.3.34 (although this seemed to detect the closed client socket quicker)
on the same Linux box. 

I failed to reproduce the effect when running the win xp sp2 + firefox + firewall
client setup inside vmware, strangely enough.

Does anyone have any tips on how to mitigate this problem (besides the obvious
fix of "don't return text/html when the client sends Accept: image/png")? 

I don't think "disable the client firewall" is a realistic answer for a 
public-facing web site. Anyway, I've gotten reports that disabling the firewall greatly 
improves things but the occasional hang still occurs. 

Also, isn't this sort of a weakness that makes it fairly easy to create a
Denial of Service situation by "eating up" all workers with little effort?




---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org

