tomcat-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From d...@hornig.net (Douglas E. Hornig)
Subject Re: Solaris Performance Problem
Date Thu, 14 Jun 2001 10:31:30 GMT
For those of you who remember my question from several weeks ago regarding a response time
delay between tomcat running standalone on Solaris and a Windows client.  I modified my own
tomcat 3.2.2 as Gal suggested and the delay is gone.

I'll leave this to the pros to decide how to incorporate this into tomcat properly.

Thanks to everyone who helped me out with this problem (especially Gal!)

Doug


--- Forwarded message from Gal Shalif - Sun Israel - Software Engineer ---

>From: Gal Shalif - Sun Israel - Software Engineer <gal.shalif@israel.sun.com>
>Date: Sun, 10 Jun 2001 14:47:44 +0300 (IDT)
>To: "Douglas E. Hornig" <doug@hornig.net>
>Subject: Re: Solaris Performance Problem

Hello,

It should be fixed with a call to Socket.setTcpNoDelay() on the Tomcat
server socket.
A C program that I wrote had the same problem as the Tomcat server.
and the problem was fixed with the setting of the TCP NODELAY
transmission.

Anybody there who volunteer to add this option to the Tomcat Java
server code?

My analysis of the problem is attached at the end of this mail.

> >From: doug@hornig.net (Douglas E. Hornig)
> Subject: Solaris Performance Problem
> To: tomcat-user@jakarta.apache.org
> 
> I have run into a strance performance problem with Tomcat that I am at a loss to figure
out.  In a nutshell, when I run Tomcat (version 3.2.1, standalone) on a Solaris Sparc server
and hit it with a Windows client, the minimum response time is is abount 0.2 seconds.  This
is for a simple servlet that just returns some static data.  That may not sound like a lot
of time but if a client makes several requests to the server it can add up fast.  If I run
the server on a Linux PC that number is about 0.01 seconds or less.  I also get good performce
if I use a Linux client and hit the Solaris server.
> 
> I have tried using different PC clients (different PCs, one running NT 4 and one with
Windows 2000), different test programs (one in Java and one in Visual Basic) and it makes
no difference.
> 
> I have tried using different Solaris servers, a 420R and an Ultra 5, neither with any
kind of load.  I have tried different JDKs (1.2.2 and 1.3.0).  No difference.
> 
> I wrote a quickie standalone Java server for the Sparcs and their response time was very
fast (< 0.01 secs).  The server reads a request, sends a reply, and closes the socket much
like a web server would.
> 
> So the problem combination seems to be Solaris server with Tomcat and a PC client.  I
know this sounds nuts but it's  totally repeatable and a very serious problem.  Does anyone
have an idea what I can do to get to the bottom of this?
> 
> Thanks a lot.
> 
> Douglas Hornig
> Dartmouth-Hitchcock Medical Center
> Lebanon, NH


----------------------------------------------------
slow response time with Solaris 8 server and 
Windows 2000 TCP client as tested with Tomcat and IE
----------------------------------------------------

Problem description:
--------------------
  Tomcat Hello World examples (/examples/servlet/HelloWorldExample) 
  divide the response into two parts:

    HTTP protocol header
    HTML data
  
  At the Solaris TCP level, the second part of the response will not be send
  till the Windows client acknowledge (ACK) the first part.
  The result: a delay of ~170ms (dependent on client/server configuration)
  
  The configuration in problem:
  -----------------------------
    Solaris 8 running Tomcat 3.2.1 server
    Windows 2000 client machine running a web client

  And the raw data:
  -----------------
    The output was gathered with:
      truss -fdDla -o /tmp/tmp.dat -p 5732
    On a Solaris machine:
        SunOS node2mde 5.8 Generic_108528-05 sun4u sparc SUNW,Ultra-60
        root nexus = Sun Enterprise 220R (2 X UltraSPARC-II 450MHz)

    Tomcat server (on the Solaris 8) send two sends within a few milliseconds:

      time stamp    duration    system-call and its arguments
       (seconds)    (seconds)

      100759.3625    0.0009     send(14, "HTTP/1.0 200"  ...., 179, 0)  = 179
      100759.3629    0.0004     send(14, "<html>\n<head" ...., 400, 0)  = 400

    The Windows 2000 client receive the second part about 170ms after the first
    The delay the transfer of the second "send" till it receive an acknowledge
    for the first.

    
  
Analyze:
--------
  This is a generic Solaris server Windows client TCP problem that
  is not specific to Tomcat. The originate of the problem is probably
  an obscure Windows TCP implementation, however, the Solaris TCP 
  implementation should be able to recognize a Windows client on its LAN
  and to modify its TCP parameters accordingly to improve performance.

  The Solaris OS try to optimize the transfer of data to its client by
  combining a few sends/write of small amount of data each into one
  data transfer. If the client machine is slow to acknowledge the first
  transfer (i.e. Windows client), then the inevitable consequence 
  is a "delay transfer": non of the small bits of data
  will be transferred to the client till the Solaris gather enough
  small bits for one big transfer or till the client acknowledge the
  first transfer.
  This optimization is good for connections with high latency (Internet, 
  WAN) that has a big overhead per one transfer but is very bad for LAN 
  where the latency is small and the overhead of one transfer is small
  and the client machine is slow to acknowledge the first transfer.

  The bottom line is:
    When the latency is small (LAN) and the client 
    machine run windows (NT, Win2000) then the "delay transfer" 
    optimization introduce delays and the TCP default should be set
    to "no delay".

Fix:
----
  For C/C++ programs, the server socket option should be set
  to TCP_NODELAY.
  Example:
    int just_say_no = 1;
    setsockopt(s, IPPROTO_TCP, TCP_NODELAY, (void *)&just_say_no, sizeof(int));
  For Java programs (i.e. Tomcat) , a similar attribute should be used.
  For more information see: Socket.setTcpNoDelay()


Solaris 8 "man tcp" page about TCP_NODELAY:
-------------------------------------------

     Under  most  circumstances,  TCP  sends  data  when  it   is
     presented.  When  outstanding  data  has  not  yet been ack-
     nowledged,
      TCP gathers small amounts of output to be sent in a  single
     packet  once  an  acknowledgment  has  been received. For a
     small number of clients, such as window systems that send  a
     stream  of mouse events which receive no replies, this pack-
     etization may cause significant delays. To  circumvent  this
     problem,    TCP  provides  a  socket-level  boolean  option,
     TCP_NODELAY. TCP_NODELAY is defined in <netinet/tcp.h>,  and
     -----------
     is   set   with    setsockopt(3SOCKET)   and   tested   with
     getsockopt(3SOCKET). The option level for  the  setsockopt()
     call   is  the  protocol  number  for  TCP,  available  from
     getprotobyname(3SOCKET).


And Java documentation about TCP_NODELAY (from the Java 1.3.1 docs):
--------------------------------------------------------------------
  http://java.sun.com/j2se/1.3/docs/api/java/net/SocketOptions.html


    TCP_NODELAY
    
    public static final int TCP_NODELAY
    
          Disable Nagle's algorithm for this connection. Written data to the
          network is not buffered pending acknowledgment of previously
          written data.
    
          Valid for TCP only: SocketImpl. 
    
          See Also: 
                Socket.setTcpNoDelay(boolean), Socket.getTcpNoDelay()




APPENDIX A: Win200 ie client and Solaris Tomcat Server
------------------------------------------------------

  Tomcat on Solaris 8 server (Ultra10) and a Win200 client (laptop).
  A 200ms delay beween the Solaris send of the HTTP header and the
  Win200 acknoledge: the Win200 client ACK the first Tomcat
  transfer (HTTP header - HTTP/1.0 200 OK ...) with 200ms delay.

  Here is a snoop of the transfers:
  ---------------------------------

    capture:
    --------

    # snoop -o /tmp/ariel.snoop -d hme0 host ariel
   
    summary:
    --------

    # snoop -t a -S -i /tmp/ariel.snoop '(host ariel and host galfield)'

       1 16:32:45.02819        ariel -> galfield     length:   62  HTTP (proxy) C port=2076

       2 16:32:45.02822     galfield -> ariel        length:   62  HTTP (proxy) R port=2076

       3 16:32:45.02847        ariel -> galfield     length:   60  HTTP (proxy) C port=2076

       4 16:32:45.02933        ariel -> galfield     length:  420  HTTP GET /examples/servlet/HelloWorldExample
HTTP/1.1

       5 16:32:45.02943     galfield -> ariel        length:   54  HTTP (proxy) R port=2076

       6 16:32:45.03498     galfield -> ariel        length:  226  HTTP HTTP/1.0 200 OK

   ->  7 16:32:45.23190        ariel -> galfield     length:   60  HTTP (proxy) C port=2076

       8 16:32:45.23194     galfield -> ariel        length:  454  HTTP (body)
       9 16:32:45.23249        ariel -> galfield     length:   60  HTTP (proxy) C port=2076

      10 16:32:45.24947        ariel -> galfield     length:   60  HTTP (proxy) C port=2076

      11 16:32:45.24950     galfield -> ariel        length:   54  HTTP (proxy) R port=2076

    # 
    
APPENDIX B: NT netscape client and Solaris Tomcat Server
--------------------------------------------------------

    Simmilar to Win200 ie client.


# snoop -t a -S -i /tmp/karish.snoop
       1 19:24:10.87380       karish -> galfield     length:   60  HTTP (proxy) C port=1469

       2 19:24:10.87384     galfield -> karish       length:   58  HTTP (proxy) R port=1469

       3 19:24:10.87402       karish -> galfield     length:   60  HTTP (proxy) C port=1469

       4 19:24:10.87735       karish -> galfield     length:  373  HTTP GET /examples/servlet/HelloWorldExample
HTTP/1.0

       5 19:24:10.87745     galfield -> karish       length:   54  HTTP (proxy) R port=1469

       6 19:24:10.88701     galfield -> karish       length:  226  HTTP HTTP/1.0 200 OK

   ->  7 19:24:11.03993       karish -> galfield     length:   60  HTTP (proxy) C port=1469

       8 19:24:11.03996     galfield -> karish       length:  454  HTTP (body)
       9 19:24:11.04024       karish -> galfield     length:   60  HTTP (proxy) C port=1469

      10 19:24:11.09155       karish -> galfield     length:   60  HTTP (proxy) C port=1469

      11 19:24:11.09158     galfield -> karish       length:   54  HTTP (proxy) R port=1469



APPENDIX C: Solaris netscape client and Solaris Tomcat Server
-------------------------------------------------------------

  Tomcat on Solaris 8 server (Ultra10) and a Solaris 2.6 client (Ultra1).
  The delay does not exist: the Solaris client ACK the first Tomcat
  transfer (HTTP header - HTTP/1.0 200 OK ...) without delay and the 
  second transfer (HTML body) follow emediately.
  The output from snoop proof it.

  Here is a snoop of the transfers:
  ---------------------------------

    capture:
    --------

    # snoop -o /tmp/galfield.snoop -d hme0 host emek
   
    summary:
    --------

    # snoop -t a -S -i /tmp/galfield.snoop '(host emek and host galfield)'

      1 18:29:7.79029         emek -> galfield     length:   60  HTTP (proxy) C port=50570

      2 18:29:7.79098     galfield -> emek         length:   58  HTTP (proxy) R port=50570

      3 18:29:7.79133         emek -> galfield     length:   60  HTTP (proxy) C port=50570

      4 18:29:7.79853         emek -> galfield     length:  461  HTTP GET /examples/servlet/HelloWorldExample
HTTP/1.0

      5 18:29:7.79864     galfield -> emek         length:   54  HTTP (proxy) R port=50570

      6 18:29:7.80270     galfield -> emek         length:  226  HTTP HTTP/1.0 200 OK

  ->  7 18:29:7.80353         emek -> galfield     length:   60  HTTP (proxy) C port=50570

      8 18:29:7.80356     galfield -> emek         length:  454  HTTP (body)
      9 18:29:7.81623         emek -> galfield     length:   60  HTTP (proxy) C port=50570

     10 18:29:7.87945         emek -> galfield     length:   60  HTTP (proxy) C port=50570

     11 18:29:7.87948     galfield -> emek         length:   54  HTTP (proxy) R port=50570

    # 
    

APPENDIX D: Windows NT netscape client and Solaris simple web Server
--------------------------------------------------------------------

  The delay exist with the default TCP options (TCP_NODELAY is 0)
  but disapear when the server socket option TCP_NODELAY is set to 1:
    int just_say_no = nodelay;
    setsockopt(s, IPPROTO_TCP, TCP_NODELAY, (void *)&just_say_no, sizeof(int));

  Here is a snoop of the transfers:
  ---------------------------------

    with the NODELAY on (default setting):
    --------------------------------------

       1 12:06:6.56395       karish -> galfield     length:   60  INGRESLOCK R port=1102

       2 12:06:6.56399     galfield -> karish       length:   58  INGRESLOCK C port=1102

       3 12:06:6.56417       karish -> galfield     length:   60  INGRESLOCK R port=1102

       4 12:06:6.57547       karish -> galfield     length:  355  INGRESLOCK R port=1102
GET /examples/servle
       5 12:06:6.57557     galfield -> karish       length:   54  INGRESLOCK C port=1102

       6 12:06:6.57579     galfield -> karish       length:  226  INGRESLOCK C port=1102
HTTP/1.0 200 OK\r\nCon
   ->  7 12:06:6.68511       karish -> galfield     length:   60  INGRESLOCK R port=1102

       8 12:06:6.68514     galfield -> karish       length:  448  INGRESLOCK C port=1102
<html>\n<head>\n<title
       9 12:06:6.68542       karish -> galfield     length:   60  INGRESLOCK R port=1102

      10 12:06:6.71675       karish -> galfield     length:   60  INGRESLOCK R port=1102

      11 12:06:6.71677     galfield -> karish       length:   54  INGRESLOCK C port=1102


    with the NODELAY off:
    ---------------------

       1 12:09:51.75895       karish -> galfield     length:   60  TCP D=1102 S=1526 Syn
Seq=2670688616 Len=0 Win=8192 Options=<mss 1460>
       2 12:09:51.75899     galfield -> karish       length:   58  TCP D=1526 S=1102 Syn
Ack=2670688617 Seq=197375344 Len=0 Win=24820 Options=<mss 1460>
       3 12:09:51.75917       karish -> galfield     length:   60  TCP D=1102 S=1526  
  Ack=197375345 Seq=2670688617 Len=0 Win=8760
       4 12:09:51.76217       karish -> galfield     length:  373  TCP D=1102 S=1526  
  Ack=197375345 Seq=2670688617 Len=319 Win=8760
       5 12:09:51.77097     galfield -> karish       length:   54  TCP D=1526 S=1102  
  Ack=2670688936 Seq=197375345 Len=0 Win=24820
       6 12:09:51.77123     galfield -> karish       length:  226  TCP D=1526 S=1102  
  Ack=2670688936 Seq=197375345 Len=172 Win=24820
       7 12:09:51.77131     galfield -> karish       length:  448  TCP D=1526 S=1102  
  Ack=2670688936 Seq=197375517 Len=394 Win=24820
       8 12:09:51.77147     galfield -> karish       length:   54  TCP D=1526 S=1102 Fin
Ack=2670688936 Seq=197375911 Len=0 Win=24820
       9 12:09:51.77155       karish -> galfield     length:   60  TCP D=1102 S=1526  
  Ack=197375911 Seq=2670688936 Len=0 Win=8194
      10 12:09:51.77160       karish -> galfield     length:   60  TCP D=1102 S=1526  
  Ack=197375912 Seq=2670688936 Len=0 Win=8194
      11 12:09:51.81374       karish -> galfield     length:   60  TCP D=1102 S=1526 Fin
Ack=197375912 Seq=2670688936 Len=0 Win=8194
      12 12:09:51.81377     galfield -> karish       length:   54  TCP D=1526 S=1102  
  Ack=2670688937 Seq=197375912 Len=0 Win=24820


Other Resources:
----------------
* Apache setting of the server socket options:
  http://httpd.apache.org/dist/apache_1.3.19.tar.gz
     /src/main/http_main.c

Mime
View raw message