tomcat-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From bugzi...@apache.org
Subject DO NOT REPLY [Bug 28259] New: - Synchronous cluster very slow - select() bug
Date Wed, 07 Apr 2004 12:38:08 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=28259>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=28259

Synchronous cluster very slow - select() bug

           Summary: Synchronous cluster very slow - select() bug
           Product: Tomcat 5
           Version: 5.0.21
          Platform: Sun
        OS/Version: Solaris
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Catalina:Cluster
        AssignedTo: tomcat-dev@jakarta.apache.org
        ReportedBy: rainer.jung@kippdata.de
                CC: rainer.jung@kippdata.de


In a TC cluster with synchronous replication there is a bug in using select() 
and registering operations on selectors.

If I use a high tcpSelectorTimeout (e.g. 1000ms) I can observe the secondary 
node waiting in the midlle of a request until the timeout occurs, although 
replication data is available. As a consequence requests take 1 second to work 
on, although only the TC instances need only 20ms to process the request.
Nearly 1 second is spend waiting for select() to return in the middle of 
processing the request.

I assume this is a bug in the logic handling the registration of operations on 
the selector?

It is not a problem of high load, one can observe using single requests,
but it is necessary to use session atributes in the request.

I expect the same problem to occur in pooled mode. The problem does not 
increase response times in asynchronous mode, but I expect, that it will
increase replication time.

Here is a protocol of what is happening. Most of these log lines are non 
standard, I added them to the code, to find out the flow of control in this 
situation. The intersting thing happens between step 5 and 6.

1) Node 1: Request coming in at 12:39:32,563 :

2004-04-07 12:39:32,563 (http-21080-Processor25)
org.apache.catalina.core.StandardWrapper/DEBUG:   Returning non-STM instance
2004-04-07 12:39:32,564 (http-21080-Processor25)
org.apache.jasper.servlet.JspServlet/DEBUG: JspEngine --> /repit.jsp
2004-04-07 12:39:32,564 (http-21080-Processor25)
org.apache.jasper.servlet.JspServlet/DEBUG:         ServletPath: /repit.jsp
2004-04-07 12:39:32,564 (http-21080-Processor25)
org.apache.jasper.servlet.JspServlet/DEBUG:            PathInfo: null
2004-04-07 12:39:32,564 (http-21080-Processor25)
org.apache.jasper.servlet.JspServlet/DEBUG:            RealPath:
/opt/iob3_tomcat_a/webapps/ROOT/repit.jsp
2004-04-07 12:39:32,564 (http-21080-Processor25)
org.apache.jasper.servlet.JspServlet/DEBUG:          RequestURI: /repit.jsp
2004-04-07 12:39:32,564 (http-21080-Processor25)
org.apache.jasper.servlet.JspServlet/DEBUG:         QueryString: null
2004-04-07 12:39:32,564 (http-21080-Processor25)
org.apache.jasper.servlet.JspServlet/DEBUG:      Request Params:

2) Node 1: New session message EVT_SESSION_CREATED is being send at 
12:39:32,567:

2004-04-07 12:39:32,566 (http-21080-Processor25)
org.apache.catalina.cluster.session.DeltaManager/DEBUG: Sending 
EVT_SESSION_CREATED:
Sending creation of Session with  id=CD25AAD154CC01843779EA36B143D101.mkbtomcat1
isValid=true isNew=true accessCount=0 maxInactiveInterval=900 
isPrimarySession=true
creationTime=1081334372565 lastAccessedTime=1081334372565
lastTimeReplicated=1081334372565
2004-04-07 12:39:32,566 (http-21080-Processor25)
org.apache.catalina.cluster.tcp.SimpleTcpCluster/DEBUG: Starting
CatalinaCluster.send for msg 1 session 
CD25AAD154CC01843779EA36B143D101.mkbtomcat1
and destination null

3) Node 2: Message is being received at 12:39:32,568, sending ack at 
12:39:32,573:

2004-04-07 12:39:32,568 (ClusterReceiver)
org.apache.catalina.cluster.tcp.ReplicationListener/DEBUG: ReplicationListener
listen in while doListen after select returned n=1
2004-04-07 12:39:32,568 (ClusterReceiver)
org.apache.catalina.cluster.tcp.ReplicationListener/DEBUG: ReplicationListener 
woke
up in listen
2004-04-07 12:39:32,568 (ClusterReceiver)
org.apache.catalina.cluster.tcp.ReplicationListener/DEBUG: ReplicationListener
calling serviceChannel from readDataFromSocket in TcpReplicationThread
2004-04-07 12:39:32,568 (ClusterReceiver)
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
calling notify in serviceChannel
2004-04-07 12:39:32,568 (ClusterReceiver)
org.apache.catalina.cluster.tcp.ReplicationListener/DEBUG: ReplicationListener
listen in while doListen before select with tcpSelectorTimeout=1000
2004-04-07 12:39:32,569 (p.TcpReplicationThread[2])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
woke up with key sun.nio.ch.SelectionKeyImpl@b3f9b8
2004-04-07 12:39:32,569 (p.TcpReplicationThread[2])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
woke up and calling drainChannel for key sun.nio.ch.SelectionKeyImpl@b3f9b8
2004-04-07 12:39:32,569 (p.TcpReplicationThread[2])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
Starting drainChannel for key sun.nio.ch.SelectionKeyImpl@b3f9b8
2004-04-07 12:39:32,569 (p.TcpReplicationThread[2])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
drainChannel inner loop has count 401 and pkgcnt 1
2004-04-07 12:39:32,571 (p.TcpReplicationThread[2])
org.apache.catalina.cluster.session.DeltaManager/DEBUG: Manager () Received
SessionMessage of type=SESSION-MODIFIED from
org.apache.catalina.cluster.mcast.McastMember
[tcp://10.254.65.100:5002,10.254.65.100,5002,
alive=230840]
2004-04-07 12:39:32,572 (p.TcpReplicationThread[2])
org.apache.catalina.cluster.session.DeltaManager/DEBUG: Created a DeltaSession 
with
Id[A981679B5BD40B418ED57F874DF8E1ED.mkbtomcat2] Total count=5
2004-04-07 12:39:32,573 (p.TcpReplicationThread[2])
org.apache.catalina.cluster.session.DeltaManager/DEBUG: Receiving
EVT_SESSION_CREATED: Overwriting new Session with 
id=CD25AAD154CC01843779EA36B143D101.mkbtomcat1 isValid=true isNew=false
accessCount=0 maxInactiveInterval=900 isPrimarySession=false
creationTime=1081334372572 lastAccessedTime=1081334372572
lastTimeReplicated=1081334372571
2004-04-07 12:39:32,573 (p.TcpReplicationThread[2])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
drainChannel has pkgcnt 1
2004-04-07 12:39:32,573 (p.TcpReplicationThread[2])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
drainChannel sending ack for pkgcnt 1

Important: ClusterReceiver calls select() again at 12:39:32,568. This is the 
place
where now all the time goes by.

4) Node 1: Received ack and proceeds with session.access() at 12:39:32,573. Ends
session.access() at 12:39:32,574:

2004-04-07 12:39:32,573 (http-21080-Processor25)
org.apache.catalina.cluster.session.DeltaManager/DEBUG: Created a DeltaSession 
with
Id[CD25AAD154CC01843779EA36B143D101.mkbtomcat1] Total count=5
2004-04-07 12:39:32,574 (http-21080-Processor25)
org.apache.catalina.cluster.session.DeltaManager/INFO : End of session.access in
id=CD25AAD154CC01843779EA36B143D101.mkbtomcat1 isNew=false
lastAccessedTime=1081334372565 thisAccessedTime=1081334372574 accessCount=1

5) Node 1: Proceeds working on request until Delta message of type 
EVT_SESSION_DELTA
 is being sent at 12:39:32,578:

2004-04-07 12:39:32,576 (http-21080-Processor25)
org.apache.catalina.cluster.tcp.ReplicationValve/DEBUG: Invoking replication 
request
on /repit.jsp
2004-04-07 12:39:32,577 (http-21080-Processor25)
org.apache.catalina.cluster.session.DeltaManager/DEBUG: DeltaManager request
Completed sending EVT_SESSION_DELTA  
id=CD25AAD154CC01843779EA36B143D101.mkbtomcat1
isValid=true isNew=false accessCount=1 maxInactiveInterval=900 
isPrimarySession=true
creationTime=1081334372565 lastAccessedTime=1081334372565
lastTimeReplicated=1081334372565
2004-04-07 12:39:32,577 (http-21080-Processor25)
org.apache.catalina.cluster.session.DeltaManager/DEBUG: DeltaManager request
Completed returns with msg non null
2004-04-07 12:39:32,577 (http-21080-Processor25)
org.apache.catalina.cluster.tcp.SimpleTcpCluster/DEBUG: Starting
CatalinaCluster.send for msg 13 session 
CD25AAD154CC01843779EA36B143D101.mkbtomcat1
and destination null
2004-04-07 12:39:32,577 (http-21080-Processor25)
org.apache.catalina.cluster.tcp.SimpleTcpCluster/DEBUG: Setting msgID in
CatalinaCluster.send for msg 13 session 
CD25AAD154CC01843779EA36B143D101.mkbtomcat1
and destination null to ABC9CBA
2004-04-07 12:39:32,578 (http-21080-Processor25)
org.apache.catalina.cluster.tcp.SimpleTcpCluster/DEBUG: Calling
clusterSender.sendMessage for msg 13 session
CD25AAD154CC01843779EA36B143D101.mkbtomcat1 ID ABC9CBA and destination NULL

6) Now there is no activity for 993 milliseconds - which is the problem - until 
1003
milliseconds after the last call to select() in ClusterReceiver on node 2, 
although
there is one message of type EVT_SESSION_DELTA  waiting to be read. Finally the
select returns at 12:39:33,571 but with n=0 and another immediate call to select
()
returns instantaneously with n=1. From here on everything proceeds normal, the
message is being read and ack is sent at 12:39:33,577:

2004-04-07 12:39:33,571 (ClusterReceiver)
org.apache.catalina.cluster.tcp.ReplicationListener/DEBUG: ReplicationListener
listen in while doListen after select returned n=0
2004-04-07 12:39:33,571 (ClusterReceiver)
org.apache.catalina.cluster.tcp.ReplicationListener/DEBUG: ReplicationListener
listen in while doListen before select with tcpSelectorTimeout=1000
2004-04-07 12:39:33,571 (ClusterReceiver)
org.apache.catalina.cluster.tcp.ReplicationListener/DEBUG: ReplicationListener
listen in while doListen after select returned n=1
2004-04-07 12:39:33,571 (ClusterReceiver)
org.apache.catalina.cluster.tcp.ReplicationListener/DEBUG: ReplicationListener
wokeup in listen
2004-04-07 12:39:33,571 (ClusterReceiver)
org.apache.catalina.cluster.tcp.ReplicationListener/DEBUG: ReplicationListener
calling serviceChannel from readDataFromSocket in TcpReplicationThread
2004-04-07 12:39:33,571 (ClusterReceiver)
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
calling notify in serviceChannel
2004-04-07 12:39:33,571 (ClusterReceiver)
org.apache.catalina.cluster.tcp.ReplicationListener/DEBUG: ReplicationListener
listen in while doListen before select with tcpSelectorTimeout=1000
2004-04-07 12:39:33,571 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
woke up with key sun.nio.ch.SelectionKeyImpl@b3f9b8
2004-04-07 12:39:33,572 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
woke up and calling drainChannel for key sun.nio.ch.SelectionKeyImpl@b3f9b8
2004-04-07 12:39:33,572 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
Starting drainChannel for key sun.nio.ch.SelectionKeyImpl@b3f9b8
2004-04-07 12:39:33,572 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
drainChannel inner loop has count 540 and pkgcnt 1
2004-04-07 12:39:33,574 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.tcp.SimpleTcpCluster/DEBUG: Received in
clusterSender.messageDataReceived msg 13 session
CD25AAD154CC01843779EA36B143D101.mkbtomcat1
2004-04-07 12:39:33,574 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.session.DeltaManager/DEBUG: Manager () Received
SessionMessage of type=SESSION-DELTA from
org.apache.catalina.cluster.mcast.McastMember
[tcp://10.254.65.100:5002,10.254.65.100,5002,
alive=230851]
2004-04-07 12:39:33,576 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.session.DeltaManager/INFO : End of session.access in
id=CD25AAD154CC01843779EA36B143D101.mkbtomcat1 isNew=false
lastAccessedTime=1081334372572 thisAccessedTime=1081334373575 accessCount=1
2004-04-07 12:39:33,576 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.session.DeltaManager/INFO : End of 
session.endAccess in
id=CD25AAD154CC01843779EA36B143D101.mkbtomcat1 isNew=false
lastAccessedTime=1081334372572 thisAccessedTime=1081334373575 accessCount=0
2004-04-07 12:39:33,577 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.session.DeltaManager/DEBUG: Receiving 
EVT_SESSION_DELTA:
Updating Session with  id=CD25AAD154CC01843779EA36B143D101.mkbtomcat1 
isValid=true
isNew=false accessCount=0 maxInactiveInterval=900 isPrimarySession=false
creationTime=1081334372572 lastAccessedTime=1081334372572
lastTimeReplicated=1081334372571
2004-04-07 12:39:33,577 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
drainChannel has pkgcnt 1
2004-04-07 12:39:33,577 (p.TcpReplicationThread[1])
org.apache.catalina.cluster.tcp.TcpReplicationThread/DEBUG: TcpReplicationThread
drainChannel sending ack for pkgcnt 1

7) The ack is seen on node 1 at 12:39:33,579 and session.endAccess() ends the
processing of the request

2004-04-07 12:39:33,579 (http-21080-Processor25)
org.apache.catalina.cluster.session.DeltaManager/INFO : End of 
session.endAccess in
id=CD25AAD154CC01843779EA36B143D101.mkbtomcat1 isNew=false
lastAccessedTime=1081334372565 thisAccessedTime=1081334372574 accessCount=0

---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org


Mime
View raw message