activemq-dev mailing list archives

From Manuel Teira Paz <mte...@tid.es>
Subject tcp and nio transport considerations
Date Tue, 16 Sep 2008 15:13:20 GMT
Hello, I would like to share some thoughts and adventures about the tcp and 
nio transports for your consideration, and I would appreciate some feedback.

We are using ActiveMQ 4.1 compiled from the 4.1 svn branch. For some 
time we didn't run into any important problems, but lately we have been 
suffering from an issue with the tcp transport.

The problem arises when the tcp send buffer gets full during a 
TcpBufferedOutputStream.flush(). When this happens, and probably when 
all the consumers/producers are sharing the same connection, we run into 
a deadlock, since the socket OutputStream writes in blocking mode. 
Meanwhile, no reader that could drain some data from the socket to ease 
the situation is allowed to do its work, because it shares the same 
connection, which is locked in the write attempt. Do you agree with this 
analysis and with the chance that it could happen?
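
For what it's worth, below is a minimal, self-contained sketch of the 
pattern I believe we are hitting (all class and variable names are made up; 
this is not broker code): a writer blocks on a full TCP send buffer while 
holding the shared connection lock, so a reader that needs the same lock can 
never run and drain the socket.

import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class BlockingWriteDeadlockDemo {

    // Stands in for the lock that all producers/consumers on one connection share.
    private static final Object connectionLock = new Object();

    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0);
        final Socket client = new Socket("localhost", server.getLocalPort());
        Socket peer = server.accept();   // the peer never reads, so the buffers fill up

        Thread writer = new Thread(new Runnable() {
            public void run() {
                try {
                    OutputStream out = client.getOutputStream();
                    byte[] chunk = new byte[64 * 1024];
                    synchronized (connectionLock) {   // lock held for the whole flush
                        while (true) {
                            out.write(chunk);         // blocks once the send buffer is full
                            out.flush();
                        }
                    }
                } catch (Exception ignored) {
                }
            }
        }, "writer");

        Thread reader = new Thread(new Runnable() {
            public void run() {
                synchronized (connectionLock) {       // never acquired: the writer is stuck inside
                    System.out.println("reader got the lock");
                }
            }
        }, "reader");

        writer.start();
        Thread.sleep(2000);              // give the writer time to fill the buffer and block
        reader.start();
        reader.join(5000);
        System.out.println("reader still blocked after 5 seconds: " + reader.isAlive());
        peer.close();
        System.exit(0);
    }
}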

As a solution, nio with its non-blocking socket management, selectors and 
friends seemed the way to go. Unfortunately, the nio transport is not 
available in the 4.1 branch, but it was easily backported from the 
trunk. Trying to use it, some issues arose:

- Connection attempts timed out, and the whole system behaved erratically 
and unresponsively. There were no deadlocks, but one symptom was that 
transport.nio.SelectorSelection spent a lot of time waiting for the 
socketChannel.register call to complete in the SelectorSelection 
constructor.
  I don't know the exact reason, but it seems that SelectorWorker.run() 
monopolizes access to the selector by doing:

while (isRunning()) {
    int count = selector.select(10);
    if (count == 0) {
        continue;
    }
    // ... process the selected keys ...
}

  I didn't have the chance to check whether this thread has greater priority 
than the one running the SelectorSelection constructor. Anyway, as a 
workaround I changed the previous code to:

int count = selector.select(10);
if (count == 0) {
+   Thread.yield();
    continue;
}

  and mostly everything started to work as expected. I was able to 
connect consistently to the broker using the nio:// transport.
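
A different approach that might avoid the yield altogether (just a sketch on 
my side, not what the backported code does) would be to queue registrations 
and perform them on the selector thread itself, so that register() never has 
to compete with a blocked select() for the selector's internal locks. 
Something along these lines (all names are made up):

import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class QueuedRegistrationWorker implements Runnable {

    private final Selector selector;
    private final Queue<Runnable> pendingRegistrations = new ConcurrentLinkedQueue<Runnable>();
    private volatile boolean running = true;

    QueuedRegistrationWorker(Selector selector) {
        this.selector = selector;
    }

    // May be called from any thread; the actual register() happens on the
    // selector thread, so it never blocks against select().
    void register(final SocketChannel channel, final int ops) {
        pendingRegistrations.add(new Runnable() {
            public void run() {
                try {
                    channel.register(selector, ops);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
        selector.wakeup();   // make a blocked select() return so the queue is drained
    }

    public void run() {
        try {
            while (running) {
                selector.select(10);
                Runnable task;
                while ((task = pendingRegistrations.poll()) != null) {
                    task.run();   // safe: we are on the selector thread
                }
                // ... process selector.selectedKeys() here ...
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}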

- The remaining problem I found is that a java test client (connect, 
send a message, close the connection) didn't shut down correctly, 
although it did when using the tcp:// transport. I found two 
possible sources for this problem:

   a). NIOTransport doesn't close the selection on doStop. I think this 
is needed to allow the SelectorWorker thread to finish.
   b). Even after doing that, and since 
SelectorManager.selectorExecutor is the result of calling 
Executors.newCachedThreadPool, the idle threads are not destroyed 
immediately, but only after 60 seconds. Since these threads are created as 
non-daemon threads, the VM waits for them to finish. As a workaround, I 
changed the instantiation of SelectorManager.selectorExecutor to:

    private Executor selectorExecutor =
        Executors.newCachedThreadPool(new ThreadFactory() {
            public Thread newThread(Runnable r) {
                Thread rc = new Thread(r);
                rc.setName("NIO Transport Thread");
+               rc.setDaemon(true);   // daemon threads do not keep the VM alive
                return rc;
            }
        });

Hence, they are no longer created as non-daemon threads. However, I 
suppose this could be dangerous, and something could be left in an 
inconsistent state. Another solution could be not to use a cachedThreadPool, 
but that could hurt performance. What would be the best way to avoid 
the client shutdown delay?
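
If daemon threads turn out to be risky, another possibility (again just a 
sketch on my side, untested against the transport code) would be to keep the 
threads non-daemon but shorten the idle keep-alive, so the VM only lingers 
for a few seconds. This builds the same kind of pool as 
Executors.newCachedThreadPool(), except the 60-second timeout becomes 5 
seconds (it needs ThreadPoolExecutor, SynchronousQueue and TimeUnit from 
java.util.concurrent):

    private Executor selectorExecutor = new ThreadPoolExecutor(
            0, Integer.MAX_VALUE,                 // same pool bounds as a cached pool
            5, TimeUnit.SECONDS,                  // idle threads exit after 5s instead of 60s
            new SynchronousQueue<Runnable>(),
            new ThreadFactory() {
                public Thread newThread(Runnable r) {
                    Thread rc = new Thread(r);
                    rc.setName("NIO Transport Thread");
                    return rc;
                }
            });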

Currently, switching to 5.1 or 5.2 is not an option for us, since we ran 
into problems in our previous attempts to switch. We need to stay (at 
least until we have enough time to run a complete validation of 
5.1 or the upcoming 5.2) on 4.1, with the patches needed to make it work 
properly.

Also, if you want 4.1 to feature NIO support, I could open a JIRA issue 
and attach the patch. Anyway, any ideas, comments or proposals about the 
problems we ran into and the solutions described above will be very welcome.

Best regards.


Manuel.

