activemq-dev mailing list archives

From Rob Davies <rajdav...@gmail.com>
Subject Re: tcp and nio transport considerations
Date Tue, 16 Sep 2008 15:26:55 GMT
Thanks for the feedback - please add a JIRA - but we don't generally
do releases from branches.
Your analysis looks correct to me - can you go through the issues you
had with 5.1? It might be better to get you onto the 5.1/5.2 release
asap.

cheers,

Rob

On 16 Sep 2008, at 16:13, Manuel Teira Paz wrote:

> Hello, I would like to share some thoughts and adventures with the
> tcp and nio transports for your consideration, in the hope of getting
> some feedback.
>
> We are using a 4.1 ActiveMQ compiled from the 4.1 svn branch. For
> some time we didn't run into any important problem, but lately we
> have been suffering an issue with the tcp transport.
>
> The problem arises when the tcp buffer gets full during a
> TcpBufferedOutputStream.flush(). When this happens, probably when
> all the consumers/producers are sharing the same connection, we run
> into a deadlock, since the socket OutputStream writes in blocking
> mode. Meanwhile, no reader that could drain some data from the
> socket to ease the situation is allowed to do its work, since it
> shares the same connection, which is locked by the write attempt. Do
> you agree with this analysis, and that this could indeed happen?
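>
> To make the pattern concrete, here is a minimal, self-contained
> sketch (hypothetical code, not the actual transport classes): a
> writer blocks inside write() while holding the shared connection
> lock, so the reader that would drain the socket can never run:
>
> import java.io.OutputStream;
> import java.net.ServerSocket;
> import java.net.Socket;
>
> public class BlockingWriteDeadlock {
>     private static final Object connectionLock = new Object();
>
>     public static void main(String[] args) throws Exception {
>         ServerSocket server = new ServerSocket(0);
>         final Socket client = new Socket("localhost", server.getLocalPort());
>         Socket peer = server.accept(); // never read from, so its buffer fills up
>
>         new Thread(new Runnable() {
>             public void run() {
>                 try {
>                     OutputStream out = client.getOutputStream();
>                     byte[] chunk = new byte[64 * 1024];
>                     synchronized (connectionLock) {
>                         while (true) {
>                             out.write(chunk); // blocks once the tcp buffers are full
>                         }
>                     }
>                 } catch (Exception e) {
>                     e.printStackTrace();
>                 }
>             }
>         }).start();
>
>         new Thread(new Runnable() {
>             public void run() {
>                 synchronized (connectionLock) { // never acquired: the writer holds it
>                     // the code that would drain 'peer' would live here
>                 }
>             }
>         }).start();
>     }
> }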
>
> As a solution, nio with its non-blocking socket management, selectors
> and friends seemed the way to go. Unfortunately, the nio transport
> is not available in the 4.1 branch, but it was easily backported
> from the trunk. Trying to use it, some issues arose:
>
> - Connection attempts timed out, and the whole system behaved
> erratically and unresponsively. There were no deadlocks, but one
> symptom was that transport.nio.SelectorSelection spent a lot of time
> waiting for the socketChannel.register call to complete in the
> SelectorSelection constructor.
> I don't know the exact reason, but it seems that
> SelectorWorker.run() monopolizes access to the selector (as far as I
> know, register() on a channel may block while a selection operation
> is in progress on the same selector) by doing:
>
> while (isRunning()) {
>     int count = selector.select(10);
>     if (count == 0) {
>         continue;
>     }
>
> I didn't have the chance to check whether this thread has greater
> priority than the one running the SelectorSelection constructor.
> Anyway, as a workaround I changed the previous code to:
>
> int count = selector.select(10);
> if (count == 0) {
> +   Thread.yield();
>     continue;
> }
>
> and mostly everything started to work as expected. I was able to
> connect consistently to the broker using a nio:// transport.
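>
> For reference, the test client I mention below is roughly the
> following sketch (broker address and queue name are placeholders):
>
> import javax.jms.Connection;
> import javax.jms.ConnectionFactory;
> import javax.jms.MessageProducer;
> import javax.jms.Session;
>
> import org.apache.activemq.ActiveMQConnectionFactory;
>
> public class NioTestClient {
>     public static void main(String[] args) throws Exception {
>         ConnectionFactory factory =
>             new ActiveMQConnectionFactory("nio://localhost:61616");
>         Connection connection = factory.createConnection();
>         connection.start();
>         Session session =
>             connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
>         MessageProducer producer =
>             session.createProducer(session.createQueue("TEST.QUEUE"));
>         producer.send(session.createTextMessage("hello"));
>         connection.close(); // the VM should exit shortly after this returns
>     }
> }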
>
> - The remaining problem I found is that a Java test client (it
> connects, sends a message, and closes the connection) didn't shut
> itself down correctly, whereas it did when using the tcp://
> transport. I found two possible sources for this problem:
>
>  a) NIOTransport doesn't close the selection on doStop. I think
> this is needed to allow the SelectorWorker thread to finish.
>  b) Even after doing that, since the
> SelectorManager.selectorExecutor is the result of calling
> Executors.newCachedThreadPool, the idle threads are not destroyed
> immediately, but after 60 seconds. Since these threads are created as
> non-daemon threads, the VM waits for them to finish. As a
> workaround, I changed the instantiation of
> SelectorManager.selectorExecutor to:
>
>   private Executor selectorExecutor =
>       Executors.newCachedThreadPool(new ThreadFactory() {
>           public Thread newThread(Runnable r) {
>               Thread rc = new Thread(r);
>               rc.setName("NIO Transport Thread");
> +             rc.setDaemon(true);
>               return rc;
>           }
>       });
>
> This prevents them from being created as non-daemon threads.
> However, I suppose this could be dangerous, and something could be
> left in an inconsistent state. Another solution could be not to use
> a cachedThreadPool, but this could hurt performance. What would be
> the best way to avoid the client shutdown delay?
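>
> One more idea I have not tested: Executors.newCachedThreadPool is
> just a ThreadPoolExecutor with a 60-second keep-alive, so the delay
> could perhaps be shortened without touching the daemon flag, for
> example:
>
> import java.util.concurrent.Executor;
> import java.util.concurrent.SynchronousQueue;
> import java.util.concurrent.ThreadPoolExecutor;
> import java.util.concurrent.TimeUnit;
>
> // Same as Executors.newCachedThreadPool() except that idle threads
> // die after 1 second instead of 60, so the VM can exit soon after
> // the connection is closed.
> private Executor selectorExecutor = new ThreadPoolExecutor(
>         0, Integer.MAX_VALUE,
>         1L, TimeUnit.SECONDS,
>         new SynchronousQueue<Runnable>());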
>
> Currently, changing to 5.1 or 5.2 is not an option for us, since we
> ran into problems in our previous attempts to switch. We need to
> remain (at least until we have enough time to run a complete
> validation of 5.1 or the upcoming 5.2) on 4.1 with the patches
> needed to make it work properly.
>
> Also, if you want 4.1 to feature NIO support, I could open a JIRA
> issue attaching the patch. Anyway, any idea, comment or proposal
> about the problems we ran into and the solutions described above
> will be very welcome.
>
> Best regards.
>
>
> Manuel.
>

