activemq-dev mailing list archives

From Rob Davies <rajdav...@gmail.com>
Subject Re: tcp and nio transport considerations
Date Thu, 18 Sep 2008 10:18:44 GMT
would be good to have this optional ;)

On 18 Sep 2008, at 09:36, Manuel Teira Paz wrote:

> Rob Davies wrote:
>> Thanks for the feedback - please add a JIRA - but we don't generally
>> do releases from branches.
>> Your analysis looks correct to me - can you go through the issues you
>> had with 5.1? It might be better to get you onto the 5.1/5.2 release
>> asap.
>>
> Yes, we are planning to switch in the near future. However, we
> haven't had time to test a newer release. The last time we tried
> 5.1, a lot of problems (among them, huge memory leaks) affected us.
>
> Once we have time to go down the 5.x road again, I will keep you
> informed about any bugs we encounter.
>
>
> Any further comments about the need to detach the doConsume part of
> TcpTransport.run() into a different thread? I'm going to give it a
> try, creating a per-instance single-thread executor (to avoid any
> potential out-of-order issues) locally in the run() method (to avoid
> overhead for TcpTransport instances not being used as listeners).
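>
> Something along these lines is what I have in mind (a rough sketch
> only; the method names readCommand, doConsume, isStopped and
> onException are my approximations of the existing ones, not a final
> patch):
>
> // (imports needed: java.io.IOException,
> //  java.util.concurrent.ExecutorService, java.util.concurrent.Executors)
> public void run() {
>     // Sketch: dispatch doConsume() off the socket-reading thread.
>     // A single-thread executor keeps commands in order per transport, and
>     // creating it locally means instances never used as listeners pay nothing.
>     final ExecutorService consumeExecutor = Executors.newSingleThreadExecutor();
>     try {
>         while (!isStopped()) {
>             final Object command = readCommand();   // blocking read from the socket, as today
>             consumeExecutor.execute(new Runnable() {
>                 public void run() {
>                     doConsume(command);             // listener callback, now off the reader thread
>                 }
>             });
>         }
>     } catch (IOException e) {
>         onException(e);                             // existing error path (handling simplified here)
>     } finally {
>         consumeExecutor.shutdown();                 // let queued commands drain when the transport stops
>     }
> }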
>
>
> Best regards.
>
> Manuel.
>
>> cheers,
>>
>> Rob
>>
>> On 16 Sep 2008, at 16:13, Manuel Teira Paz wrote:
>>
>>
>>> Hello, I would like to share some thoughts and adventures with the
>>> tcp and nio transports for your consideration, and I am hoping for
>>> some feedback.
>>>
>>> We are using an ActiveMQ 4.1 compiled from the 4.1 svn branch. For
>>> some time we didn't run into any important problems, but lately we
>>> have been suffering an issue with the tcp transport.
>>>
>>> The problem arises when the tcp buffer gets full during a
>>> TcpBufferedOutputStream.flush(). When this happens, and probably
>>> when all the consumers/producers are sharing the same connection, we
>>> run into a deadlock situation, since the socket OutputStream write
>>> is blocking. Meanwhile, no reader that could drain some data from
>>> the socket to ease the situation can do its work, since it shares
>>> the same connection, which is locked in the write attempt. Do you
>>> agree with this analysis, and that there is a real chance it could
>>> happen?
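>>>
>>> (Just to illustrate the blocking half of the problem outside
>>> ActiveMQ: two plain sockets are enough to watch write() stall once
>>> the peer stops reading. The class name and chunk size below are
>>> arbitrary, it is only a sketch.)
>>>
>>> import java.io.OutputStream;
>>> import java.net.ServerSocket;
>>> import java.net.Socket;
>>>
>>> public class BlockingWriteDemo {
>>>     public static void main(String[] args) throws Exception {
>>>         ServerSocket server = new ServerSocket(0);
>>>         Socket writer = new Socket("localhost", server.getLocalPort());
>>>         Socket peer = server.accept();     // accepted, but nobody ever reads from it
>>>
>>>         OutputStream out = writer.getOutputStream();
>>>         byte[] chunk = new byte[64 * 1024];
>>>         long total = 0;
>>>         while (true) {
>>>             out.write(chunk);              // blocks forever once both kernel buffers are full
>>>             total += chunk.length;
>>>             System.out.println("written so far: " + total + " bytes");
>>>         }
>>>     }
>>> }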
>>>
>>> As a solution, nio with its non-blocking socket management,
>>> selectors and friends, seemed the way to go. Unfortunately, the nio
>>> transport is not available in the 4.1 branch, but it was easily
>>> backported from the trunk. Trying to use it, some issues arose:
>>>
>>> - Connection attempts timed out, and the whole system behaved
>>> erratically and unresponsively. There were no deadlocks, but one
>>> symptom was that transport.nio.SelectorSelection spent a lot of time
>>> waiting for the socketChannel.register call to complete in the
>>> SelectorSelection constructor.
>>> I don't know the exact reason, but it seems that
>>> SelectorWorker.run() monopolizes access to the selector by doing:
>>>
>>> while (isRunning()) {
>>>     int count = selector.select(10);
>>>     if (count == 0) {
>>>         continue;
>>>     }
>>>
>>> I didn't have the chance to check whether this thread has a higher
>>> priority than the one running the SelectorSelection constructor.
>>> Anyway, as a workaround I changed the previous code to:
>>>
>>> int count = selector.select(10);
>>> if (count == 0) {
>>> +   Thread.yield();
>>>     continue;
>>> }
>>>
>>> and almost everything started to work as expected. I was able to
>>> connect to the broker consistently, using a nio:// transport.
>>>
>>> - The remaining problem I found is that a Java test client (it
>>> connects, sends a message, and closes the connection) didn't shut
>>> itself down correctly, although it did when using the tcp://
>>> transport. I found two possible sources for this problem:
>>>
>>> a). NIOTransport doesn't close the selection on doStop. I think
>>> this is needed to allow the SelectorWorker thread to terminate.
>>> b). Even after doing that, and since the
>>> SelectorManager.selectorExecutor is the result of calling
>>> Executors.newCachedThreadPool, the idle threads are not destroyed
>>> immediately, but after 60 seconds. Since these threads are created
>>> as non-daemon threads, the VM waits for them to finish. As a
>>> workaround, I changed the instantiation of
>>> SelectorManager.selectorExecutor to:
>>>
>>> private Executor selectorExecutor =
>>>     Executors.newCachedThreadPool(new ThreadFactory() {
>>>         public Thread newThread(Runnable r) {
>>>             Thread rc = new Thread(r);
>>>             rc.setName("NIO Transport Thread");
>>> +           rc.setDaemon(true);
>>>             return rc;
>>>         }
>>>     });
>>>
>>> Hence, they are no longer created as non-daemon threads. However, I
>>> suppose this could be dangerous, and something could be left in an
>>> inconsistent state. Another solution could be not to use a cached
>>> thread pool, but this could hurt performance. What would be the
>>> best way to avoid the client shutdown delay?
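>>>
>>> One variant I have thought about but not tried yet (sketch only, and
>>> the keep-alive value is just a guess): keep the cached-pool
>>> semantics but shorten the idle timeout, so the VM can exit promptly
>>> without marking the threads as daemons:
>>>
>>> // (imports needed: java.util.concurrent.ThreadPoolExecutor, TimeUnit,
>>> //  SynchronousQueue, ThreadFactory)
>>> private Executor selectorExecutor = new ThreadPoolExecutor(
>>>         0, Integer.MAX_VALUE,               // same bounds newCachedThreadPool uses
>>>         5, TimeUnit.SECONDS,                // idle threads die after 5s instead of 60s
>>>         new SynchronousQueue<Runnable>(),
>>>         new ThreadFactory() {
>>>             public Thread newThread(Runnable r) {
>>>                 Thread rc = new Thread(r);
>>>                 rc.setName("NIO Transport Thread");
>>>                 return rc;
>>>             }
>>>         });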
>>>
>>> Currently, changing to 5.1 or 5.2 is not an option for us, since we
>>> ran into problems in our previous attempts to switch. We need to
>>> stay on 4.1 (at least until we have enough time to run a complete
>>> validation of 5.1 or the upcoming 5.2), together with the patches
>>> needed to make it work properly.
>>>
>>> Also, if you want 4.1 to feature NIO support, I could open a JIRA
>>> issue and attach the patch. In any case, any ideas, comments or
>>> proposals about the problems we ran into and the solutions described
>>> above will be very welcome.
>>>
>>> Best regards.
>>>
>>>
>>> Manuel.
>>>
>>>
>>
>>
>

