activemq-dev mailing list archives

From "Anthony Enache (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (AMQNET-87) CLONE -Strange Disconnect issues with .Net and VS2005
Date Wed, 21 May 2008 19:30:54 GMT

    [ https://issues.apache.org/activemq/browse/AMQNET-87?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=42934#action_42934 ]

aenache edited comment on AMQNET-87 at 5/21/08 12:29 PM:
----------------------------------------------------------------

I believe this issue is likely related to a problem that I've found and resolved with the
attached patch.

Environment: Windows XP, VS2005, Apache.NMS SVN head as of 5/21, ActiveMQ 5.1 on the
server.

Problem:  Client threads issue commands to the broker asynchronously.  Each request is eventually
handled by the transport layer's ResponseCorrelator.Request() method, which sends the command
to the broker asynchronously and gets back a FutureResponse object.  Request() then blocks on
that object indefinitely ( wait time is -1 ) until a reply arrives from the broker.  Replies
are received in OnCommand(), which sets the FutureResponse and so wakes the blocked client
thread.  Unfortunately, if a network problem leaves the client unable to communicate with the
broker, OnCommand() is never called and the client threads block forever.
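
The blocking behaviour described above can be sketched as follows.  This is a minimal
illustration, not the Apache.NMS source: PendingReply, Complete(), TryGet() and BlockingDemo
are hypothetical names, with PendingReply standing in for FutureResponse.  A 200 ms timeout
is used in place of -1 so that the demo returns instead of hanging.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for FutureResponse; not the NMS implementation.
class PendingReply
{
    private readonly ManualResetEvent latch = new ManualResetEvent(false);
    private object reply;

    // Called by the transport reader when a reply arrives (the OnCommand() path).
    public void Complete(object value)
    {
        reply = value;
        latch.Set();
    }

    // timeoutMs = -1 (Timeout.Infinite) waits forever, as in the report.
    // Returns true and sets value only if a reply arrived within the timeout.
    public bool TryGet(int timeoutMs, out object value)
    {
        bool signaled = latch.WaitOne(timeoutMs, false);
        value = signaled ? reply : null;
        return signaled;
    }
}

static class BlockingDemo
{
    // Simulates a dead link: nobody ever calls Complete(), so the wait
    // can only end by timing out.  With -1 it would never end at all.
    public static bool Run()
    {
        PendingReply pending = new PendingReply();
        object value;
        return pending.TryGet(200, out value);
    }
}
```

BlockingDemo.Run() returns false because nothing ever signals the latch; with a wait time
of -1 the same call would never return, which is the thread leak described above.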

The chances that OnCommand() will never be called in the event of a networking issue increase
greatly if the client has registered an exception handler for the connection.  In my particular
usage pattern, my handler closes the broken connection and spawns a separate thread to attempt
to reconnect.  Since the connection has been closed, threads pending a FutureResponse are
effectively leaked by my application. 

All of this will probably become a non-issue once the failover transport is implemented, but
for now, I've attached a patch that can be used to get around the problem.

My approach is to have the ResponseCorrelator override the TransportFilter.OnException() method
to iterate through the requestMap holding pending FutureResponse objects, and provide a BrokerError
response to each, thereby causing the latch to count down and client threads to proceed. 
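
The idea can be sketched roughly as below.  This is not the contents of the attached patch:
PendingRequest, ErrorReply and SketchCorrelator are hypothetical stand-ins for FutureResponse,
the BrokerError response and ResponseCorrelator, and only the failure path is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for a pending FutureResponse.
class PendingRequest
{
    private readonly ManualResetEvent latch = new ManualResetEvent(false);
    private object response;

    public object Response
    {
        get { latch.WaitOne(); return response; }   // blocks until a response is set
        set { response = value; latch.Set(); }      // wakes any waiting thread
    }
}

// Hypothetical stand-in for the BrokerError response.
class ErrorReply
{
    public readonly Exception Cause;
    public ErrorReply(Exception cause) { Cause = cause; }
}

// Hypothetical correlator; the real class also sends commands and handles replies.
class SketchCorrelator
{
    private readonly Dictionary<int, PendingRequest> requestMap =
        new Dictionary<int, PendingRequest>();
    private int nextId;

    // Records a request that is waiting for a reply.
    public PendingRequest Register()
    {
        PendingRequest pending = new PendingRequest();
        lock (requestMap) { requestMap[++nextId] = pending; }
        return pending;
    }

    // On a transport failure, answer every outstanding request with an error
    // so that no caller stays blocked.  Returns the number of requests released.
    public int OnException(Exception error)
    {
        List<PendingRequest> waiting;
        lock (requestMap)
        {
            waiting = new List<PendingRequest>(requestMap.Values);
            requestMap.Clear();
        }
        foreach (PendingRequest pending in waiting)
        {
            pending.Response = new ErrorReply(error);
        }
        return waiting.Count;
    }
}
```

A caller that reads pending.Response after a failure gets an ErrorReply back instead of
waiting forever, and can turn it into an exception for the client to handle.  The map is
copied and cleared under the lock so that waiters are released outside it.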

Hopefully, someone finds the patch useful.



      was (Author: aenache):
    I believe this issue is likely related to a problem that I've found and resolved with
the attached patch.

Environment:   	 Windows XP, VS2005, grabbed Apache.NMS SVN  head as of 5/21, using ActiveMQ
5.1 on server.

Problem:  Upon broker connection problems, sending threads block indefinitely if the connection
is closed.  In my case, I register an exception handler with the connection that closes bad
connections and cleans up ancillary objects created from that connection ( session, consumers,
producers, etc. ).  I then spawn a new thread that attempts to reestablish the connection
to the broker.  As mentioned in the cloned description, from a client perspective, the thread
that is attempting to send a message to a queue registers no exception.  This is because
it is waiting for a response at ResponseCorrelator.cs:94:

Response response = future.Response;

Since the connection has been closed, no response will ever be forthcoming, the FutureResponse
latch never counts down, and the calling thread never comes back from the wait.  So, we have
a thread leak and no exception is generated that a client could handle.

My approach to fixing / working around the problem is to have the ResponseCorrelator override
the TransportFilter.OnException() method to iterate through the requestMap holding pending
FutureResponse objects, and provide a BrokerError response to each, thereby causing the latch
to count down and client threads to proceed. 

I've attached a patch of this fix. Hopefully, it's useful.


  
> CLONE -Strange Disconnect issues with .Net and VS2005
> -----------------------------------------------------
>
>                 Key: AMQNET-87
>                 URL: https://issues.apache.org/activemq/browse/AMQNET-87
>             Project: ActiveMQ .Net
>          Issue Type: Bug
>          Components: ActiveMQ Client
>         Environment: Windows XP, VS2005, grabbed head of .Net client as of 2/12, using ActiveMQ 4.1 on server
>            Reporter: Anthony Enache
>            Assignee: James Strachan
>             Fix For: 1.1
>
>         Attachments: ResponseCorrelator.patch
>
>
> I have a strange issue.  I'm testing bad network conditions, where the client loses
> connectivity to the server.  If the server is remote, the client is inside my LAN, and
> my LAN loses its internet connection, then at least 1 time in 3 the consumer gets created (or
> at least throws no error messages) but doesn't actually consume messages off of the queue.
> This leads to no error message that I can catch, but the client never receives another message.
> For reasons that escape me, this happens FAR more frequently when you run this program OUTSIDE
> of vs2005.  When I run through the debugger, or even through a "Release" build inside vs2005,
> this very rarely occurs.  Also, though this is much more minor, when the client does manage
> to reconnect cleanly, I get 2 messages for each one I send, as if the consumer from before the
> internet connection loss isn't being deleted.  However, once I stop and start the Connection,
> I get just one message received per message sent.  I'll show this in step form below.  Thanks
> for your assistance!
> Jamie
> Steps:
> 1.  Run a standard ActiveMQ server on a remote point
> 2.  Run a client inside your LAN (fixing the IP address and port for where the server resides)
> 3.  Click "start" on the app; you'll see the messages you're receiving start scrolling down
> 4.  Yank the cord from your router that gives you an external internet connection
> 5.  After about 8 seconds, the text will tell you the disconnect occurred; then replug the cord
> 6.  Click "start" again.  You might need to do it more than once, as sometimes it's slow
>     to get a good connection again
> 7.  At least 1 in 3 times after you click "start" it'll clear the text, but you'll never
>     see another message, even if you "stop" and "start" again (leaving the cord alone)
> 8.  The rest of the time you'll see 2 messages appear for each one sent; after clicking
>     "stop" and "start" again, you'll see normal behavior.
> I've attached the client project I'm using for testing.  I was using the stock ActiveMQ
> 4.1 server, only changing the IP address of the TCP connection to point to the correct address.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

