activemq-dev mailing list archives

From "Jan Vogelgesang (JIRA)" <jira+amq...@apache.org>
Subject [jira] Commented: (AMQNET-293) Consumer is not recovered after failover reconnect if connection had been lost in OnMessage (before SendACK)
Date Mon, 29 Nov 2010 23:05:14 GMT

    [ https://issues.apache.org/jira/browse/AMQNET-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964996#action_12964996 ]

Jan Vogelgesang commented on AMQNET-293:
----------------------------------------

Timothy, thanks for the info. I don't know the NMS code in depth, and my patch was only meant
to prove that there is a problem with the locked block. I hoped it might be enough to fix it,
but I had a feeling that it might break the delivery logic. Thanks for analyzing it.

Another solution I considered was not applying failover to sending ACKs. That means
duplicate messages may be delivered, and I'm fine with that.
What matters to me is a solution that provides:
a) reliable transport (I cannot lose any message)
b) consumer recovery after a connection break (currently the consumer does not recover)

Anyway, were you able to reproduce the problem I described?
If so, maybe you have an idea how to fix it. Please keep me informed.


> Consumer is not recovered after failover reconnect if connection had been lost in OnMessage (before SendACK)
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQNET-293
>                 URL: https://issues.apache.org/jira/browse/AMQNET-293
>             Project: ActiveMQ .Net
>          Issue Type: Bug
>          Components: NMS
>    Affects Versions: 1.4.0, 1.4.1
>         Environment: Windows XP - however I believe it's not dependent on the operating system
>            Reporter: Jan Vogelgesang
>            Assignee: Timothy Bish
>             Fix For: 1.5.0
>
>         Attachments: MessageConsumerDirtyPatch.cs, UnitTestAndPatch.zip
>
>
> To reproduce the error, write a simple WinForms application in C# with a listbox (or a console app):
> 1. Create the connection: failover:(tcp://localhost:61616?keepAlive=true) and start it.
> 2. Create the session and a queue consumer for a queue, e.g. "TestQueue" (in the default AutoAcknowledge mode).
> 3. Set the message listener for the queue consumer, e.g. OnMessage.
> 4. In the OnMessage method, do something like Sleep(5000) and then display the received text message (use Invoke to add the message to the listbox).
> 5. Using localhost:8161/admin, create the TestQueue and put about 20 persistent text messages on it.
> 6. Run the application. A new line should appear on screen every 5 seconds.
> 7. Restart the ActiveMQ broker (I'm using 5.3.0.5).
> 8. After the broker restarts, you stop receiving messages.
> 9. Restart the broker again, and you will start getting messages again.
> ----------
> The problem is in NMS.
> Most likely, when you restart the ActiveMQ broker the client app will be in the OnMessage method (just sleeping there for 5 seconds). When those 5 seconds are over, NMS tries to SendACK, and this method will not return until the failover thread has successfully reconnected. For that whole time there is a lock on unconsumedMessages.SynchRoot (see the MessageConsumer.Dispatch method). This is painful for another thread that is trying to call unconsumedMessages.Clear() and needs the locked resource. (This thread is initiated in Connection.OnTransportInterrupted() and wants to call ClearMessagesInProgress on the MessageConsumer.)
> The worst thing is that MessageConsumer.ClearMessagesInProgress, while it waits for the locked resource, cannot call TransportInterruptionProcessingComplete(), which I guess registers the consumers that have to be recovered when the connection is back.
> So when the connection is back:
> SendAck completes.
> The failover thread runs DoRecover but does not find our consumer.
> The Dispatch method unlocks unconsumedMessages.SynchRoot.
> The worker thread registers the consumers to be recovered (absolutely too late!!!!)
>  
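
The ordering described above can be imitated outside NMS. What follows is a minimal Java
sketch, not NMS code: the class RecoveryRaceSketch, the thread names and the "consumer-1"
label are all hypothetical stand-ins, and a plain object plays the role of
unconsumedMessages.SynchRoot. It also assumes that recovery runs as part of the reconnect,
just before the blocked send is released; the report lists SendAck completing first, but in
both orderings the registration can only happen after Dispatch gives up the lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the classes named in the report; this is not NMS code.
class RecoveryRaceSketch {

    /** What the failover thread found at reconnect, and what was registered afterwards. */
    static final class Result {
        final List<String> recoveredAtReconnect;
        final List<String> registeredAfterwards;

        Result(List<String> recovered, List<String> registered) {
            this.recoveredAtReconnect = recovered;
            this.registeredAfterwards = registered;
        }
    }

    static Result simulate() {
        final Object syncRoot = new Object();               // plays unconsumedMessages.SynchRoot
        final List<String> recoverable = new ArrayList<>(); // consumers registered for recovery
        final List<String> recovered = new ArrayList<>();   // what recovery saw at reconnect
        final CountDownLatch lockHeld = new CountDownLatch(1);
        final CountDownLatch reconnected = new CountDownLatch(1);

        // Dispatch: holds the lock while the acknowledgement waits for the reconnect.
        Thread dispatch = new Thread(() -> {
            synchronized (syncRoot) {
                lockHeld.countDown();
                awaitQuietly(reconnected);
            }
        });

        // Interruption handler: needs the same lock before it can register the consumer.
        Thread interruption = new Thread(() -> {
            awaitQuietly(lockHeld);
            synchronized (syncRoot) {
                synchronized (recoverable) {
                    recoverable.add("consumer-1");
                }
            }
        });

        // Failover: recovers whatever is registered, then releases the blocked send.
        Thread failover = new Thread(() -> {
            awaitQuietly(lockHeld);
            synchronized (recoverable) {
                recovered.addAll(recoverable);
            }
            reconnected.countDown();
        });

        dispatch.start();
        interruption.start();
        failover.start();
        try {
            dispatch.join();
            interruption.join();
            failover.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new Result(recovered, recoverable);
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Result r = simulate();
        System.out.println("recovered at reconnect: " + r.recoveredAtReconnect);
        System.out.println("registered afterwards:  " + r.registeredAfterwards);
    }
}
```

Under these assumptions the recovery step always sees an empty list and the consumer is
registered only afterwards, which matches the "too late" registration described in the
report. Holding the lock across a send that can block on failover is what forces that order.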

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

