activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rob Waite (JIRA)" <jira+amq...@apache.org>
Subject [jira] [Updated] (AMQNET-409) Failover fails under load. Client goes into reconnect loop.
Date Wed, 06 Feb 2013 04:05:13 GMT

     [ https://issues.apache.org/jira/browse/AMQNET-409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Rob Waite updated AMQNET-409:
-----------------------------

    Description: 
We were testing failover under heavy load. We have a queue that I placed 500k messages in.
I then have 5 consumers pulling them as fast as they can. The operations are noops essentially...
so messages are consumed in our case at around 500 messages per second with our prod servers.

I would bring up the consumers and they would be consuming away. I would then "kill -9" AMQ
and then observe the client. From the AMQ web admin... they would come back but you would
see the 5 consumers go back and forth between 5 and 0 and comsumption went down from 500/s
to about 3/s.

Looking at AMQ logs... there was a series of errors like this " Async error occurred: javax.jms.JMSException:
Unmatched acknowledge: MessageAck" and then there would be a couple variations of exceptions
regarding the connection being lost (EOFException, broken pipe, etc).

I can provide the full stack traces but I have already found the line that seems to cause
the client to reconnect, fail, disconnect ad infinitum.

In our particular case... there is a line in the NMS source in Connection.cs line 950 that
looks like:


{code:title=Connection.cs|borderStyle=solid}
Tracer.DebugFormat("Connection[{0}]: Async exception with no exception listener: " + error,
this.ConnectionId);
{code}

The error variable was populated with:

"Apache.NMS.NMSConnectionException: Unmatched acknowledge: MessageAck {commandId = 1146, responseRequired
= false, ackType = 2, consumerId = ID:kcccccc-54512-634956893773509011-1:10:3:1, firstMessageId
= ID:qaaaaa-54879-634956774946428300-1:0:1:2:60645, lastMessageId = ID:qaaaaaa-ops01-54879-634956774946428300-1:0:1:2:60645,
destination = queue://AAA.RwwTest.Failover, transactionId = null, messageCount = 1, poisonCause
= null}; Could not find Message-ID ID:qaaaaaa-ops01-54879-634956774946428300-1:0:1:2:60645
in dispatched-list (start of ack)"

Notice that there are curly braces in there... when you try to concat this with the first
parameter in the Tracer.DebugFormat call you get a System.FormatException with the message
"Input string was not in a correct format."

It seems odd that String.Format style logging is used but a string is still concatenated.
This seems to happen many places in the code. I would think that either the Tracer should
handle this exception or trim illegal characters or DebugFormat should not use concatenation
for the first param.


  was:
We were testing failover under heavy load. We have a queue that I placed 500k messages in.
I then have 5 consumers pulling them as fast as they can. The operations are noops essentially...
so messages are consumed in our case at around 500 messages per second with our prod servers.

I would bring up the consumers and they would be consuming away. I would then "kill -9" AMQ
and then observe the client. From the AMQ web admin... they would come back but you would
see the 5 consumers go back and forth between 5 and 0 and comsumption went down from 500/s
to about 3/s.

Looking at AMQ logs... there was a series of errors like this " Async error occurred: javax.jms.JMSException:
Unmatched acknowledge: MessageAck" and then there would be a couple variations of exceptions
regarding the connection being lost (EOFException, broken pipe, etc).

I can provide the full stack traces but I have already found the line that seems to cause
the client to reconnect, fail, disconnect ad infinitum.

In our particular case... there is a line in the NMS source in Connection.cs line 950 that
looks like:

{code:title=Connection.cs|borderStyle=solid}
Tracer.DebugFormat("Connection[{0}]: Async exception with no exception listener: " + error,
this.ConnectionId);
{code}

The error variable was populated with:

"Apache.NMS.NMSConnectionException: Unmatched acknowledge: MessageAck {commandId = 1146, responseRequired
= false, ackType = 2, consumerId = ID:kcccccc-54512-634956893773509011-1:10:3:1, firstMessageId
= ID:qaaaaa-54879-634956774946428300-1:0:1:2:60645, lastMessageId = ID:qawin-ops01-54879-634956774946428300-1:0:1:2:60645,
destination = queue://AAA.RwwTest.Failover, transactionId = null, messageCount = 1, poisonCause
= null}; Could not find Message-ID ID:qawin-ops01-54879-634956774946428300-1:0:1:2:60645 in
dispatched-list (start of ack)"

Notice that there are curly braces in there... when you try to pass this string into the Tracer.DebugFormat
call you get a System.FormatException with the message "Input string was not in a correct
format."

It seems odd that String.Format style logging is used but a string is still concatenated.
This seems to happen many places in the code. I would think that either the Tracer should
handle this exception or DebugFormat should be used properly.


    
> Failover fails under load. Client goes into reconnect loop.
> -----------------------------------------------------------
>
>                 Key: AMQNET-409
>                 URL: https://issues.apache.org/jira/browse/AMQNET-409
>             Project: ActiveMQ .Net
>          Issue Type: Bug
>          Components: ActiveMQ
>    Affects Versions: 1.5.6
>         Environment: Should not matter. AMQ is on Linux, Client is on Windows with .NET
4
>            Reporter: Rob Waite
>            Assignee: Jim Gomes
>
> We were testing failover under heavy load. We have a queue that I placed 500k messages
in. I then have 5 consumers pulling them as fast as they can. The operations are noops essentially...
so messages are consumed in our case at around 500 messages per second with our prod servers.
> I would bring up the consumers and they would be consuming away. I would then "kill -9"
AMQ and then observe the client. From the AMQ web admin... they would come back but you would
see the 5 consumers go back and forth between 5 and 0 and comsumption went down from 500/s
to about 3/s.
> Looking at AMQ logs... there was a series of errors like this " Async error occurred:
javax.jms.JMSException: Unmatched acknowledge: MessageAck" and then there would be a couple
variations of exceptions regarding the connection being lost (EOFException, broken pipe, etc).
> I can provide the full stack traces but I have already found the line that seems to cause
the client to reconnect, fail, disconnect ad infinitum.
> In our particular case... there is a line in the NMS source in Connection.cs line 950
that looks like:
> {code:title=Connection.cs|borderStyle=solid}
> Tracer.DebugFormat("Connection[{0}]: Async exception with no exception listener: " +
error, this.ConnectionId);
> {code}
> The error variable was populated with:
> "Apache.NMS.NMSConnectionException: Unmatched acknowledge: MessageAck {commandId = 1146,
responseRequired = false, ackType = 2, consumerId = ID:kcccccc-54512-634956893773509011-1:10:3:1,
firstMessageId = ID:qaaaaa-54879-634956774946428300-1:0:1:2:60645, lastMessageId = ID:qaaaaaa-ops01-54879-634956774946428300-1:0:1:2:60645,
destination = queue://AAA.RwwTest.Failover, transactionId = null, messageCount = 1, poisonCause
= null}; Could not find Message-ID ID:qaaaaaa-ops01-54879-634956774946428300-1:0:1:2:60645
in dispatched-list (start of ack)"
> Notice that there are curly braces in there... when you try to concat this with the first
parameter in the Tracer.DebugFormat call you get a System.FormatException with the message
"Input string was not in a correct format."
> It seems odd that String.Format style logging is used but a string is still concatenated.
This seems to happen many places in the code. I would think that either the Tracer should
handle this exception or trim illegal characters or DebugFormat should not use concatenation
for the first param.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message