activemq-users mailing list archives

From Tim Bain <tb...@alumni.duke.edu>
Subject Re: DLQ, cause:null
Date Sun, 26 Apr 2015 04:37:27 GMT
James,

The prefetch buffer is a buffer in the ActiveMQ client code that holds
messages that have been dispatched from the broker but not yet handed over to
the consuming application. Its purpose is to keep some number of messages in
memory and available for the consumer to handle, ensuring the consumer never
has to wait for a message to be pulled from the broker and can consume as
quickly as possible.  The broker will continue dispatching messages to the
client until the client has a full prefetch buffer, after which point it
will dispatch one message for every ack it gets back from the consumer.
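That fill-then-one-per-ack rule can be sketched as a toy model; this is plain Java standing in for the broker and client, not the actual ActiveMQ implementation, and all names and the tiny prefetch value are illustrative:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model (not ActiveMQ internals): the broker fills the consumer's
// prefetch buffer, then dispatches exactly one new message per ack.
public class PrefetchModel {
    static final int PREFETCH = 3; // ActiveMQ's default queue prefetch is 1000

    final Deque<Integer> brokerQueue = new ArrayDeque<>();    // broker side
    final Deque<Integer> prefetchBuffer = new ArrayDeque<>(); // client side

    // Broker pushes messages only while the buffer has room.
    void dispatch() {
        while (prefetchBuffer.size() < PREFETCH && !brokerQueue.isEmpty()) {
            prefetchBuffer.addLast(brokerQueue.pollFirst());
        }
    }

    // Consumer takes from its local buffer; the ack frees one slot,
    // so the broker dispatches one more message.
    int receiveAndAck() {
        int msg = prefetchBuffer.pollFirst();
        dispatch();
        return msg;
    }

    public static void main(String[] args) {
        PrefetchModel m = new PrefetchModel();
        for (int i = 1; i <= 10; i++) m.brokerQueue.addLast(i);
        m.dispatch();
        System.out.println("buffered: " + m.prefetchBuffer.size()); // buffered: 3
        m.receiveAndAck();
        System.out.println("buffered: " + m.prefetchBuffer.size()); // refilled to 3
    }
}
```

The point of the sketch is that the buffer stays topped up at the prefetch limit as long as the broker has messages, so a healthy consumer never waits.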

When you have multiple concurrent consumers, the broker will round-robin
messages among consumers that don't have full prefetch buffers; it won't
batch up a full prefetch buffer's worth for Consumer 1 before sending any
messages to Consumers 2-4, which is what I think you were worried about.
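Round-robin dispatch among consumers with buffer room can be sketched like this; again a toy model of the behavior described, not ActiveMQ's dispatcher, with illustrative names and numbers:

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch (not ActiveMQ internals): each message goes to the next
// consumer in turn that still has prefetch room, rather than filling
// Consumer 1's buffer before Consumers 2-4 see anything.
public class RoundRobinDispatch {
    static List<List<Integer>> dispatch(int messages, int consumers, int prefetch) {
        List<List<Integer>> buffers = new ArrayList<>();
        for (int c = 0; c < consumers; c++) buffers.add(new ArrayList<>());
        int next = 0;
        for (int m = 1; m <= messages; m++) {
            // Skip consumers whose prefetch buffers are already full.
            int tried = 0;
            while (buffers.get(next % consumers).size() >= prefetch && tried < consumers) {
                next++;
                tried++;
            }
            List<Integer> buf = buffers.get(next % consumers);
            if (buf.size() < prefetch) buf.add(m); // every buffer full: nowhere to dispatch
            next++;
        }
        return buffers;
    }

    public static void main(String[] args) {
        // 8 messages over 4 consumers with prefetch 5: two each, interleaved.
        for (List<Integer> buf : dispatch(8, 4, 5)) {
            System.out.println(buf); // [1, 5] / [2, 6] / [3, 7] / [4, 8]
        }
    }
}
```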

If there are a non-zero number of messages in the client's prefetch buffer
and receive() is called, the thread should simply go grab the first message
in the buffer and return it, so the timeout should not elapse unless the
client's host is HEAVILY loaded or locking somehow delays thread execution
or you get unlucky enough to catch a full GC right then.  If there are no
messages in the buffer, the thread should wait for the timeout period to
see if a message shows up, and either return that message when it does or
return with no message after the timeout elapses.  In both scenarios, I
would expect that the consumer would remain connected and so no redelivery
would apply; receive() should just be looking to see whether a message came
across the pre-existing connection, but it should not be making connections
nor disconnecting if the timeout interval elapses without a message.  (If
you're disconnecting after each message, that's an anti-pattern as I
understand it, and you should probably rethink your approach.)
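The receive(timeout) semantics just described can be modeled with a plain BlockingQueue standing in for the prefetch buffer; this is a sketch of the behavior, not the ActiveMQ client code, and no connection is made or dropped in either case:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch: a buffered message is returned immediately; an empty buffer
// means the call blocks up to the timeout and then returns null.
public class ReceiveTimeoutDemo {
    public static void main(String[] args) throws InterruptedException {
        LinkedBlockingQueue<String> prefetchBuffer = new LinkedBlockingQueue<>();

        // Case 1: buffer non-empty, returns at once despite the long timeout.
        prefetchBuffer.put("msg-1");
        String got = prefetchBuffer.poll(10, TimeUnit.SECONDS);
        System.out.println("got: " + got); // got: msg-1

        // Case 2: buffer empty, waits out the timeout and returns null.
        long start = System.nanoTime();
        String none = prefetchBuffer.poll(200, TimeUnit.MILLISECONDS);
        long waitedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("got: " + none + " after ~" + waitedMs + " ms");
    }
}
```

Note that in both cases the poll simply returns; nothing here would close a connection or trigger a redelivery, which is the crux of the argument above.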

All of that is to say, I don't think that the elapsing of a receive()
timeout without receiving a message should do anything that would cause a
message redelivery, so I wonder if that's a red herring and the problem is
actually something else.  Do you see any messages in your client or broker
logs indicating that the client disconnected and reconnected, that the
connection's inactivity monitor detected the connection to be inactive, or
that the consumer was aborted as a slow consumer?

BTW, for your update of that page on the wiki, what client connection
timeout did you have in mind?  (I can think of at least three things that
match that phrase: timeout on establishing the initial connection to the
broker, inactivity on the connection that leads to a disconnect, and the
receive() timeout you referenced above.)  I think that a disconnect due to
connection inactivity where the inactivity monitor was in use would indeed
produce a redelivery, but if you meant the elapsing of a receive() timeout
as described above, I'm not convinced that that's accurate (and if it turns
out not to be, we should pull that edit back off the wiki page to avoid
confusing people).  But one thing that I believe is missing from that list
is when a consumer disconnects from the broker (for any reason) where
messages have been dispatched to the consumer but not acknowledged (i.e.
they're in the consumer's prefetch buffer or they're the message the
consumer was processing at the time of the disconnect under certain
acknowledgement modes).

Tim

On Fri, Apr 24, 2015 at 8:10 AM, James Green <james.mk.green@gmail.com>
wrote:

> I need to understand pre-fetch limit and receive time-out interaction.
>
> We have four concurrent consumers in our route. Do each receive the
> messages in batches of the pre-fetch limit?
>
> At what point does the receive time-out start and end?
>
> In our case each client performs a number of db queries then fires a new
> message at the broker before the route is complete. Typically this may take
> more than 1 second under load. A 10s time-out only makes sense if the
> pre-fetching is not included but then that suggests client-calculated
> time-outs communicated back to the broker which also makes no sense.
>
> So, I'm clear we need to better understand what's under the hood here!
>
> James
>
> On 24 April 2015 at 14:39, Gary Tully <gary.tully@gmail.com> wrote:
>
> > Tim, steady, I suggested it *may* be relevant :-)
> > With camel and transactions - i.e. spring dmlc, connection pools and
> > cache levels - anything is possible w.r.t. consumer/session/connection
> > state, because there are so many variables in the mix.
> >
> > With activemq and prefetch, every consumer disconnect will result in
> > redeliveries. The trick is figuring out whether
> > the prefetched messages were actually delivered to the consumer so the
> > delivery count can reflect the application's view of the world; that is
> > not an exact science.
> >
> >
> > On 24 April 2015 at 13:51, Tim Bain <tbain@alumni.duke.edu> wrote:
> > > Gary,
> > >
> > > If I understood that JIRA correctly, the bug only occurs when the client
> > > disconnects, which doesn't sound like what James is doing (nothing in his
> > > description indicated to me that his client wasn't staying up and connected
> > > the whole time), so it doesn't sound like your fix would resolve (nor
> > > explain) his problem.  And although I'm all about workarounds when I know
> > > there's a fix in a future version, I'm not sure that's the case here and I
> > > don't want to give him a workaround at the expense of actually finding and
> > > fixing a bug.
> > >
> > > The two things I know of that can cause message redelivery are 1) client
> > > disconnection with queues and durable topic subscriptions and 2) unhandled
> > > exceptions in the client message handler code.  James, might #2 be going on
> > > here?  And Gary (or anyone else), are there any other possible causes of
> > > redelivery that I don't know about?
> > >
> > > Tim
> > > On Apr 24, 2015 4:59 AM, "Gary Tully" <gary.tully@gmail.com> wrote:
> > >
> > >> to avoid the redelivered messages getting sent to the DLQ, changing
> > >> the default redelivery policy max from 6 to infinite will help.
> > >>
> > >> You can do this in the broker URL passed to the JMS connection factory.
> > >> It may also make sense to reduce the prefetch if consumers come and go
> > >> without consuming the prefetch, which seems to be the case.
> > >>
> > >>
> > >> tcp://..:61616?jms.prefetchPolicy.all=100&jms.redeliveryPolicy.maximumRedeliveries=-1
> > >>
> > >> On 23 April 2015 at 17:14, James Green <james.mk.green@gmail.com> wrote:
> > >> > Hi,
> > >> >
> > >> > We are not overriding so the defaults of 1s timeout on the receive() and
> > >> > 1,000 prefetch are in play.
> > >> >
> > >> > We are updating the connection URI to set a much higher timeout.
> > >> >
> > >> > Interestingly, PHP sending to the very same broker via STOMP gets send()
> > >> > failures with a 2 second timeout specified. With a 10 second timeout the
> > >> > frequency of this is reduced.
> > >> >
> > >> > I have fired up the latest hawt.io jar and connected to this broker,
> > >> > however the Health and Threads parts are entirely blank. The queues are all
> > >> > visible yet "browse" of ActiveMQ.DLQ shows none of the 3,000+ accumulated
> > >> > messages. Wondering where to go next?
> > >> >
> > >> > Thanks,
> > >> >
> > >> > James
> > >> >
> > >> >
> > >> > On 23 April 2015 at 13:35, Gary Tully <gary.tully@gmail.com> wrote:
> > >> >
> > >> >> what sort of timeout is on the receive(...) from spring dmlc, and what
> > >> >> is the prefetch for that consumer. It appears that the message is
> > >> >> getting dispatched but not consumed, the connection/consumer dies and
> > >> >> the message is flagged as a redelivery. then the before delivery check
> > >> >> on the delivery counter kicks the message to the dlq. So this must be
> > >> >> happening 6 times.
> > >> >>
> > >> >> I just pushed a tidy up of some of the redelivery semantics - there
> > >> >> was a bug there that would cause the redelivery counter to increment
> > >> >> in error... so that may be relevant[1].
> > >> >> A short term solution would be to ensure infinite or a very large
> > >> >> number of redeliveries, up from the default 6. That can be provided in
> > >> >> the broker url.
> > >> >>
> > >> >> [1] https://issues.apache.org/jira/browse/AMQ-5735
> > >> >>
> > >> >> On 23 April 2015 at 13:08, James Green <james.mk.green@gmail.com> wrote:
> > >> >> > We have a camel route consuming from ActiveMQ (5.10.0 with KahaDB) and
> > >> >> > frequently get a DLQ entry without anything logged through our
> > >> >> > errorHandler.
> > >> >> >
> > >> >> > The only thing we have to go on is a dlqFailureCause header which says:
> > >> >> >
> > >> >> > java.lang.Throwable: Exceeded redelivery policy limit:RedeliveryPolicy
> > >> >> > {destination = null, collisionAvoidanceFactor = 0.15, maximumRedeliveries = 6,
> > >> >> > maximumRedeliveryDelay = -1, initialRedeliveryDelay = 1000,
> > >> >> > useCollisionAvoidance = false, useExponentialBackOff = false,
> > >> >> > backOffMultiplier = 5.0, redeliveryDelay = 1000}, cause:null
> > >> >> >
> > >> >> > These are happening apparently at random. The route is marked transacted,
> > >> >> > and is backed by Spring Transactions itself backed by Narayana.
> > >> >> >
> > >> >> > Our debugging indicates that our route never receives the message from
> > >> >> > AMQ prior to it hitting the DLQ. We have switched on DEBUG logging for
> > >> >> > org.apache.activemq but other than being swamped with even more logs
> > >> >> > we've observed nothing notable.
> > >> >> >
> > >> >> > Any ideas where to go from here? Impossible to say which of the several
> > >> >> > thousand messages per day will go this way so an attached debugger is
> > >> >> > out of the question.
> > >> >> >
> > >> >> > Our log4j config fragment:
> > >> >> >
> > >> >> >         <Logger name="com" level="WARN"/>
> > >> >> >         <Logger name="org" level="WARN"/>
> > >> >> >         <Logger name="org.apache.camel" level="DEBUG"/>
> > >> >> >         <Logger name="org.apache.activemq" level="DEBUG"/>
> > >> >> >         <Logger name="org.springframework.orm.jpa" level="DEBUG"/>
> > >> >> >         <Logger name="org.springframework.transaction" level="DEBUG"/>
> > >> >> >
> > >> >> > Thanks,
> > >> >> >
> > >> >> > James
> > >> >>
> > >>
> >
>
