activemq-users mailing list archives

From Tim Bain <tb...@alumni.duke.edu>
Subject Re: Thousands of workers waiting to receive jobs but ActiveMQ isn’t sending them.
Date Wed, 11 Mar 2015 19:56:14 GMT
There isn't a stock slow consumer strategy.  If you didn't configure it,
you don't have one.

On Wed, Mar 11, 2015 at 1:49 PM, Kevin Burton <burton@spinn3r.com> wrote:

> I didn’t configure one… so we’re running the stock one. I’m trying to
> verify if it’s just a specific task not committing messages but for some
> reason it processes them in batches.
>
> If it wasn’t committing the messages I would expect it to lock up
> permanently.
>
> One thing that could be nice for ActiveMQ is a way to easily write a report
> on potential problems with the stack.
>
> So for example, if you have slow consumers or consumers that have been
> pending an ack for a LONG time this would be printed in the report.
>
> This way if the report comes back with *nothing* you can be fairly certain
> it’s a bug in your code.
>
> But right now what happens is that ActiveMQ refuses to deliver messages and
> I have no easy way to tell *why*.
>
>
>
> On Wed, Mar 11, 2015 at 12:40 PM, Tim Bain <tbain@alumni.duke.edu> wrote:
>
> > Out of curiosity, which slow consumer strategy did you configure?
> >
> > On Wed, Mar 11, 2015 at 11:45 AM, Kevin Burton <burton@spinn3r.com> wrote:
> >
> > > The problem is that this is happening in production but not locally.
> > >
> > > Setting prefetchPolicy to zero fixed it for one of our tasks, but another
> > > one of our tasks is still waiting for work and I can’t figure out why.
> > >
> > > It has 6000 consumers / threads, 200k messages waiting to be processed,
> > > but none are being executed.
> > >
> > > The workers themselves are all waiting for ActiveMQ to send the messages.
> > >
> > > And none of our connections are marked slow (I think, still verifying but
> > > it takes forever).
> > >
> > > On Wed, Mar 11, 2015 at 5:33 AM, Tim Bain <tbain@alumni.duke.edu> wrote:
> > > >
> > > > I'd definitely set breakpoints and step through with a debugger to try
> > > > to figure out what's going on.
> > > > On Mar 10, 2015 8:19 PM, "Kevin Burton" <burton@spinn3r.com> wrote:
> > > >
> > > > > This is exceedingly bizarre.  Now ActiveMQ is refusing to deliver ANY
> > > > > messages to my workers.
> > > > >
> > > > > This is very bizarre, no code has changed.  Nothing.  It’s just
> > > > > refusing to give work.
> > > > >
> > > > > If I set the prefetch to 0 or 1, it does work for a few moments, then
> > > > > halts.
> > > > >
> > > > > 99% certain I’m committing all my messages.  If I weren’t, it would
> > > > > make sense that nothing could be processed after that, of course.
> > > > >
> > > > > Kevin
> > > > >
> > > > >
> > > > > On Tue, Mar 10, 2015 at 6:53 PM, Tim Bain <tbain@alumni.duke.edu> wrote:
> > > > > >
> > > > > > If you make a single consumer, you'll only get one message at a time
> > > > > > by default (so only one thread will be doing any work).  You'd have
> > > > > > to use client acknowledgement or selective acknowledgement to get
> > > > > > more than one message at a time.  I'd probably leave many consumers
> > > > > > but tune down your prefetch buffers to something relatively small to
> > > > > > ensure that the workload is evenly spread and you don't have some
> > > > > > consumers with a large prefetch buffer worth of backlog while others
> > > > > > sit around idle.
> > > > > >
> > > > > > But if you're seeing the broker report pending messages, then that
> > > > > > means that having unbalanced workloads due to large prefetch buffers
> > > > > > isn't your problem...  Pending messages on the broker only occur when
> > > > > > those messages can't be dispatched to any consumer because all of
> > > > > > their prefetch buffers are full, which would mean that you don't have
> > > > > > unbalanced workloads.  You'll get one or the other, not both.
> > > > > >
> > > > > > On Tue, Mar 10, 2015 at 7:42 PM, Kevin Burton <burton@spinn3r.com> wrote:
> > > > > > >
> > > > > > > I’m actually wondering if this is my issue. I’m creating one session
> > > > > > > per thread.  So perhaps some of the threads have work to do, but
> > > > > > > they’re each prefetching a bunch of work when in reality a better
> > > > > > > strategy might be to have one master listener and then dispatch
> > > > > > > messages to each thread.
> > > > > > >
> > > > > > > On Tue, Mar 10, 2015 at 6:37 PM, Kevin Burton <burton@spinn3r.com> wrote:
> > > > > > > >
> > > > > > > > OK.  That’s good to know.  I have a large number of connections so
> > > > > > > > I have to look at each one.  I wonder if this could also be the
> > > > > > > > issue.  AKA too many connections.
> > > > > > > >
> > > > > > > > On Tue, Mar 10, 2015 at 6:19 PM, Tim Bain <tbain@alumni.duke.edu> wrote:
> > > > > > > >
> > > > > > > >> You should be able to confirm that the prefetch buffers are empty
> > > > > > > >> by inspecting the JMX MBeans on the broker.  Look at the consumers
> > > > > > > >> for the destination, and for each one look at its
> > > > > > > >> DispatchedQueueSize attribute.
> > > > > > > >>
> > > > > > > >> Keep in mind that slow consumers are identified *ONLY* if you
> > > > > > > >> configure one of the abort strategies.  If you didn't set that up,
> > > > > > > >> don't expect any slow consumer identification log lines.  And if
> > > > > > > >> you did, I've never seen a situation where a consumer went slow and
> > > > > > > >> a log line didn't happen (using the SlowConsumerAbortStrategy; I
> > > > > > > >> haven't used SlowAckConsumerAbortStrategy and can't vouch for it);
> > > > > > > >> we get those log lines pretty frequently.  So if you're not seeing
> > > > > > > >> broker-side log lines about consumers being identified as slow and
> > > > > > > >> then aborted, I'd bet it's simply not happening.
> > > > > > > >>
> > > > > > > >> On Tue, Mar 10, 2015 at 7:10 PM, Kevin Burton <burton@spinn3r.com> wrote:
> > > > > > > >>
> > > > > > > >> > The broker.  I’ll assume the prefetch buffers are empty.  I’m
> > > > > > > >> > looking into debugging that now but I don’t have tools to
> > > > > > > >> > introspect.
> > > > > > > >> >
> > > > > > > >> > The broker has thousands of messages.
> > > > > > > >> >
> > > > > > > >> > I just confirmed that a restart DOES improve the situation.
> > > > > > > >> >
> > > > > > > >> > It’s possible that they’re being marked as slow consumers but not
> > > > > > > >> > *logged* as such so I’m trying to use JMX to dump the sessions.
> > > > > > > >> >
> > > > > > > >> > On Tue, Mar 10, 2015 at 5:58 PM, Tim Bain <tbain@alumni.duke.edu> wrote:
> > > > > > > >> >
> > > > > > > >> > > Are the messages getting hung up in the broker or in the client?
> > > > > > > >> > > (Do the consumers have empty or full prefetch buffers?)
> > > > > > > >> > >
> > > > > > > >> > > On Tue, Mar 10, 2015 at 6:47 PM, Kevin Burton <burton@spinn3r.com> wrote:
> > > > > > > >> > >
> > > > > > > >> > > > I’m still trying to track down some issues with ActiveMQ …
> > > > > > > >> > > >
> > > > > > > >> > > > One is that I have 5 ActiveMQ servers now, and each one has
> > > > > > > >> > > > about 3000 messages pending.  So 15000 messages in queues.
> > > > > > > >> > > >
> > > > > > > >> > > > These are non-persistent queues, plenty of memory and plenty
> > > > > > > >> > > > of CPU, but the workers are just blocked waiting to receive
> > > > > > > >> > > > work.
> > > > > > > >> > > >
> > > > > > > >> > > > I had a hypothesis that this could be slow workers, but after
> > > > > > > >> > > > tuning some things I no longer receive any errors about slow
> > > > > > > >> > > > workers.
> > > > > > > >> > > >
> > > > > > > >> > > > Restarting the daemons doesn’t fix things either.  Anything
> > > > > > > >> > > > else it could be?  I’m a bit stumped unfortunately.
> > > > > > > >> > > >
> > > > > > > >> > > > --
> > > > > > > >> > > >
> > > > > > > >> > > > Founder/CEO Spinn3r.com
> > > > > > > >> > > > Location: *San Francisco, CA*
> > > > > > > >> > > > blog: http://burtonator.wordpress.com
> > > > > > > >> > > > … or check out my Google+ profile
> > > > > > > >> > > > <https://plus.google.com/102718274791889610666/posts>
> > > > > > > >> > > > <http://spinn3r.com>
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> > >
> >
>
>
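
The JMX check suggested earlier in the thread (look at the consumers for the
destination and read each one's DispatchedQueueSize attribute) could be
scripted roughly as below. This is only a sketch and was not posted to the
list. The ObjectName pattern is an assumption about how the broker names its
consumer MBeans, so confirm it in jconsole for your ActiveMQ version.
FakeConsumer is a hypothetical stand-in MBean that lets the example run
without a broker; only the DispatchedQueueSize attribute name comes from the
thread.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

class DispatchedQueueReport {

    // ASSUMPTION: consumer (subscription) MBeans live in this domain and carry
    // an endpoint=Consumer key. Adjust after checking the names in jconsole.
    static final String CONSUMER_PATTERN =
            "org.apache.activemq:type=Broker,endpoint=Consumer,*";

    /** Reads DispatchedQueueSize from every MBean whose name matches the pattern. */
    static Map<String, Integer> dispatchedSizes(MBeanServerConnection conn, String pattern)
            throws Exception {
        Map<String, Integer> sizes = new TreeMap<>();
        for (ObjectName name : conn.queryNames(new ObjectName(pattern), null)) {
            Object value = conn.getAttribute(name, "DispatchedQueueSize");
            sizes.put(name.getCanonicalName(), ((Number) value).intValue());
        }
        return sizes;
    }

    // Hypothetical stand-in for a broker-side consumer MBean (demo only).
    public interface FakeConsumerMBean {
        int getDispatchedQueueSize();
    }

    public static class FakeConsumer implements FakeConsumerMBean {
        private final int size;

        public FakeConsumer(int size) {
            this.size = size;
        }

        @Override
        public int getDispatchedQueueSize() {
            return size;
        }
    }

    public static void main(String[] args) throws Exception {
        // Demo against the local platform MBean server with two fake consumers:
        // one with an empty prefetch buffer, one holding dispatched messages.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.registerMBean(new FakeConsumer(0), new ObjectName(
                "org.apache.activemq:type=Broker,brokerName=demo,endpoint=Consumer,consumerId=idle"));
        server.registerMBean(new FakeConsumer(1000), new ObjectName(
                "org.apache.activemq:type=Broker,brokerName=demo,endpoint=Consumer,consumerId=busy"));

        // One line per consumer: dispatched count, then the MBean name.
        dispatchedSizes(server, CONSUMER_PATTERN)
                .forEach((name, size) -> System.out.println(size + "\t" + name));
    }
}
```

Against a real broker, the MBeanServerConnection would instead come from
javax.management.remote.JMXConnectorFactory.connect(...) using the broker's
JMX service URL. A consumer whose count stays at zero while the queue reports
pending messages is the "empty prefetch buffer" case asked about above.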
