activemq-users mailing list archives

From Kevin Burton <bur...@spinn3r.com>
Subject Re: I can finally somewhat duplicate the bad bug I’m seeing with ActiveMQ not dispatching messages.
Date Fri, 24 Apr 2015 21:47:25 GMT
OK.. but what I don’t understand is that I have at least ONE consumer that
matches. So that one should keep being served (though with imperfect
priority)

Also, as a test, I shut down ALL consumers on the broker, then just had ONE
consumer, and the selector was

artemis_priority <= 9

… and this should have matched ANYTHING.  But it didn’t work.
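
For readers following along: the paging behaviour that Tim Bish and Tim Bain describe in the quoted replies below can be illustrated with a toy model. This is only a sketch under stated assumptions, not ActiveMQ code. The function `simulate`, the `(id, priority)` message tuples, the consumer names, and the page size of 200 are all hypothetical and chosen for illustration. The model assumes the cursor pages messages into a window of at most `max_page_size` entries, and that a message leaves the window only when some consumer with a matching selector and free capacity takes it.

```python
from collections import deque

def simulate(messages, consumers, max_page_size):
    """Toy model of a paged queue with selector consumers (not ActiveMQ code).

    consumers maps a name to (selector, capacity). capacity is the number of
    messages a stalled consumer will hold (its prefetch), or None for a
    consumer that acknowledges promptly and can always take more.
    Returns (received, undelivered_count).
    """
    store = deque(messages)   # messages still in the store, in queue order
    window = []               # paged-in messages, at most max_page_size
    received = {name: [] for name in consumers}
    progressed = True
    while progressed:
        # Page in from the store only while the window has room.
        while store and len(window) < max_page_size:
            window.append(store.popleft())
        progressed = False
        for msg in list(window):
            for name, (selector, capacity) in consumers.items():
                has_room = capacity is None or len(received[name]) < capacity
                if has_room and selector(msg):
                    received[name].append(msg)
                    window.remove(msg)
                    progressed = True
                    break
    return received, len(window) + len(store)

# 1000 messages as (id, priority): 300 of priority 9, then 700 of priority 1.
messages = [(i, 9 if i < 300 else 1) for i in range(1000)]
low_priority = lambda m: m[1] <= 5
high_priority = lambda m: m[1] == 9

# Only a low-priority consumer: the window fills with unmatched messages.
small, _ = simulate(messages, {"fast": (low_priority, None)}, 200)
large, _ = simulate(messages, {"fast": (low_priority, None)}, 1000)
print(len(small["fast"]), len(large["fast"]))  # 0 700

# Every message matches some consumer, but one consumer is stalled.
both, left = simulate(
    messages,
    {"stalled": (high_priority, 10), "fast": (low_priority, None)},
    200,
)
print(len(both["stalled"]), len(both["fast"]), left)  # 10 0 990
```

In this model the second case stalls even though every message matches some selector, which is the situation Tim Bain describes. It does not explain the single-consumer test with a match-everything selector described above.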



On Fri, Apr 24, 2015 at 2:30 PM, Tim Bain <tbain@alumni.duke.edu> wrote:

> Keep in mind that a pause as described by that JIRA could come about
> because your consumer has a full prefetch buffer worth of messages that
> match the selector plus lots more messages in the store.  If you have a
> backlog for any consumer, anything that can't fit in the consumer's
> prefetch buffer will hang out in the cursor and eventually the message
> store (outside, and blocked by, the cursor).  It's not necessary to have
> messages that fail to match any selector, though that will certainly
> produce the behavior too.
>
> Tim
>
> On Fri, Apr 24, 2015 at 3:21 PM, Kevin Burton <burton@spinn3r.com> wrote:
>
> > Literally JUST found this issue!
> >
> > Is this documented anywhere? My issue is that there *is* no sparse
> > message distribution.  Every message has a value between 0 and 9, with
> > none lacking that header.
> >
> > I even consume where the message is lacking the value.
> >
> > So there shouldn’t be anything left over.
> >
> > I think ActiveMQ should probably log an error when this happens.
> >
> > On Fri, Apr 24, 2015 at 2:03 PM, Timothy Bish <tabish121@gmail.com> wrote:
> >
> > > On 04/24/2015 04:50 PM, Kevin Burton wrote:
> > > > I’ve been working 15-hour days for the last 2-3 weeks trying to
> > > > resolve this, so if this is somewhat incoherent it’s probably due to
> > > > lack of sleep :-P
> > > >
> > > > I think we’re experiencing a bug in ActiveMQ which is VERY hard to
> > > > reproduce but happens regularly in our production setup.
> > > >
> > > > I can’t reproduce it in my test setup because it seems to require
> > > > real-world data.  Every time I try to do so, everything works fine.
> > > >
> > > > It seems you have to have the following:
> > > >
> > > > - a large number of queues which need servicing ( > 1000)
> > > > - a fairly large number of connections (>2000)
> > > > - message selectors
> > > > - a queue that has a large number of messages (5000).
> > > >
> > > > I have my test code now reproducing it…
> > > >
> > > > Everything works FINE if we have just a few messages.  The problems
> > > > arise once the queue size grows, at which point selectors don’t work.
> > > >
> > > > It seems like *early* connections win.  If I create a connection to
> > > > ActiveMQ early, and keep it open, it will work. But new connections
> > > > don’t work.  Eventually, the existing connections will fail too.
> > > >
> > > > Basically, it works JUST FINE without message selectors.
> > > >
> > > > I KNOW it’s not my code because I’ve written a basic/simple consumer
> > > > which is literally just raw JMS and is < 50 lines of code.
> > > >
> > > > I also know my message selectors should match.  First, they do match
> > > > some percentage of the time. Second, when I consume without the
> > > > message selectors, it works.  I have it print the message headers and
> > > > I can confirm that they should match.
> > > >
> > > > This also seems to get worse over time.  The larger the queue, the
> > > > less chance messages will be serviced; eventually it will just lock
> > > > up entirely.
> > > >
> > > >
> > > > There are no obvious errors in the ActiveMQ log, just entries
> > > > regarding queue GC.
> > > >
> > > > The box still has about 40% memory free, so I don’t think it has any
> > > > issue with memory.  No OutOfMemoryErrors are being logged.
> > > >
> > > > I think another way to debug this could be to restart ActiveMQ itself
> > > > with message tracing, then try to get the queue to this state again,
> > > > and try to consume messages and see what’s being logged while it’s
> > > > failing.
> > > >
> > > > What’s frustrating here is that this is the 3rd ActiveMQ workaround
> > > > I’ve had to implement.
> > > >
> > > > The first was because LevelDB was very slow… (artificially slow, it
> > > > seems), so then I decided to just use the memory store.  But the
> > > > memory store doesn’t support priority, so instead, I implemented
> > > > priority through JMS selectors.  But now JMS selectors don’t work.
> > > >
> > > > :-/
> > > >
> > > This sounds a lot like the standard issue of having a deep queue and
> > > the message selector not being able to match because the maxPageSize
> > > value is limiting what the message cursor will page in.  Have you
> > > tried upping the maxPageSize option?  See:
> > > https://issues.apache.org/jira/browse/AMQ-2217
> > >
> > > --
> > > Tim Bish
> > > Sr Software Engineer | RedHat Inc.
> > > tim.bish@redhat.com | www.redhat.com
> > > twitter: @tabish121
> > > blog: http://timbish.blogspot.com/
> > >
> > >
> >
> >
> > --
> >
> > Founder/CEO Spinn3r.com
> > Location: *San Francisco, CA*
> > blog: http://burtonator.wordpress.com
> > … or check out my Google+ profile
> > <https://plus.google.com/102718274791889610666/posts>
> > <http://spinn3r.com>
> >
>



-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
