From: Tim Bain
Date: Fri, 24 Apr 2015 15:27:44 -0600
Subject: Re: I can finally somewhat duplicate the bad bug I’m seeing with ActiveMQ not dispatching messages.
To: ActiveMQ Users
In-Reply-To: <553AAFAE.2050606@gmail.com>

If every message has at least one consumer for which the consumer's
selector matches the message, you'll eventually process every message.
Consumers that have no messages matching their selector in the cursor will
be delayed until the messages in front of their next one get consumed, but
they'll do it eventually; I don't think you'd have a complete failure to
process messages as Kevin described. (Or maybe I'm reading the wrong thing
into Kevin's description. Kevin, can you confirm that you're getting NO
messages to ANY consumer on your queue?)

Kevin, your screenshots didn't come through in my email client (Gmail) nor
on the Nabble page; can you resend so we can see what you're seeing in JMX?

Also, given the way cursors work, implementing priority using selectors is
never going to work.
At best you'll only be able to prioritize among the first N messages in
the store at any point in time (with N = the number of messages that will
fit into the cursor), which will eventually leave the cursor holding only
the N lowest-priority messages, so you'll process the low-priority
messages while your high-priority consumers sit unable to reach the
high-priority messages deeper in the store. If you want to use selectors
to implement priority, you're going to have to implement the enhancements
to cursors that Jon and I were talking about on Wednesday.

Tim

On Fri, Apr 24, 2015 at 3:03 PM, Timothy Bish wrote:

> On 04/24/2015 04:50 PM, Kevin Burton wrote:
> > I’ve been working 15 hour days for the last 2-3 weeks trying to resolve
> > this so if this is somewhat incoherent it’s probably due to lack of sleep
> > :-P
> >
> > I think we’re experiencing a bug in ActiveMQ which is VERY hard to
> > reproduce but happens regularly in our production setup.
> >
> > I can’t reproduce it in my test setup because it seems to require real
> > world data. Every time I try to do so everything works fine.
> >
> > It seems you have to have the following:
> >
> > - a large number of queues which need servicing (> 1000)
> > - a fairly large number of connections (> 2000)
> > - message selectors
> > - a queue that has a large number of messages (5000).
> >
> > I have my test code now reproducing it…
> >
> > Everything works FINE if we have just a few messages. The problems arise
> > once the queue size grows, at which point selectors don’t work.
> >
> > It seems like *early* connections win. If I create a connection to
> > ActiveMQ early, and keep it open, it will work. But new connections don’t
> > work. Eventually, the existing connections will fail too.
> >
> > Basically, it works JUST FINE without message selectors.
> >
> > I KNOW it’s not my code because I’ve written a basic/simple consumer
> > which is literally just raw JMS and is < 50 lines of code.
> >
> > I also know my message selectors should match. First, they do match some
> > percentage of the time. Second, when I consume without the message
> > selectors, it works. I have it print the message headers and I can
> > confirm that they should match.
> >
> > This also seems to get worse over time. The larger the queue, the less
> > chance messages will be serviced; eventually it will just lock up
> > entirely.
> >
> >
> > There are no obvious errors in the ActiveMQ log. Just entries regarding
> > queue GC.
> >
> > The box still has about 40% memory free. So I don’t think it has any
> > issue with memory. No OutOfMemoryErrors being logged.
> >
> > I think another way to debug this could be to restart ActiveMQ itself
> > with message tracing. Then try to get the queue to this state again, and
> > try to consume messages and see what’s being logged while it’s failing.
> >
> > What’s frustrating here is that this is the 3rd ActiveMQ workaround I’ve
> > had to implement.
> >
> > The first was because LevelDB was very slow… (artificially slow it
> > seems), so then I decided to just use the memory store. But the memory
> > store doesn’t support priority, so instead, I implemented priority
> > through JMS selectors. But now JMS selectors don’t work.
> >
> > :-/
> >
> This sounds a lot like the standard issue of having a deep queue and the
> message selector not being able to match because the maxPageSize value
> is limiting what the message cursor will page in. Have you tried upping
> the maxPageSize option? See:
> https://issues.apache.org/jira/browse/AMQ-2217
>
> --
> Tim Bish
> Sr Software Engineer | RedHat Inc.
> tim.bish@redhat.com | www.redhat.com
> twitter: @tabish121
> blog: http://timbish.blogspot.com/
>
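
For anyone following the thread, the paging behaviour described above can be sketched with a toy model in plain Java. This is not ActiveMQ code, and the real broker is more involved (prefetch, several cursor types, and so on). The class name `PagedQueueSketch`, the method `drain`, and the "low"/"high" labels are all hypothetical; the only assumption carried over from the thread is that the broker can dispatch only from a window of at most maxPageSize messages paged in from the head of the store.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy model of a paged queue with selector-based consumers (not ActiveMQ code). */
public class PagedQueueSketch {

    /**
     * Dispatches messages until nothing in the paged-in window matches any consumer.
     *
     * @param store       message labels in arrival order (stand-in for a header value)
     * @param maxPageSize how many messages the window may hold at once
     * @param wanted      the labels that the connected consumers select on
     * @return the labels in the order they were dispatched
     */
    public static List<String> drain(List<String> store, int maxPageSize, List<String> wanted) {
        Deque<String> backlog = new ArrayDeque<>(store);
        List<String> window = new ArrayList<>();
        List<String> dispatched = new ArrayList<>();
        boolean progress = true;
        while (progress) {
            // Page in from the head of the store until the window is full.
            while (window.size() < maxPageSize && !backlog.isEmpty()) {
                window.add(backlog.poll());
            }
            // Dispatch every message in the window that some consumer selects.
            progress = false;
            for (Iterator<String> it = window.iterator(); it.hasNext();) {
                String message = it.next();
                if (wanted.contains(message)) {
                    dispatched.add(message);
                    it.remove();
                    progress = true;
                }
            }
        }
        return dispatched;
    }

    public static void main(String[] args) {
        List<String> store = List.of("low", "low", "low", "high", "high");
        // Only a "high" consumer, small window: the window fills with "low" and stalls.
        System.out.println(drain(store, 2, List.of("high")));        // []
        // Same consumer, window large enough to reach the "high" messages.
        System.out.println(drain(store, 5, List.of("high")));        // [high, high]
        // Both consumers connected: everything is processed, but in store order.
        System.out.println(drain(store, 2, List.of("high", "low"))); // [low, low, low, high, high]
    }
}
```

In this model the first call stalls because no selector matches anything in the window, which is the situation the maxPageSize suggestion addresses. The third call shows why selectors make a poor priority mechanism: every message is eventually processed, but the "high" messages wait behind the "low" ones ahead of them in the store.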