Message-ID: <27824106.post@talk.nabble.com>
Date: Mon, 8 Mar 2010 08:55:10 -0800 (PST)
From: "Ted C."
To: users@activemq.apache.org
Subject: Re: NMS Failover transport pegging CPU
In-Reply-To: <1268064238.2530.2.camel@localhost>
References: <27763465.post@talk.nabble.com> <1267734735.2381.2.camel@localhost> <27788784.post@talk.nabble.com> <1268064238.2530.2.camel@localhost>

Tim --

Yes, it is at least one problem. I've applied and tested a very small,
targeted fix that seems to work. The problem is that this code is
effectively a state machine, which I haven't had the time to fully
understand. Making changes to that type of code always makes me nervous
because there are always more states than you realize at first. I will
include my patch with the bug, but I would recommend not taking it as a
long-term solution.

For the next few weeks, I'm swamped with trying to get a release out the
door at work and finals at school. After that, I may have a chance to
look into the failover code in more detail.

Thanks,

Ted C.


Timothy Bish wrote:
>
> On Thu, 2010-03-04 at 17:03 -0800, Ted C. wrote:
>> I'm happy to do so, will probably be tomorrow. Just as a note, I think
>> that I'm getting this because in FailoverTransport there's the
>> following if:
>>
>>     if(ConnectedTransport != null || disposed ||
>>        connectionFailure != null)
>>     {
>>         return false;
>>     }
>>     else
>>
>> It appears (as in I've seen this in a couple of iterations and haven't
>> gotten back to it yet) that connectionFailure is not null, so there's
>> an immediate return false and the loop spins.
>>
>> Speaking of which, I'm not sure I see a way for connectionFailure to
>> ever become null again. It appears that it's only assigned in the else
>> part of the if above. Am I missing something?
>>
>> Thanks,
>
> It's quite possible that this is the problem. I've not had time yet to
> test this. The FailoverTransport code is in need of a code review, so
> it's not surprising there are some issues in there.
>
> Regards
> Tim.
>
>>
>> Ted C.
>>
>> Timothy Bish wrote:
>> >
>> > On Tue, 2010-03-02 at 17:27 -0800, Ted C. wrote:
>> >> It appears that NMS is pegging the CPU. In my scenario, there's one
>> >> broker running and the broker goes down. When that happens, my CPU
>> >> utilization goes to 100% and never recovers.
>> >>
>> >> When I break into the program, I see that FailoverTask.Iterate is
>> >> getting called frequently. I ran it under dotTrace and got the
>> >> following:
>> >>
>> >> 32.70 % Thread #105762776 - 14308 ms - 0 calls
>> >>   32.70 % System.Threading._ThreadPoolWaitCallback.PerformWaitCallback... - 14308* ms - 0 calls
>> >>     32.70 % Run - 14308* ms - 0 calls - Apache.NMS.ActiveMQ.Threads.PooledTaskRunner.Run(Object)
>> >>       32.70 % RunTask - 14308* ms - 0 calls - Apache.NMS.ActiveMQ.Threads.PooledTaskRunner.RunTask()
>> >>         32.70 % Iterate - 14308* ms - 0 calls - Apache.NMS.ActiveMQ.Transport.Failover.FailoverTransport.FailoverTask.Iterate()
>> >>           23.59 % WaitOne - 10323* ms - 0 calls - System.Threading.WaitHandle.WaitOne()
>> >>           8.22 % ReleaseMutex - 3597 ms - 0 calls - System.Threading.Mutex.ReleaseMutex()
>> >>           0.67 % get_ConnectedTransport - 291 ms - 0 calls - Apache.NMS.ActiveMQ.Transport.Failover.FailoverTransport.get_ConnectedTransport()
>> >>           0.22 % DoConnect - 97 ms - 0 calls - Apache.NMS.ActiveMQ.Transport.Failover.FailoverTransport.DoConnect()
>> >>
>> >> Anybody seen similar issues? This is ActiveMQ 5.3 and NMS 1.2.0.
>> >>
>> >> Thanks,
>> >>
>> >> Ted C.
>> >>
>> >
>> > This isn't an issue that's been reported yet. Could you raise a new
>> > Jira issue regarding this? I'd expect that the initial failure would
>> > cause a spike in CPU, but I would expect the reconnect delay to make
>> > that settle down as the delay increases.
>> >
>> > Regards
>> >
>> > --
>> > Tim Bish
>> >
>> > Open Source Integration: http://fusesource.com
>> > ActiveMQ in Action: http://www.manning.com/snyder/
>> >
>> > Follow me on Twitter: http://twitter.com/tabish121
>> > My Blog: http://timbish.blogspot.com/
>> >
>

--
View this message in context: http://old.nabble.com/NMS-Failover-transport-pegging-CPU-tp27763465p27824106.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
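
The guard quoted in the thread can be modeled in isolation to show why a caller that retries immediately would spin. This is a minimal sketch and not the NMS source: the class ReconnectSketch, its fields, and the helpers ClearFailure, NextDelay, and CountWastedCalls are hypothetical names introduced for illustration. Only the shape of the if condition comes from the thread.

```csharp
using System;

// Hypothetical stand-in for a failover-style reconnect guard.
// It is NOT the Apache.NMS FailoverTransport; only the condition in
// TryConnect mirrors the snippet quoted in the thread.
public class ReconnectSketch
{
    private object connectedTransport;    // null while disconnected
    private bool disposed;                // never set in this sketch
    private Exception connectionFailure;  // stays set until explicitly cleared

    public int Attempts { get; private set; }

    public bool IsConnected
    {
        get { return connectedTransport != null; }
    }

    public void RecordFailure(Exception e)
    {
        connectionFailure = e;
    }

    // Assumed fix direction: give the state machine a way back to "no failure".
    public void ClearFailure()
    {
        connectionFailure = null;
    }

    public bool TryConnect()
    {
        Attempts++;
        if (connectedTransport != null || disposed || connectionFailure != null)
        {
            // Returns at once and changes no state, so a caller that
            // retries without waiting sees the same result every time.
            return false;
        }
        connectedTransport = new object(); // pretend the connect succeeded
        return true;
    }

    // Assumed backoff rule: double the delay up to a cap.
    public static int NextDelay(int currentMs, int maxMs)
    {
        return Math.Min(currentMs * 2, maxMs);
    }

    // Counts how many calls are wasted while a failure is recorded.
    public static int CountWastedCalls(int iterations)
    {
        ReconnectSketch sketch = new ReconnectSketch();
        sketch.RecordFailure(new Exception("broker down"));
        int wasted = 0;
        for (int i = 0; i < iterations; i++)
        {
            if (!sketch.TryConnect())
            {
                wasted++;
            }
        }
        return wasted;
    }
}
```

In this model every call made while connectionFailure is set is wasted, which matches the observation in the thread that the field never seems to become null again. Clearing the failure, or sleeping for a growing delay between attempts, are the two obvious ways out; which one suits the real state machine is the open question raised above.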