activemq-users mailing list archives

From "Ted C." <t...@verdiem.com>
Subject Re: NMS Failover transport pegging CPU
Date Mon, 08 Mar 2010 16:55:10 GMT

Tim --

Yes, it is at least one problem.  I've applied and tested a very small
targeted fix that seems to work.  The problem is that this code is
effectively a state machine, which I haven't had the time to fully
understand.  Making changes to that type of code always makes me nervous
because there are always more states than you realize at first.

I will include my patch with the bug but I would probably recommend not
taking it as a long-term solution.

For the next few weeks, I'm swamped with trying to get a release out the door
at work and finals at school.  After that, I may have a chance to look into
the failover code in more detail.

Thanks,

Ted C.


Timothy Bish wrote:
> 
> On Thu, 2010-03-04 at 17:03 -0800, Ted C. wrote:
>> I'm happy to do so, will probably be tomorrow.  Just as a note, I think that
>> I'm getting this because in FailoverTransport there's the following if:
>> 
>>                 if(ConnectedTransport != null || disposed || connectionFailure != null)
>>                 {
>>                     return false;
>>                 } 
>>                 else
>> 
>> 
>> 
>> It appears (as in I've seen this in a couple of iterations and haven't
>> gotten back to it yet) that connectionFailure is not null, so there's an
>> immediate return false and the loop spins.
>> 
>> Speaking of which, I'm not sure I see a way for connectionFailure to ever
>> become null again.  It appears that it's only assigned in the else part of
>> the if above.  Am I missing something?
>> 
>> Thanks,
> 
> It's quite possible that this is the problem.  I've not had time yet to
> test this.  The FailoverTransport code is in need of a code review, so
> it's not surprising there are some issues in there.  
> 
> Regards
> Tim.
> 
> 
>> 
>> Ted C.
>> 
>> 
>> 
>> Timothy Bish wrote:
>> > 
>> > On Tue, 2010-03-02 at 17:27 -0800, Ted C. wrote:
>> >> It appears that NMS is pegging the CPU.  In my scenario, there's one broker
>> >> running and the broker goes down.  When that happens, my CPU utilization goes
>> >> to 100% and never recovers.
>> >> 
>> >> When I break into the program, I see that FailoverTask.Iterate is getting
>> >> called frequently.  I ran it under dotTrace and got the following:
>> >> 
>> >> 32.70 % Thread #105762776 - 14308 ms - 0 calls
>> >>   32.70 % System.Threading._ThreadPoolWaitCallback.PerformWaitCallback... - 14308* ms - 0 calls
>> >>     32.70 % Run - 14308* ms - 0 calls - Apache.NMS.ActiveMQ.Threads.PooledTaskRunner.Run(Object)
>> >>       32.70 % RunTask - 14308* ms - 0 calls - Apache.NMS.ActiveMQ.Threads.PooledTaskRunner.RunTask()
>> >>         32.70 % Iterate - 14308* ms - 0 calls - Apache.NMS.ActiveMQ.Transport.Failover.FailoverTransport.FailoverTask.Iterate()
>> >>           23.59 % WaitOne - 10323* ms - 0 calls - System.Threading.WaitHandle.WaitOne()
>> >>           8.22 % ReleaseMutex - 3597 ms - 0 calls - System.Threading.Mutex.ReleaseMutex()
>> >>           0.67 % get_ConnectedTransport - 291 ms - 0 calls - Apache.NMS.ActiveMQ.Transport.Failover.FailoverTransport.get_ConnectedTransport()
>> >>           0.22 % DoConnect - 97 ms - 0 calls - Apache.NMS.ActiveMQ.Transport.Failover.FailoverTransport.DoConnect()
>> >> 
>> >> Anybody seen similar issues?  This is ActiveMQ 5.3 and NMS 1.2.0.
>> >> 
>> >> Thanks,
>> >> 
>> >> Ted C.
>> >> 
>> > 
>> > This isn't an issue that's been reported yet.  Could you raise a new
>> > Jira issue regarding this?  I'd expect that the initial failure would
>> > cause a spike in CPU but would expect that the reconnect delay would
>> > cause that to settle down as it increases.
>> > 
>> > Regards
>> > 
>> > 
>> > -- 
>> > Tim Bish
>> > 
>> > Open Source Integration: http://fusesource.com
>> > ActiveMQ in Action: http://www.manning.com/snyder/
>> > 
>> > Follow me on Twitter: http://twitter.com/tabish121
>> > My Blog: http://timbish.blogspot.com/
>> > 
>> > 
>> > 
>> 
> 
> 
> 
> 
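
For anyone following along, the spin described in the quoted message can be
sketched roughly as below.  This is not the NMS source: SketchTransport,
SpinDemo, and the retry loop are hypothetical stand-ins, and only the shape of
the guard is taken from the snippet quoted above.  The assumption is that
something keeps re-running the task while the transport is disconnected, and
that nothing ever resets connectionFailure.

```csharp
using System;

// Hypothetical stand-in, NOT the NMS FailoverTransport source.
public class SketchTransport
{
    public object ConnectedTransport;     // null while disconnected
    public bool Disposed;
    public Exception ConnectionFailure;   // never reset in this sketch
    public int ConnectAttempts;

    // Same shape as the guard quoted in the thread.
    public bool DoConnect()
    {
        if (ConnectedTransport != null || Disposed || ConnectionFailure != null)
        {
            return false;                 // immediate return: no attempt, no delay
        }

        ConnectAttempts++;
        // Simulate a broker that is down by recording the failure.
        ConnectionFailure = new Exception("broker unreachable");
        return false;
    }
}

public static class SpinDemo
{
    // Assumed caller: keeps invoking DoConnect while disconnected,
    // capped at maxIterations so the demo terminates.
    public static int Run(SketchTransport t, int maxIterations)
    {
        int iterations = 0;
        while (t.ConnectedTransport == null && !t.Disposed && iterations < maxIterations)
        {
            t.DoConnect();
            iterations++;
        }
        return iterations;
    }

    public static void Main()
    {
        var t = new SketchTransport();
        int iterations = Run(t, 100000);
        // prints "iterations=100000 attempts=1"
        Console.WriteLine("iterations=" + iterations + " attempts=" + t.ConnectAttempts);
    }
}
```

In this sketch only the first pass attempts a connection.  Every later pass
hits the guard and returns at once, so without the iteration cap the loop would
run flat out, which would be consistent with a pegged CPU.  Whether the real
task runner reschedules in this way is an assumption here.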


