activemq-users mailing list archives

From Darren Govoni <dar...@ontrenet.com>
Subject Re: AMQ halts and crashes after few thousand reqs
Date Wed, 27 Feb 2013 13:35:14 GMT
That's good to hear. We tried to use AMQ (and still want it to be a 
solution for us!) a couple of years ago in our Amazon cluster. Using a 
network of brokers, we found that consumers would fall off and not 
reconnect, and on the producer side things would just come to a halt. 
We had to act quickly and unfortunately didn't have time to run actual 
experiments, but it was with a cluster of 50 or so servers and a master unit.

All 50 servers had about 6-10 threads pushing messages on and pulling 
them off. It would go for a few thousand cycles and then the whole thing 
would halt.
After reading the OP's description of a recent AMQ and their tests, it 
sounded very similar to our experience a couple of years ago. I think we 
might have been using MySQL for persistence off and on as well.

Perhaps there's some configuration magic, backends, and other things 
that "can work", but out of the box we had trouble. Others have too.
I'm sure there's a reason the OP is having problems. Would love to 
understand what they are so we can come back to AMQ.
But the OP's documentation of the problem seems very well put together 
which leads us to think AMQ still has scalability issues.

Maybe you can share some details on how you were able to get it to work 
at scale?

With Regards,
Darren


On 02/27/2013 04:41 AM, Gaurav Sharma wrote:
> Wish I could put some of Hiram's benchmark ActiveMQ tests' perf stats up here from
> even one of our dev clusters (forget about stress or prod envs), but company policy doesn't
> permit me to share. One of my prod clusters does Apple push notifications among other things.
> It has been up and invisible for more than a hundred days and delivered X hundreds of millions
> of events during this time with flat system graphs and no Nagios alerts (not even warnings),
> just a testament to how rock solid and reliable ActiveMQ is if you know what you are doing.
>
> And I should share my viewpoint since you talk about Mongo: we have tens of TBs of data
> in Mongo and no, we do not love to operate it at scale. Distributed queueing is hard. If
> possible, push distribution down a tier or two to persistent/disk storage and minimize opportunities
> to have to deal with consensus protocols. I have nothing but great things to say about ActiveMQ.
> Yes, it can be sped up even further if you strip it down and do append-only formats like Kafka,
> but then forget about all the cool features, implementing the JMS specs, etc. There are many
> trade-offs to be made, and Apollo seems headed in that direction (Dejan: please correct me
> if I misstated this).
>
> When you say you "failed to get it to scale", can you share some specifics, so some
> of us on here can help? What were your targets and how did you fail?
>
> Mandar: a 1gig heap is not very much unless you can tune the occupancy fraction appropriately
> for your workload type, besides other things. Also, when you mention "linux with quadcore
> + 8GB RAM", is that a server-grade machine that you use or your laptop? I have trouble getting
> anything less than 48g 1/2U's because now I need to play games with the collectors running
> out of steam at 16-18g heaps.
>
>
> On Feb 26, 2013, at 19:09, Darren Govoni <darren@ontrenet.com> wrote:
>
> Unfortunately, getting AMQ to show production-grade scalability and reliability is a
> real challenge.
> We tried and failed to get it to scale or perform acceptably and were forced to write
> our own distributed queue on top of mongodb.
>
> The addition of AMQP is nice, however.
>
> On 02/15/2013 05:44 AM, mandar.wanpal wrote:
>> Hi All,
>>
>> We are seeing some serious issues with AMQ in a few of our load tests.
>>
>> We have configured our AMQ with the configs below.
>>
>> Heap size increased to 1GB
>> JMX port opened for AMQ.
>> jms.prefetchPolicy.all=10
>> constantPendingMessageLimitStrategy=50.
>> -XX:PermSize=128m -XX:MaxPermSize=128m
>>
>> We have AMQ with KahaDB in simple failover settings, so if AMQ1 fails, AMQ2
>> takes over. Messages are also not huge in size.
>>
>> Observations:
>> 1. If heap size is set to 512 MB, AMQ fails after some 7500 reqs and switches to
>> AMQ2. AMQ2 is also not able to continue because the producer is not able to
>> initiate proper communication with AMQ2, maybe because of the backlog of
>> messages that AMQ1 didn't accept.
>> 2. If heap size is set to 1 GB, AMQ fails after some 15000 reqs and switches to
>> AMQ2. AMQ2 is also not able to continue because the producer is not able to
>> initiate proper communication with AMQ2, maybe because of the backlog of
>> messages that AMQ1 didn't accept.
>> 3. When AMQ fails, we start getting OutOfMemory errors and AMQ starts doing
>> Full GC continuously. As Full GC halts the JVM, AMQ halts and can't do anything.
>> 4. After seeing so many Full GCs, we took a heap dump and analysed it with
>> Eclipse. PFA the report, which states a few suspected areas that are causing a leak
>> in AMQ.
>>
>> AMQ_Leak_Report.pdf
>> <http://activemq.2283324.n4.nabble.com/file/n4663532/AMQ_Leak_Report.pdf>
>>
>> 5. The frequency of Full GCs increases with time, and the amount of memory they
>> can reclaim decreases.
>>
>> Queries:
>> What would be the ideal config for AMQ to process at least up to 1 lakh (100,000) reqs
>> without requiring a restart and without setting the heap to some gigantic size?
>> I have linux with quadcore + 8GB RAM.
>>
>>
>>
>> -----
>> Regards,
>> Mandar Wanpal
>> Email: mandar.wanpal@gmail.com
>> --
>> View this message in context: http://activemq.2283324.n4.nabble.com/AMQ-halts-and-crashes-after-few-thousand-reqs-tp4663532.html
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
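
The two heap-size observations in the original report (failure after about 7500 reqs at 512 MB and after about 15000 reqs at 1 GB) can be turned into a rough per-request figure. The sketch below is only back-of-the-envelope arithmetic under a loud assumption: that the entire heap is filled by memory retained per request, ignoring the broker's baseline footprint. The variable and function names are made up for illustration and are not from the thread.

```python
# Heap sizes and approximate request counts at failure, as reported in the thread.
observations = [
    (512, 7_500),     # (heap in MB, requests before the broker failed)
    (1024, 15_000),
]


def heap_per_request_kb(heap_mb, requests):
    """Average heap retained per request in KB, ASSUMING the whole heap fills up."""
    return heap_mb * 1024 / requests


per_request = [heap_per_request_kb(h, r) for h, r in observations]

# Hypothetical extrapolation: heap needed for 1 lakh (100,000) requests
# if the same per-request retention simply continued.
target_requests = 100_000
projected_heap_gb = per_request[0] * target_requests / (1024 * 1024)

for (heap_mb, requests), kb in zip(observations, per_request):
    print(f"{heap_mb} MB heap, {requests} reqs: ~{kb:.1f} KB retained per request")
print(f"Projected heap for {target_requests} reqs: ~{projected_heap_gb:.1f} GB")
```

Both observations work out to the same per-request figure, so the number of requests survived appears to scale linearly with heap size. That pattern is consistent with memory being retained per message (as the leak report suspects) rather than with a fixed overhead, and it suggests that adding heap alone would only postpone the failure.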

