activemq-users mailing list archives

From Tim Bain <tb...@alumni.duke.edu>
Subject Re: Producer Flow Control active but server still facing OOM issues
Date Fri, 16 Feb 2018 23:13:27 GMT
512 MB isn't very much memory for an ActiveMQ broker. I'm used to seeing
more like 2 GB, 4 GB, sometimes more when people describe how big their
heap is. If 512 MB works with Postgres, and you're good with using
Postgres, that's fine, but the general consensus is that KahaDB has had
more testing and is more performant than the SQL data store, so I
personally would run KahaDB even if it meant I had to use a heap larger
than 512 MB. So you might consider testing whether your scenario works with
KahaDB with a larger heap, and if so whether you want to use Postgres or
KahaDB.
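
To put the 512 MB figure in perspective, here is a rough back-of-the-envelope
sketch using the numbers reported further down in this thread (1.000.000
messages, 20KB each, 1 second per message, and a memory limit understood to be
70% of a 512MB heap). The assumptions are mine and not from the thread: 1 KB is
taken as 1024 bytes, per-message overhead is ignored, and the consumer count is
a free parameter because the thread never states it. The class and method names
are made up for illustration.

```java
/** Back-of-the-envelope numbers for the load described in this thread. */
final class LoadEstimate {
    static final long KB = 1024L;       // assumption: binary kilobytes
    static final long MB = 1024L * KB;

    /** Total payload bytes across all messages (headers and overhead ignored). */
    static long totalPayloadBytes(long messages, long payloadBytes) {
        return messages * payloadBytes;
    }

    /** Memory limit expressed as a whole percentage of the heap. */
    static long memoryLimitBytes(long heapBytes, int percent) {
        return heapBytes * percent / 100;
    }

    /** How many payloads fit under the limit, ignoring per-message overhead. */
    static long messagesThatFit(long limitBytes, long payloadBytes) {
        return limitBytes / payloadBytes;
    }

    /** Seconds to drain the backlog, rounded up; consumers is an assumed parameter. */
    static long drainSeconds(long messages, int consumers, long secondsPerMessage) {
        return (messages * secondsPerMessage + consumers - 1) / consumers;
    }

    public static void main(String[] args) {
        long payload = 20 * KB;
        long limit = memoryLimitBytes(512 * MB, 70);
        System.out.println("total payload MB: " + totalPayloadBytes(1_000_000, payload) / MB);
        System.out.println("memory limit bytes: " + limit);
        System.out.println("payloads under limit: " + messagesThatFit(limit, payload));
        System.out.println("drain seconds, 1 consumer: " + drainSeconds(1_000_000, 1, 1));
    }
}
```

Under those assumptions the limit covers roughly 18,350 payloads out of the
1.000.000 sent, so nearly all of the backlog has to sit in the persistent store
or be held back at the producers.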

Tim

On Feb 16, 2018 7:49 AM, "Thiago Veronezi" <thiago@veronezi.org> wrote:

Hi Tim,

Thanks for your time. I managed to make the broker very stable after moving
away from KahaDB and limiting the number of active connections at one time.
This is what I think happened.

* KahaDB competes with the broker for JVM resources. It does not matter how
much memory I reserve with "-Xmx", KahaDB eats it all up when dealing
with the 1.000.000 messages.
* My client code was wrong. I was creating connections in parallel like
crazy.

Just after sending the last message, I got an OOM exception, even with
sendFailIfNoSpace activated. The "maximumConnections=1000" was too much
for my 512MB, plus KahaDB was using almost all of it.
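
One way to stop client code from opening connections "in parallel like crazy"
is to cap how many can be open at once on the client side. The sketch below is
only an illustration with the standard library: the class and method names are
hypothetical, and the sleep is a placeholder for "open a connection, send,
close". Real client code would more likely reuse one long-lived connection or a
pooled connection factory.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Caps the number of simultaneously open "connections" with a semaphore. */
final class BoundedConnections {

    /** Runs `tasks` jobs on `threads` threads and returns the peak number open at once. */
    static int peakOpen(int tasks, int threads, int maxOpen) {
        Semaphore permits = new Semaphore(maxOpen);
        AtomicInteger open = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                try {
                    permits.acquire();            // wait until a slot is free
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    peak.accumulateAndGet(open.incrementAndGet(), Math::max);
                    Thread.sleep(1);              // placeholder: connect, send, close
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    open.decrementAndGet();
                    permits.release();            // free the slot for the next sender
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 200 sends on 32 threads, at most 8 open at any moment.
        System.out.println("peak open: " + peakOpen(200, 32, 8));
    }
}
```

The cap only protects the broker from this one client; the broker-side
"maximumConnections" setting is still what bounds the total.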

I need to check again the case where the memory goes back to normal but
activemq keeps denying new connections for some time. It was probably
something wrong with my client code. I will check this out as soon as I
can. I will post the news here.

The good news is that the AMQ+Postgres combination handles the 1.000.000
persistent messages like a charm.

[]s,
Thiago.



On Fri, Feb 16, 2018 at 12:12 PM, Tim Bain <tbain@alumni.duke.edu> wrote:

> This is only a partial answer (I'll try to get time this weekend to answer
> the parts I don't have time for now), but I want to get you something to
> start with.
>
> On Feb 15, 2018 5:03 AM, "Thiago Veronezi" <thiago@veronezi.org> wrote:
>
> Hi, ActiveMQ community,
>
> I'm actively working on documentation for "out of memory" protection on
> ActiveMQ. Recently I was working on a POC project where I stressed a
> default broker configuration with 1.000.000 messages with 20KB payload
> each, where each message took 1 second to be consumed. This caused the
> "Pending Messages" number to go up pretty fast.
>
>
> Are these persistent or non-persistent messages? How large (capacity) is
> your persistent store and your temp store?
>
> My understanding is that AMQ, out of the box, has the "Producer Flow
> Control" feature activated for all Topics and Queues, and that it has the
> "usedMemory" threshold set to 70% of 512MB.
>
>
> Did PFC kick in? You'd see it in the broker's logs.
>
> Still, with the load I used, I
> saw OOM issues. The 1.000.000 messages actually killed the server.
>
> In my tests, I use several threads and nodes to send all the 1.000.000
> messages in parallel. That means I have several connections to the broker.
> Once I used the sendFailIfNoSpace="true" option, the OOM issues ceased; the
> consumers were able to catch up, and the broker survived. One thing that I
> noticed is that even when the "Pending messages" number reached 0, it took
> some time for the server to allow new producer connections again.
>
>
> When it didn't allow new producer connections, what was the symptom?
>
> Questions:
>
> * Is it possible that AMQ doesn't count the memory used by each active
> connection as a variable in the final used-memory calculation?
>
>
> Yes. Those limits are solely on the memory message store (used for
> non-persistent messages and for paging in persistent messages from the
> persistent store), so it's possible to OOM even though you don't exceed
> those limits.
>
> * Is there any configuration where we set a refresh rate so the server
> notices faster when the memory is below the maximum threshold again?
>
>
> To the best of my knowledge, the metrics are captured instantaneously by
> modifying an object in memory, not via a periodic poll, so I think
> something else is going on. I'll come back to this.
>
> * Is the use of sendFailIfNoSpace="true" the ultimate solution for OOM
> issues? Is this something I can advise a customer to use so he is 99.9%
> guaranteed to not have OOM crashes?
>
>
> No. sendFailIfNoSpace just means that the client won't wait forever on a
> send. The only reason you didn't see OOMs when you used it is that
> you're not retrying when you catch the resulting exception.
>
> Thanks,
> Thiago.
>
> Ps.: I think this is my first message here. :)
>
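
To illustrate the point about sendFailIfNoSpace above: with that option the
failure surfaces on the client, and it is the client's retry policy that
decides how much pressure goes back onto the broker. The sketch below shows a
bounded retry with exponential backoff. Sender and NoSpaceException are
hypothetical stand-ins so that the example runs without a broker; in a real JMS
client the exception to catch would be whatever the send call throws (the
ActiveMQ documentation describes a javax.jms.ResourceAllocationException). This
is not an OOM guarantee, only a way to avoid hammering the broker.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Bounded retry with exponential backoff around a send that may be rejected. */
final class BoundedRetry {

    /** Hypothetical stand-in for the exception seen when the broker has no space. */
    static final class NoSpaceException extends Exception {
        NoSpaceException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for a JMS producer's send call. */
    interface Sender {
        void send(String body) throws NoSpaceException;
    }

    /** Returns the number of attempts used, or -1 if every attempt was rejected. */
    static int sendWithBackoff(Sender sender, String body, int maxAttempts, long initialDelayMs) {
        long delay = initialDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sender.send(body);
                return attempt;
            } catch (NoSpaceException e) {
                if (attempt == maxAttempts) {
                    break;                        // give up and let the caller decide
                }
                try {
                    Thread.sleep(delay);          // back off before trying again
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return -1;
                }
                delay *= 2;                       // double the wait each time
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated broker that rejects the first two sends.
        Sender flaky = body -> {
            if (calls.incrementAndGet() < 3) {
                throw new NoSpaceException("broker full");
            }
        };
        System.out.println("attempts used: " + sendWithBackoff(flaky, "hello", 5, 10));
    }
}
```

Returning -1 instead of retrying forever keeps the decision (drop, park the
message locally, alert) with the caller.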
