activemq-users mailing list archives

From Tim Bain <tb...@alumni.duke.edu>
Subject Re: Async Writer Thread Shutdown error adding a message on 5.15.2 broker
Date Tue, 13 Feb 2018 06:40:35 GMT
Neon,

Since 5.15.3 was just released and you'd no longer need to compile from
source, would it be possible to upgrade to it to see whether Gary's hunch
about the root cause was correct and the problem is fixed in 5.15.3?

Tim

On Tue, Jan 16, 2018 at 4:02 AM, Gary Tully <gary.tully@gmail.com> wrote:

> Hi Tim,
> the regression caused references to batches of serialised messages destined
> for the journal to be retained by the page cache, potentially leading to an
> OOM.
> The temp store used to cursor non-persistent messages uses the same journal
> logic and would be affected in the same way.
> The trigger for my hunch was that it only occurred on 5.15.x.
>
> On Fri, 12 Jan 2018 at 13:38 Tim Bain <tbain@alumni.duke.edu> wrote:
>
> > Gary,
> >
> > Can you elaborate on what about this situation looks like that problem?
> > Because based on the information given so far, it looked to me like the
> > OP was simply overdriving the storage to which the temp files are
> > written.
> >
> > Tim
> >
> > On Jan 12, 2018 4:04 AM, "Gary Tully" <gary.tully@gmail.com> wrote:
> >
> > > this looks like an instance of the regression from
> > > https://issues.apache.org/jira/browse/AMQ-6815
> > >
> > > On Wed, 10 Jan 2018 at 19:42 neon18 <neon18@nngo.net> wrote:
> > >
> > > > We run the broker with a max heap of 4G and initial heap of 1G
> > > > (-Xmx4G -Xms1G). We use non-persistent messages on these particular
> > > > queues (3 of them in this test).
> > > > The number of messages sent to the broker in my last "flood gate"
> > > > test was around 40,000 (40K) in 5 minutes, or about 8K msgs/min.
> > > > After this flood of messages, the producers send messages at a much
> > > > lower rate. I have pretty much the factory-default activemq.xml with
> > > > systemUsage/memoryUsage/percentOfJvmHeap=70 and queuePrefetch=20 on
> > > > these 3 queues.
> > > >
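[Editor's note: the settings named above map onto activemq.xml roughly as in this minimal sketch. It is not the poster's actual file; the brokerName, the tempUsage limit, and the queue=">" wildcard are assumptions, and queuePrefetch is set via a per-destination policyEntry.]

```xml
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="localhost">

  <!-- Broker memory limits; percentOfJvmHeap=70 means 70% of -Xmx -->
  <systemUsage>
    <systemUsage>
      <memoryUsage>
        <memoryUsage percentOfJvmHeap="70"/>
      </memoryUsage>
      <!-- tmp_storage (PListStore) for cursored non-persistent messages -->
      <tempUsage>
        <tempUsage limit="50 gb"/>
      </tempUsage>
    </systemUsage>
  </systemUsage>

  <!-- Per-destination settings; ">" matches all queues -->
  <destinationPolicy>
    <policyMap>
      <policyEntries>
        <policyEntry queue=">" queuePrefetch="20"/>
      </policyEntries>
    </policyMap>
  </destinationPolicy>

</broker>
```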
> > > > So I have seen two different scenarios when lots of non-persistent
> > > > messages are put on the queues:
> > > > 1. Async Writer Thread Shutdown errors (with no prior
> > > > warnings/errors), then OutOfMemoryErrors.
> > > > 2. INFO PListStore: ...tmp_storage initialized, then ~10 seconds
> > > > later WARN: IOException: OutOfMemoryError: GC overhead limit
> > > > exceeded ... ActiveMQ Transport: tcp:..., then that repeats and
> > > > other errors follow. There are also no warnings/errors prior to the
> > > > tmp_storage init INFO log msg. FYI: the web console was responsive
> > > > until I saw the tmp_storage initialized (KahaDB) INFO msg (~4.5
> > > > minutes into my test), then it stopped responding. The last count of
> > > > messages on the queues via the web console was ~30K msgs under the
> > > > ActiveMQ 5.15.2 broker. Under the 5.14.5 broker, I was able to see
> > > > the flood of ~40K msgs added to the 3 queues in ~6 minutes.
> > > >
> > > > In more controlled testing over the past 2 days, where I clear the
> > > > AMQ_DATA dir before each test run, I have not seen issue #1 (Async
> > > > Writer Thread Shutdown / OutOfMemoryError). I see issue #2
> > > > (OutOfMemoryError) a few seconds after KahaDB tmp_storage is
> > > > initialized; then the web console stops responding and there are
> > > > lots of OoM errors and other errors in the activemq.log.
> > > >
> > > > Running with the ActiveMQ 5.14.5 and 5.12.2 brokers, we do not get
> > > > any OutOfMemoryErrors with this same load or a higher load vs
> > > > running under ActiveMQ 5.15.2. Running with the 5.15.2 broker, it
> > > > seems like there might be an issue with throttling the producers of
> > > > the queue when the JVM hits the configured memoryUsage (default of
> > > > 70%).
> > > >
> > > > Running on that thought, I did another test with
> > > > systemUsage/memoryUsage/percentOfJvmHeap=50, but the same thing
> > > > happened (except that the OoM error occurs about 20 seconds after
> > > > the tmp_storage init info log).
> > > >
> > > > So, I ran the test again with systemUsage/memoryUsage at 20%; same
> > > > thing, except the OoM error occurs about 40 seconds after the
> > > > tmp_storage init info log. This time, I also monitored the memory
> > > > percent used and temp memory percent used via the web console. Right
> > > > after I see the tmp_storage init info log, I can see memUsed=39%
> > > > tempUsed=1%; ~10 seconds later, memUsed=56% tempUsed=2%; ~10 seconds
> > > > later, memUsed=69% tempUsed=2%; then the next refresh failed, and of
> > > > course in the activemq.log I see the OutOfMemoryErrors and other
> > > > warnings and errors appearing.
> > > >
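[Editor's note: as a rough sanity check on these runs, the broker's effective memoryUsage limit is simply percentOfJvmHeap applied to the max heap. A minimal sketch of that arithmetic for the -Xmx4G heap and the three percentages tried in this thread; the function name is the editor's, not an ActiveMQ API.]

```python
def memory_limit_bytes(max_heap_gb: float, percent_of_jvm_heap: int) -> int:
    """Effective broker memoryUsage limit: percentOfJvmHeap of the max heap."""
    return int(max_heap_gb * 1024**3 * percent_of_jvm_heap / 100)

# The three settings tried in this thread, against the -Xmx4G heap
for pct in (70, 50, 20):
    mib = memory_limit_bytes(4, pct) // 1024**2
    print(f"percentOfJvmHeap={pct} -> ~{mib} MiB broker memory limit")
```

Note this limits only the broker's message memory; lowering it does not shrink the 4G JVM heap, which is consistent with the OoM merely arriving later at 50% and 20%.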
> > > > Also, I grepped my old logs for "Journal failed" and did see some
> > > > results, but they happen after a few OutOfMemoryErrors, so I did not
> > > > include them in this thread.
> > > >
> > > > I can pretty reliably recreate the problem in about 6 minutes (with
> > > > a clean amq_data_dir) running the ActiveMQ 5.15.2 broker, and no
> > > > issues under the 5.14.5 or 5.12.2 brokers.
> > > >
> > > > Regards,
> > > >
> > > > Neon
> > > >
> > > >
> > > >
> > > > --
> > > > Sent from:
> > > > http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
> > > >
> > >
> >
>
