activemq-users mailing list archives

From Ilkka Virolainen <Ilkka.Virolai...@bitwise.fi>
Subject RE: Artemis 2.4.0 - Issues with memory leaks and JMS message redistribution
Date Thu, 01 Mar 2018 14:59:14 GMT
An update on this: I have replicated the memory and expiration issues on the current 2.5.0-SNAPSHOT
with the included client libraries and a single-node broker by modifying an existing Artemis example.
As messages are routed to the DLQ, paged and expired, memory consumption keeps increasing and
eventually exhausts the heap space, rendering the broker unable to route messages. What
should happen is that memory consumption stays reasonable: paging to disk should keep it bounded
even without expiration, and doubly so with it, because expired messages shouldn't consume
any resources at all.

I'm not certain whether the two issues (erroneous statistics on expiration and the memory leak)
are connected, but they do appear at the same time, which raises suspicion. A possible cause
could be that filtered message expiration behaves differently from other means of expiration:
it uses a private expiration method that takes a transaction as a parameter. Unlike the non-transacted
expiration method, it checks for empty bindings separately but doesn't seem to decrement the counters
appropriately in this case. Even though I have set a null expiry-address (<expiry-address />),
it is seen as non-null during expiration. Then, as the expiry address is not null but no bindings
are found, the warning about dropping the message is logged. However, it seems that the
message is never acknowledged and the DeliveringCount is never decreased, so the delivery
metrics end up being wrong.

Shouldn't the message reference be acknowledged right after the warning is logged, when the
following condition is matched?
https://github.com/apache/activemq-artemis/blob/master/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/impl/QueueImpl.java#L2735

Also, why is the acknowledgment reason here not expiry but normal? One would imagine it should
be acknowledge(tx, ref, AckReason.EXPIRED) instead of the default overload so that the appropriate
counters end up being incremented:
https://github.com/apache/activemq-artemis/blob/master/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/impl/QueueImpl.java#L2747
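To make the suspected accounting problem concrete, here is a toy model (plain Java, not Artemis code; the counter names only mirror the JMX attributes) of what I think happens when an expired message is dropped without an EXPIRED-reason acknowledgment:

```java
// Toy model, not Artemis code: illustrates how dropping an expired message
// without acknowledging it leaves DeliveringCount inflated and the expired
// counter untouched.
public class CounterModel {
    enum AckReason { NORMAL, EXPIRED }

    int deliveringCount;
    int expiredCount;

    void deliver() { deliveringCount++; }

    void acknowledge(AckReason reason) {
        deliveringCount--;
        if (reason == AckReason.EXPIRED) {
            expiredCount++;
        }
    }

    // Suspected current path: the warning is logged, but no acknowledgment
    // happens, so neither counter is touched.
    void expireWithNoBindings() { /* log only */ }

    public static void main(String[] args) {
        CounterModel buggy = new CounterModel();
        buggy.deliver();
        buggy.expireWithNoBindings();
        System.out.println("buggy: delivering=" + buggy.deliveringCount
                + " expired=" + buggy.expiredCount); // delivering=1 expired=0

        CounterModel suggested = new CounterModel();
        suggested.deliver();
        suggested.acknowledge(AckReason.EXPIRED); // the behavior I'd expect
        System.out.println("suggested: delivering=" + suggested.deliveringCount
                + " expired=" + suggested.expiredCount); // delivering=0 expired=1
    }
}
```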

Best regards,
- Ilkka

-----Original Message-----
From: Ilkka Virolainen [mailto:Ilkka.Virolainen@bitwise.fi] 
Sent: 27. helmikuuta 2018 15:20
To: users@activemq.apache.org
Subject: RE: Artemis 2.4.0 - Issues with memory leaks and JMS message redistribution

Hello,

- I don't have consumers on the DLQ, and none are listed in its JMX attributes either
- The messages are being sent to the DLQ by the broker after a delivery failure on another
queue. The delivery failure is expected and caused by a transactional rollback on the consumer.

- I am setting the expiry delay on the broker's DLQ address-settings (not in message attributes).
I'm setting an empty expiry-address in the same place.
- I have a set of broker settings and a small Spring Boot application with which I was able
to replicate the issue. Would you like me to provide them somehow?
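For reference, the relevant address-setting from my broker.xml (the same as in the full configuration quoted further down) looks like this:

    <address-setting match="DLQ">
        <!-- 100 * 1024 * 1024 -> 100MB -->
        <max-size-bytes>104857600</max-size-bytes>
        <!-- 1000 * 60 * 60 -> 1h -->
        <expiry-delay>3600000</expiry-delay>
        <expiry-address />
    </address-setting>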

It seems like there's some kind of hiccup in message expiration. When the messages routed
to the DLQ start expiring, the broker logs:

AMQ222146: Message has expired. No bindings for Expiry Address  so dropping it

but when reviewing the DLQ statistics via JMX, the ExpiredMessages counter is not incremented,
but the DeliveringCount is. As messages keep expiring, the DeliveringCount keeps increasing.
This feels a lot like the issue I've been having. Could it be that this process leaks memory/resources,
or is it just that the expiration statistics always assume that expiration results in redelivery,
thereby causing erroneous numbers to be reported?

Best regards,
- Ilkka


-----Original Message-----
From: Justin Bertram [mailto:jbertram@apache.org]
Sent: 23. helmikuuta 2018 16:51
To: users@activemq.apache.org
Subject: Re: Artemis 2.4.0 - Issues with memory leaks and JMS message redistribution

Couple of questions:

 - Do you have any consumers on the DLQ?
 - Are messages being sent to the DLQ by the broker automatically (e.g.
based on delivery attempt failures) or is that being done by your application?
 - How are you setting the expiry delay?
 - Do you have a reproducible test-case?


Justin

On Fri, Feb 23, 2018 at 4:38 AM, Ilkka Virolainen < Ilkka.Virolainen@bitwise.fi> wrote:

> I'm still facing an issue with somewhat confusing behavior regarding 
> message expiration in the DLQ, maybe related to the memory issues I've 
> been having. My aim is to have messages routed to DLQ expire and 
> dropped in one hour. To achieve this, I've set an empty expiry-address 
> and the appropriate expiry-delay. The problem is, most of the messages 
> routed to DLQ end up in an in-delivery state - they are not expiring 
> and I cannot remove them via JMX. Messagecount in the DLQ is slightly 
> higher than the deliveringcount and attempting to remove all messages 
> only removes a number of messages that is equal to the difference 
> between deliveringcount and messagecount which is approximately a few 
> thousand messages while the messagecount is tens of thousands and increasing
> as message delivery failures occur.
>
> What could be the reason for this behavior and how could it be avoided?
>
> -----Original Message-----
> From: Ilkka Virolainen [mailto:Ilkka.Virolainen@bitwise.fi]
> Sent: 22. helmikuuta 2018 13:38
> To: users@activemq.apache.org
> Subject: RE: Artemis 2.4.0 - Issues with memory leaks and JMS message 
> redistribution
>
> To answer my own question in case anyone else is wondering about a
> similar issue: it turns out the change in addressing is described in
> ticket [1], and adding the multicastPrefix and anycastPrefix described
> in the ticket to my broker acceptors seems to have fixed my problem.
> If the issue regarding memory leaks persists I will try to provide a
> reproducible test case.
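> For anyone finding this later, the acceptor change (per ARTEMIS-1644; the
> exact prefix values depend on what your clients expect) looks something
> like this in broker.xml:
>
>     <acceptor name="artemis">tcp://0.0.0.0:61616?anycastPrefix=jms.queue.;multicastPrefix=jms.topic.</acceptor>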
>
> Thank you for your help, Justin.
>
> Best regards,
> - Ilkka
>
> [1] https://issues.apache.org/jira/browse/ARTEMIS-1644
>
>
> -----Original Message-----
> From: Ilkka Virolainen [mailto:Ilkka.Virolainen@bitwise.fi]
> Sent: 22. helmikuuta 2018 12:33
> To: users@activemq.apache.org
> Subject: RE: Artemis 2.4.0 - Issues with memory leaks and JMS message 
> redistribution
>
> Having removed the address configuration and having switched from
> 2.4.0 to yesterday's snapshot of 2.5.0 it seems like the 
> redistribution of messages is now working, but there also seems to 
> have been a change in addressing between the versions causing another 
> problem related to jms.queue / jms.topic prefixing. While the NMS
> clients listen and the Artemis JMS clients send to the same topics as
> described in the previous message, Artemis 2.5.0 prefixes the
> addresses with jms.topic. Although the messages are being sent to e.g.
> A.B.f64dd592-a8fb-442e-826d-927834d566f4.C.D, they are only received if
> I explicitly prefix the listening address with jms.topic, for example
> topic://jms.topic.A.B.*.C.D. Can this somehow be avoided in the broker configuration?
>
> Best regards
>
> -----Original Message-----
> From: Justin Bertram [mailto:jbertram@apache.org]
> Sent: 21. helmikuuta 2018 15:19
> To: users@activemq.apache.org
> Subject: Re: Artemis 2.4.0 - Issues with memory leaks and JMS message 
> redistribution
>
> Your first issue is probably a misconfiguration.  Your
> cluster-connection is using an "address" value of '*' which I assume
> is supposed to mean "all addresses," but the "address" element doesn't
> support wildcards like this.
> Just leave it empty to match all addresses.  See the documentation [1] 
> for more details.
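> For example, something like this (sketch) in broker.xml:
>
>     <cluster-connection name="cluster-name">
>         <address></address>
>         ...
>     </cluster-connection>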
>
> Even after you fix that configuration issue you may run into issues.
> These may be fixed already via ARTEMIS-1523 and/or ARTEMIS-1680.  If 
> you have a reproducible test-case then you can verify using the head 
> of the master branch.
>
> For the memory issue it would be helpful to have some heap dumps or
> something to see what's actually consuming the memory.
> Better yet would be a reproducible test-case.  Do you have either?
>
>
> Justin
>
> [1] https://activemq.apache.org/artemis/docs/latest/clusters.html
>
>
>
> On Wed, Feb 21, 2018 at 5:39 AM, Ilkka Virolainen < 
> Ilkka.Virolainen@bitwise.fi> wrote:
>
> > Hello,
> >
> > I am using Artemis 2.4.0 to broker messages through JMS 
> > queues/topics between a set of clients. Some are Apache NMS 1.7.2 
> > ActiveMQ clients and others are using Artemis JMS client 1.5.4 
> > included in Spring Boot
> 1.5.3.
> > Broker topology is a symmetric cluster of two live nodes with static
> > connectors, both nodes set up as replicating colocated backup pairs
> > with scale-down. I have two quite frustrating issues at the moment:
> > message redistribution not working correctly and a memory leak
> > causing eventual thread death.
> >
> > ISSUE #1 - Message redistribution / load balancing not working:
> >
> > Client 1 (NMS) connects to broker a and starts listening, artemis 
> > creates the following address:
> >
> > (Broker a):
> > A.B.*.C.D
> > |-queues
> > |-multicast
> >   |-f64dd592-a8fb-442e-826d-927834d566f4
> >
> > Server 1 (artemis-jms-client) connects to broker b and sends a 
> > message to
> > topic: A.B.f64dd592-a8fb-442e-826d-927834d566f4.C.D - this should be 
> > routed to broker a since the corresponding queue has no consumers on 
> > broker b (the queue does not exist). This however does not happen 
> > and the client receives no messages. Broker b has some other clients 
> > connected, which has caused similar (but not identical) queues to be created:
> >
> > (Broker b):
> > A.B.*.C.D
> > |-queues
> > |-multicast
> >   |-1eb48079-7fd8-40e9-b822-bcc25695ced0
> >   |-9f295257-c352-4ae6-b74b-d5994f330485
> >
> >
> > ISSUE #2: - Memory leak and eventual thread death
> >
> > Artemis broker has 4GB allocated heap space and global-max-size is 
> > set up as half of that (being the default setting). 
> > Address-full-policy is set to PAGE for all addresses and some 
> > individual addresses have small max-size-bytes values set e.g. 
> > 104857600. As far as I know the paging settings should limit memory 
> > usage but what happens is that at times Artemis uses the whole heap 
> > space, encounters an out of memory error and
> > dies:
> >
> > 05:39:29,510 WARN  [org.eclipse.jetty.util.thread.QueuedThreadPool] :
> > java.lang.OutOfMemoryError: Java heap space
> > 05:39:16,646 WARN  [io.netty.channel.ChannelInitializer] Failed to 
> > initialize a channel. Closing: [id: ...]: java.lang.OutOfMemoryError:
> > Java heap space
> > 05:41:05,597 WARN  [org.eclipse.jetty.util.thread.QueuedThreadPool]
> > Unexpected thread death:
> > org.eclipse.jetty.util.thread.QueuedThreadPool$2@5ffaba31
> > in qtp20111564{STARTED,8<=8<=200,i=2,q=0}
> >
> > Are these known issues in Artemis or misconfigurations in the brokers?
> >
> > The broker configurations are as follows. Broker b has an identical 
> > configuration excluding that the cluster connector's connector-ref 
> > and static-connector connector-ref refer to broker b and broker a
> respectively.
> >
> > Best regards,
> >
> > broker.xml (broker a):
> >
> > <?xml version='1.0'?>
> > <configuration xmlns="urn:activemq"
> > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> > xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">
> >     <core xmlns="urn:activemq:core"
> > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> > xsi:schemaLocation="urn:activemq:core ">
> >         <name>[broker-a-ip]</name>
> >         <persistence-enabled>true</persistence-enabled>
> >
> >         <journal-type>NIO</journal-type>
> >
> >         <paging-directory>...</paging-directory>
> >         <bindings-directory>...</bindings-directory>
> >         <journal-directory>...</journal-directory>
> >         <large-messages-directory>...</large-messages-directory>
> >
> >         <journal-datasync>true</journal-datasync>
> >         <journal-min-files>2</journal-min-files>
> >         <journal-pool-files>-1</journal-pool-files>
> >         <journal-buffer-timeout>788000</journal-buffer-timeout>
> >         <disk-scan-period>5000</disk-scan-period>
> >
> >         <max-disk-usage>97</max-disk-usage>
> >
> >         <critical-analyzer>true</critical-analyzer>
> >         <critical-analyzer-timeout>120000</critical-analyzer-timeout>
> >         <critical-analyzer-check-period>60000</critical-analyzer-check-period>
> >         <critical-analyzer-policy>HALT</critical-analyzer-policy>
> >
> >         <acceptors>
> >             <acceptor name="invm-acceptor">vm://0</acceptor>
> >             <acceptor name="artemis">tcp://0.0.0.0:61616</acceptor>
> >             <acceptor name="ssl">tcp://0.0.0.0:61617?sslEnabled=true;keyStorePath=...;keyStorePassword=...</acceptor>
> >         </acceptors>
> >         <connectors>
> >             <connector name="invm-connector">vm://0</connector>
> >             <connector name="netty-connector">tcp://[broker-a-ip]:61616</connector>
> >             <connector name="broker-b-connector">[broker-b-ip]:61616</connector>
> >         </connectors>
> >
> >         <cluster-connections>
> >             <cluster-connection name="cluster-name">
> >                 <address>*</address>
> >                 <connector-ref>netty-connector</connector-ref>
> >                 <retry-interval>500</retry-interval>
> >                 <reconnect-attempts>5</reconnect-attempts>
> >                 <use-duplicate-detection>true</use-duplicate-detection>
> >                 <message-load-balancing>ON_DEMAND</message-load-balancing>
> >                 <max-hops>1</max-hops>
> >                 <static-connectors>
> >                     <connector-ref>broker-b-connector</connector-ref>
> >                 </static-connectors>
> >             </cluster-connection>
> >         </cluster-connections>
> >
> >         <ha-policy>
> >             <replication>
> >                 <colocated>
> >                     <backup-request-retry-interval>5000</backup-request-retry-interval>
> >                     <max-backups>3</max-backups>
> >                     <request-backup>true</request-backup>
> >                     <backup-port-offset>100</backup-port-offset>
> >                     <excludes>
> >                         <connector-ref>invm-connector</connector-ref>
> >                         <connector-ref>netty-connector</connector-ref>
> >                     </excludes>
> >                     <master>
> >                         <check-for-live-server>true</check-for-live-server>
> >                     </master>
> >                     <slave>
> >                         <restart-backup>false</restart-backup>
> >                         <scale-down />
> >                     </slave>
> >                 </colocated>
> >             </replication>
> >         </ha-policy>
> >
> >         <cluster-user>ARTEMIS.CLUSTER.ADMIN.USER</cluster-user>
> >         <cluster-password>[the shared cluster password]</cluster-password>
> >
> >         <security-settings>
> >             <security-setting match="#">
> >                 <permission type="createDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="deleteDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="createNonDurableQueue" roles="amq, 
> > other-role"  />
> >                 <permission type="createAddress" roles="amq, other-role" />
> >                 <permission type="deleteNonDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="deleteAddress" roles="amq, other-role" />
> >                 <permission type="consume" roles="amq, other-role" />
> >                 <permission type="browse" roles="amq, other-role" />
> >                 <permission type="send" roles="amq, other-role" />
> >                 <permission type="manage" roles="amq" />
> >             </security-setting>
> >             <security-setting match="A.some.queue">
> >                 <permission type="createNonDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="deleteNonDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="createDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="deleteDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="createAddress" roles="amq, other-role" />
> >                 <permission type="deleteAddress" roles="amq, other-role" />
> >                 <permission type="consume" roles="amq, other-role" />
> >                 <permission type="browse" roles="amq, other-role" />
> >                 <permission type="send" roles="amq, other-role" />
> >             </security-setting>
> >                 <security-setting match="A.some.other.queue">
> >                 <permission type="createNonDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="deleteNonDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="createDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="deleteDurableQueue" roles="amq, 
> > other-role" />
> >                 <permission type="createAddress" roles="amq, other-role" />
> >                 <permission type="deleteAddress" roles="amq, other-role" />
> >                 <permission type="consume" roles="amq, other-role" />
> >                 <permission type="browse" roles="amq, other-role" />
> >                 <permission type="send" roles="amq, other-role" />
> >             </security-setting>
> >             ...
> >             ... etc.
> >             ...
> >         </security-settings>
> >
> >         <address-settings>
> >             <address-setting match="activemq.management#">
> >                 <dead-letter-address>DLQ</dead-letter-address>
> >                 <expiry-address>ExpiryQueue</expiry-address>
> >                 <redelivery-delay>0</redelivery-delay>
> >                 <max-size-bytes>-1</max-size-bytes>
> >                 <message-counter-history-day-limit>10</message-counter-history-day-limit>
> >                 <address-full-policy>PAGE</address-full-policy>
> >             </address-setting>
> >             <!--default for catch all -->
> >             <address-setting match="#">
> >                 <dead-letter-address>DLQ</dead-letter-address>
> >                 <expiry-address>ExpiryQueue</expiry-address>
> >                 <redelivery-delay>0</redelivery-delay>
> >                 <max-size-bytes>-1</max-size-bytes>
> >                 <message-counter-history-day-limit>10</message-counter-history-day-limit>
> >                 <address-full-policy>PAGE</address-full-policy>
> >                 <redistribution-delay>1000</redistribution-delay>
> >             </address-setting>
> >             <address-setting match="DLQ">
> >                 <!-- 100 * 1024 * 1024 -> 100MB -->
> >                 <max-size-bytes>104857600</max-size-bytes>
> >                 <!-- 1000 * 60 * 60 -> 1h -->
> >                 <expiry-delay>3600000</expiry-delay>
> >                 <expiry-address />
> >             </address-setting>
> >             <address-setting match="A.some.queue">
> >                 <redelivery-delay-multiplier>1.0</redelivery-delay-multiplier>
> >                 <redelivery-delay>0</redelivery-delay>
> >                 <max-redelivery-delay>10</max-redelivery-delay>
> >             </address-setting>
> >                 <address-setting match="A.some.other.queue">
> >                 <redelivery-delay-multiplier>1.0</redelivery-delay-multiplier>
> >                 <redelivery-delay>0</redelivery-delay>
> >                 <max-redelivery-delay>10</max-redelivery-delay>
> >                 <max-delivery-attempts>1</max-delivery-attempts>
> >                 <max-size-bytes>104857600</max-size-bytes>
> >             </address-setting>
> >             ...
> >             ... etc.
> >             ...
> >         </address-settings>
> >
> >         <addresses>
> >             <address name="DLQ">
> >                 <anycast>
> >                     <queue name="DLQ" />
> >                 </anycast>
> >             </address>
> >             <address name="ExpiryQueue">
> >                 <anycast>
> >                     <queue name="ExpiryQueue" />
> >                 </anycast>
> >             </address>
> >             <address name="A.some.queue">
> >                 <anycast>
> >                     <queue name="A.some.queue">
> >                         <durable>true</durable>
> >                     </queue>
> >                 </anycast>
> >             </address>
> >             <address name="A.some.other.queue">
> >                 <anycast>
> >                     <queue name="A.some.other.queue">
> >                         <durable>true</durable>
> >                     </queue>
> >                 </anycast>
> >             </address>
> >             ...
> >             ... etc.
> >             ...
> >         </addresses>
> >     </core>
> > </configuration>
> >
>