activemq-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Justin Bertram <jbert...@apache.org>
Subject Re: [artemis 2.1.0] taking 30+ minutes to boot & failover
Date Fri, 20 Jul 2018 14:05:21 GMT
Analyzing thread dumps like this is pretty simple.  I generally just scroll
through and look for long stack-traces with lots of calls from
org.apache.activemq.artemis.  In your case every single thread dump has a
thread doing something like this:

"main" #1 prio=5 os_prio=0 tid=0x00007f902800eb20 nid=0x74fe runnable
[0x00007f9031675000]
   java.lang.Thread.State: RUNNABLE
    at
org.apache.activemq.artemis.core.settings.impl.HierarchicalObjectRepository.getPossibleMatches(HierarchicalObjectRepository.java:373)
    at
org.apache.activemq.artemis.core.settings.impl.HierarchicalObjectRepository.getMatch(HierarchicalObjectRepository.java:192)
    at
org.apache.activemq.artemis.core.paging.impl.PagingManagerImpl.reapplySettings(PagingManagerImpl.java:113)
    at
org.apache.activemq.artemis.core.paging.impl.PagingManagerImpl.onChange(PagingManagerImpl.java:108)
    at
org.apache.activemq.artemis.core.settings.impl.HierarchicalObjectRepository.onChange(HierarchicalObjectRepository.java:348)
    at
org.apache.activemq.artemis.core.settings.impl.HierarchicalObjectRepository.addMatch(HierarchicalObjectRepository.java:168)
    at
org.apache.activemq.artemis.core.settings.impl.HierarchicalObjectRepository.addMatch(HierarchicalObjectRepository.java:147)
    at
org.apache.activemq.artemis.core.settings.impl.HierarchicalObjectRepository.addMatch(HierarchicalObjectRepository.java:120)
    at
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.recoverStoredConfigs(ActiveMQServerImpl.java:2424)
    at
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.loadJournals(ActiveMQServerImpl.java:2374)
    at
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.initialisePart2(ActiveMQServerImpl.java:2219)
    - locked <0x0000000080a8fce8> (a
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl)
    at
org.apache.activemq.artemis.core.server.impl.SharedNothingLiveActivation.run(SharedNothingLiveActivation.java:109)
    at
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.internalStart(ActiveMQServerImpl.java:518)
    at
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.start(ActiveMQServerImpl.java:466)
    - locked <0x0000000080a8fce8> (a
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl)
    at
org.apache.activemq.artemis.integration.FileBroker.start(FileBroker.java:111)
    - locked <0x000000008098be80> (a
org.apache.activemq.artemis.integration.FileBroker)
    at org.apache.activemq.artemis.cli.commands.Run.execute(Run.java:73)
    at
org.apache.activemq.artemis.cli.Artemis.internalExecute(Artemis.java:148)
    at org.apache.activemq.artemis.cli.Artemis.execute(Artemis.java:95)
    at org.apache.activemq.artemis.cli.Artemis.execute(Artemis.java:122)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.activemq.artemis.boot.Artemis.execute(Artemis.java:129)
    at org.apache.activemq.artemis.boot.Artemis.main(Artemis.java:49)

In every case it's the "main" thread which isn't surprising as that is the
thread responsible for starting the broker.  Also, you can pretty clearly
see in the trace that this is the thread starting the broker, it's loading
the journals, & restoring stored configuration (either address settings or
security settings).  I've seen high broker start times when there are lots
and lots of addresses and lots of and lots of settings.  Do either (or
both) of these situations apply to you?


Justin

On Fri, Jul 20, 2018 at 1:16 AM, Dan Langford <danlangford@gmail.com> wrote:

> Dang I can’t easily upgrade past 2.1.0 because of the config-delete-queues
> deserialization bug introduced in 2.2.0. Unless that bug was squashed in
> 2.6+. I don’t think I made a jira for it (vacation and work load) but we
> discussed it back in April. I should go confirm that bug on 2.6 and make a
> jira for that
>
> Thanks
> On Thu, Jul 19, 2018 at 5:46 PM Clebert Suconic <clebert.suconic@gmail.com
> >
> wrote:
>
> > There is an issue I remember where the journal would have some dirt that
> > was fixed on 2.3/0.
> >
> > I would ipgrade to 2.6.2.
> >
> > On Thu, Jul 19, 2018 at 6:34 PM Dan Langford <danlangford@gmail.com>
> > wrote:
> >
> > > would you be willing to help me translate these thread dumps?
> > >
> > > i attached a Zip file with some thread dumps in them. i will also share
> > > the fasthread.io links for each file. (i was struggling getting
> > > fastthread to do a combo report with the threads in the correct order)
> > >
> > > artemis04-20180719-1525 https://goo.gl/d88azU
> > > artemis04-20180719-1530 https://goo.gl/G78qn3
> > > artemis04-20180719-1535 https://goo.gl/aMBSBw
> > > artemis04-20180719-1540 https://goo.gl/brKxxk
> > > artemis04-20180719-1545 https://goo.gl/RaXXCs
> > > artemis04-20180719-1550 https://goo.gl/r5dndK
> > > artemis04-20180719-1555 https://goo.gl/YJRLxe
> > >
> > > at :35, :45, and :55 the young+old gen space gets bigger than at the
> > other
> > > sample times. but i dont know what to look for in here to determine
> what
> > > the broker is actually during during this time.
> > >
> > > thanks
> > >
> > > On Fri, Jul 6, 2018 at 10:42 AM Justin Bertram <jbertram@apache.org>
> > > wrote:
> > >
> > >> The first place I would start is grabbing thread dumps every so often
> to
> > >> see what the broker is actually doing during the 30+ minutes.
> > >>
> > >>
> > >> Justin
> > >>
> > >> On Fri, Jul 6, 2018 at 11:34 AM, Dan Langford <danlangford@gmail.com>
> > >> wrote:
> > >>
> > >> > so my server startup times and failover times are growing pretty
> big.
> > >> but i
> > >> > dont really know where to start looking.
> > >> >
> > >> > here is a snippet of some logs to show you the time stamps:
> > >> >
> > >> > 08:11:31,801 INFO  [org.apache.activemq.artemis.
> integration.bootstrap]
> > >> > AMQ101000: Starting ActiveMQ Artemis Server
> > >> >
> > >> > 08:11:31,818 INFO  [org.apache.activemq.artemis.core.server]
> > AMQ221000:
> > >> > live Message Broker is starting with configuration Broker
> > Configuration
> > >> > (clustered=true,journalDirectory=./data/
> journal,bindingsDirectory=./
> > >> > data/bindings,largeMessagesDirectory=./data/large-messages,
> > >> > pagingDirectory=./data/paging)
> > >> >
> > >> > 08:11:34,462 INFO  [org.apache.activemq.artemis.core.server]
> > AMQ221012:
> > >> > Using AIO Journal
> > >> >
> > >> > 08:11:34,493 INFO  [org.apache.activemq.artemis.core.server]
> > AMQ221057:
> > >> > Global Max Size is being adjusted to 1/2 of the JVM max size (-Xmx).
> > >> being
> > >> > defined as 1,073,741,824
> > >> >
> > >> > 08:11:34,555 INFO  [org.apache.activemq.artemis.core.server]
> > AMQ221043:
> > >> > Protocol module found: [artemis-server]. Adding protocol support
> for:
> > >> CORE
> > >> >
> > >> > 08:11:34,555 INFO  [org.apache.activemq.artemis.core.server]
> > AMQ221043:
> > >> > Protocol module found: [artemis-amqp-protocol]. Adding protocol
> > support
> > >> > for: AMQP
> > >> >
> > >> > 08:11:34,556 INFO  [org.apache.activemq.artemis.core.server]
> > AMQ221043:
> > >> > Protocol module found: [artemis-hornetq-protocol]. Adding protocol
> > >> support
> > >> > for: HORNETQ
> > >> >
> > >> > 08:11:34,556 INFO  [org.apache.activemq.artemis.core.server]
> > AMQ221043:
> > >> > Protocol module found: [artemis-mqtt-protocol]. Adding protocol
> > support
> > >> > for: MQTT
> > >> >
> > >> > 08:11:34,556 INFO  [org.apache.activemq.artemis.core.server]
> > AMQ221043:
> > >> > Protocol module found: [artemis-openwire-protocol]. Adding protocol
> > >> support
> > >> > for: OPENWIRE
> > >> >
> > >> > 08:11:34,556 INFO  [org.apache.activemq.artemis.core.server]
> > AMQ221043:
> > >> > Protocol module found: [artemis-stomp-protocol]. Adding protocol
> > support
> > >> > for: STOMP
> > >> >
> > >> > 08:41:38,963 WARN  [org.apache.activemq.artemis.core.server]
> > AMQ222165:
> > >> No
> > >> > Dead Letter Address configured for queue DLQ in AddressSettings
> > >> >
> > >> > 08:41:38,963 WARN  [org.apache.activemq.artemis.core.server]
> > AMQ222166:
> > >> No
> > >> > Expiry Address configured for queue DLQ in AddressSettings
> > >> >
> > >> > 08:41:38,983 WARN  [org.apache.activemq.artemis.core.server]
> > AMQ222165:
> > >> No
> > >> > Dead Letter Address configured for queue ExpiryQueue in
> > AddressSettings
> > >> >
> > >> > 08:41:38,983 WARN  [org.apache.activemq.artemis.core.server]
> > AMQ222166:
> > >> No
> > >> > Expiry Address configured for queue ExpiryQueue in AddressSettings
> > >> >
> > >> > 08:41:38,984 WARN  [org.apache.activemq.artemis.core.server]
> > AMQ222165:
> > >> No
> > >> > Dead Letter Address configured for queue example in AddressSettings
> > >> >
> > >> > 08:41:38,985 WARN  [org.apache.activemq.artemis.core.server]
> > AMQ222166:
> > >> No
> > >> > Expiry Address configured for queue example in AddressSettings
> > >> >
> > >> > 08:41:38,985 WARN  [org.apache.activemq.artemis.core.server]
> > AMQ222165:
> > >> No
> > >> > Dead Letter Address configured for queue exampleQueue in
> > AddressSettings
> > >> >
> > >> > 08:41:38,986 WARN  [org.apache.activemq.artemis.core.server]
> > AMQ222166:
> > >> No
> > >> > Expiry Address configured for queue exampleQueue in AddressSettings
> > >> >
> > >> >
> > >> > and it continues. i have 138 queues. i wonder if i need to be
> looking
> > at
> > >> > PAGE configuration or some cache sizes. do i need to be looking at
> the
> > >> > number of messages persisted on these queues? where would you look
> to
> > >> > determine why the startup times are so long?
> > >> >
> > >> > also, i know i need to upgrade but i cannot upgrade off of 2.1.0 due
> > to
> > >> > some deseralization changes introduced in 2.2.0. i think i have
> > another
> > >> > thread on here (that i need to update) regarding that upgrade issue.
> > >> >
> > >> > thanks for any ideas or insight you have for me
> > >> >
> > >>
> > > --
> > Clebert Suconic
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message