activemq-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrei Bercu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ARTEMIS-1170) HA Replication Failback Failure - Sync Timed Out
Date Fri, 19 May 2017 08:25:04 GMT

    [ https://issues.apache.org/jira/browse/ARTEMIS-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017055#comment-16017055
] 

Andrei Bercu commented on ARTEMIS-1170:
---------------------------------------

There are various files and dirs in the data dir before I start. 
I ran 'du -h' before starting on both master and slave in the broker data/ dirs and these
are the results:
{noformat}
# on master broker data/ dir
4.0K    ./paging/oldreplica.71/2a283c01-3c66-11e7-a133-525400ec0962
4.0K    ./paging/oldreplica.71
4.0K    ./paging/oldreplica.72/502d34be-3c66-11e7-a0d4-525400ec0962
4.0K    ./paging/oldreplica.72/502ebb5f-3c66-11e7-a0d4-525400ec0962
8.0K    ./paging/oldreplica.72
4.0K    ./paging/2f5f8072-3c69-11e7-a79c-525400ec0962
4.0K    ./paging/2f601cb3-3c69-11e7-a79c-525400ec0962
4.0K    ./paging/2f6043c4-3c69-11e7-a79c-525400ec0962
24K     ./paging
31M     ./journal/oldreplica.80
31M     ./journal/oldreplica.81
101M    ./journal
6.0M    ./bindings/oldreplica.80
6.0M    ./bindings/oldreplica.81
19M     ./bindings
0       ./largemessages
120M    .

# on the slave broker data/ dir  
4.0K    ./paging/oldreplica.73/2a835426-3c66-11e7-b87d-5254003aec1e
4.0K    ./paging/oldreplica.73/4fb3beb9-3c66-11e7-b87d-5254003aec1e
8.0K    ./paging/oldreplica.73
4.0K    ./paging/oldreplica.74/51a5cabc-3c66-11e7-b87d-5254003aec1e
4.0K    ./paging/oldreplica.74/51a63fed-3c66-11e7-b87d-5254003aec1e
4.0K    ./paging/oldreplica.74/2e8944b1-3c69-11e7-b87d-5254003aec1e
12K     ./paging/oldreplica.74
4.0K    ./paging/6b0212db-3c69-11e7-9767-5254003aec1e
4.0K    ./paging/6b0260fc-3c69-11e7-9767-5254003aec1e
4.0K    ./paging/6b0260fd-3c69-11e7-9767-5254003aec1e
32K     ./paging
30M     ./journal/oldreplica.100
4.0K    ./journal/oldreplica.101
70M     ./journal
6.0M    ./bindings/oldreplica.87
5.0M    ./bindings/oldreplica.88
16M     ./bindings
0       ./largemessages
86M     .
{noformat}

The total data/ dir size doesn't change much during tests.

Also, I checked queue stats on master with the command below before starting tests and the
message count values are 0 for all queues. Here is also the complete json result:
{noformat}
$ curl "http://localhost:8161/jolokia/read/org.apache.activemq.artemis:*"

{"request":{"mbean":"org.apache.activemq.artemis:*","type":"read"},"value":{"org.apache.activemq.artemis:address=\"exampleQueue\",broker=\"localhost\",component=addresses":{"RoutingTypesAsJSON":"[\"ANYCAST\"]","AddressSize":0,"NumberOfBytesPerPage":10485760,"Address":"exampleQueue","Paging":false,"Roles":[],"RolesAsJSON":"[]","NumberOfPages":0,"BindingNames":["jms.queue.exampleQueue"],"QueueNames":["jms.queue.exampleQueue"],"RoutingTypes":["ANYCAST"],"NumberOfMessages":0,"MessageCount":0},"org.apache.activemq.artemis:address=\"exampleQueue\",broker=\"localhost\",component=addresses,queue=\"jms.queue.exampleQueue\",routing-type=\"anycast\",subcomponent=queues":{"FirstMessageAge":null,"DeadLetterAddress":null,"MessagesAdded":0,"MaxConsumers":-1,"ConsumerCount":0,"Address":"exampleQueue","PurgeOnNoConsumers":false,"DeliveringCount":0,"MessagesAcknowledged":0,"MessagesExpired":0,"ScheduledCount":0,"MessagesKilled":0,"Name":"jms.queue.exampleQueue","Temporary":false,"FirstMessageTimestamp":null,"Filter":null,"Durable":true,"ExpiryAddress":null,"ID":4,"FirstMessageAsJSON":"[{}]","MessageCount":0,"RoutingType":"ANYCAST","Paused":false},"org.apache.activemq.artemis:broker=\"localhost\"":{"ManagementAddress":"activemq.management","ConnectorServices":[],"JournalMinFiles":2,"Clustered":true,"FailoverOnServerShutdown":false,"JournalCompactMinFiles":10,"PersistDeliveryCountBeforeDelivery":false,"MessageExpiryThreadPriority":3,"ConnectionTTLOverride":-1,"PersistenceEnabled":true,"JournalCompactPercentage":30,"CreateBindingsDir":true,"ConnectionCount":3,"Version":"2.1.0","JournalDirectory":"\/var\/lib\/artemis\/shadowbroker\/data\/journal","UptimeMillis":613281,"ManagementNotificationAddress":"activemq.notifications","DivertNames":[],"JournalSyncNonTransactional":true,"TotalConnectionCount":3,"SharedStore":false,"QueueNames":["jms.queue.exampleQueue"],"PagingDirectory":"\/var\/lib\/artemis\/shadowbroker\/data\/paging","LargeMessagesDirectory":"\/var\/lib\/artemis\/shadowbroker\/data\/largemessages","JournalBufferSize":501760,"SecurityInvalidationInterval":10000,"MessageCounterSamplePeriod":10000,"TotalMessagesAdded":0,"TotalConsumerCount":0,"BridgeNames":[],"WildcardRoutingEnabled":true,"PersistIDCache":true,"AsyncConnectionExecutionEnabled":true,"Started":true,"TotalMessageCount":0,"CreateJournalDir":true,"GlobalMaxSize":1123024896,"Backup":false,"AddressNames":["test-load","exampleQueue"],"TotalMessagesAcknowledged":0,"JournalBufferTimeout":500000,"MaxDiskUsage":90,"IncomingInterceptorClassNames":[],"ReplicaSync":true,"NodeID":"396809f1-2407-11e7-813d-525400ec0962","ConnectorsAsJSON":"[{\"name\":\"netty-connector\",\"factoryClassName\":\"org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory\",\"params\":{\"port\":\"61616\",\"host\":\"xxx\"},\"extraProps\":{}},{\"name\":\"netty-replication-connector\",\"factoryClassName\":\"org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory\",\"params\":{\"port\":\"61616\",\"host\":\"xxx\"},\"extraProps\":{}}]","JournalFileSize":10485760,"TransactionTimeout":300000,"MessageCounterEnabled":false,"OutgoingInterceptorClassNames":[],"DiskScanPeriod":5000,"IDCacheSize":20000,"Uptime":"10
minutes","JournalSyncTransactional":true,"ScheduledThreadPoolMaxSize":5,"TransactionTimeoutScanPeriod":1000,"SecurityEnabled":true,"BindingsDirectory":"\/var\/lib\/artemis\/shadowbroker\/data\/bindings","Connectors":[["netty-connector","org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory",{"port":"61616","host":"xxx"}],["netty-replication-connector","org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory",{"port":"61616","host":"xxx"}]],"JournalMaxIO":500,"MessageExpiryScanPeriod":30000,"MessageCounterMaxDayCount":10,"JournalType":"ASYNCIO","ThreadPoolMaxSize":30},"org.apache.activemq.artemis:broker=\"localhost\",component=cluster-connections,name=\"my-cluster\"":{"Started":true,"Address":"jms","MessageLoadBalancingType":"ON_DEMAND","Topology":"topology
on Topology@421353f4[owner=ClusterConnectionImpl@1042306518[nodeUUID=396809f1-2407-11e7-813d-525400ec0962,
connector=TransportConfiguration(name=netty-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
?port=61616&host=qa-oc-txq1-ops-ag-com, address=jms, server=ActiveMQServerImpl::serverUUID=396809f1-2407-11e7-813d-525400ec0962]]:\n\t396809f1-2407-11e7-813d-525400ec0962
=> TopologyMember[id = 396809f1-2407-11e7-813d-525400ec0962, connector=Pair[a=TransportConfiguration(name=netty-connector,
factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?port=61616&host=qa-oc-txq1-ops-ag-com,
b=TransportConfiguration(name=netty-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
?port=61616&host=qa-oc-txq2-ops-ag-com], backupGroupName=null, scaleDownGroupName=null]\n\tnodes=2\tmembers=1","MaxHops":1,"Nodes":{},"Name":"my-cluster","DiscoveryGroupName":null,"DuplicateDetection":true,"StaticConnectors":["netty-replication-connector"],"NodeID":"396809f1-2407-11e7-813d-525400ec0962","RetryInterval":500,"StaticConnectorsAsJSON":"[\"netty-replication-connector\"]"},"org.apache.activemq.artemis:broker=\"localhost\",component=acceptors,name=\"artemis\"":{"FactoryClassName":"org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptorFactory","Started":true,"Parameters":{"tcpReceiveBufferSize":"1048576","port":"61616","host":"0.0.0.0","protocols":"CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE","tcpSendBufferSize":"1048576"},"Name":"artemis"},"org.apache.activemq.artemis:broker=\"localhost\",component=acceptors,name=\"stomp\"":{"FactoryClassName":"org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptorFactory","Started":true,"Parameters":{"tcpReceiveBufferSize":"1048576","port":"61613","host":"0.0.0.0","protocols":"STOMP","tcpSendBufferSize":"1048576"},"Name":"stomp"},"org.apache.activemq.artemis:address=\"test-load\",broker=\"localhost\",component=addresses":{"RoutingTypesAsJSON":"[\"ANYCAST\"]","AddressSize":0,"NumberOfBytesPerPage":10485760,"Address":"test-load","Paging":false,"Roles":[["test-user",true,true,true,true,true,true,false]],"RolesAsJSON":"[{\"name\":\"test-user\",\"send\":true,\"consume\":true,\"createDurableQueue\":true,\"deleteDurableQueue\":true,\"createNonDurableQueue\":true,\"deleteNonDurableQueue\":true,\"manage\":false,\"browse\":true,\"createAddress\":true,\"deleteAddress\":true}]","NumberOfPages":0,"BindingNames":[],"QueueNames":[],"RoutingTypes":["ANYCAST"],"NumberOfMessages":0,"MessageCount":0}},"timestamp":1495181906,"status":200}
{noformat}

> HA Replication Failback Failure  - Sync Timed Out
> -------------------------------------------------
>
>                 Key: ARTEMIS-1170
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-1170
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.1.0
>            Reporter: Andrei Bercu
>            Assignee: clebert suconic
>
> We are running artemis in a master/slave setup (1 master, 1 slave) using replication.
We had artemis 2.0.0 and reported this issue: [ARTEMIS-1161|https://issues.apache.org/jira/browse/ARTEMIS-1161]
. Following advice, we replaced artemis 2.0.0 with 2.1.0 and the failback still doesn't work,
failing in a different form though, in about 50% of our test experiments.
> Now, during failback, master reports:
> {noformat}
> 07:11:22,962 INFO  [org.apache.activemq.artemis.integration.bootstrap] AMQ101000: Starting
ActiveMQ Artemis Server
> 07:11:22,975 INFO  [org.apache.activemq.artemis.core.server] AMQ221000: live Message
Broker is starting with configuration Broker Configuration (clustered=true,journalDirectory=/var/lib/artemis/shadowbroker/data/journal,bindingsDirectory=/var/lib/artemis/shadowbroker/data/bindings,largeMessagesDirectory=/var/lib/artemis/shadowbroker/data/largemessages,pagingDirectory=/var/lib/artemis/shadowbroker/data/paging)
> 07:11:23,416 INFO  [org.apache.activemq.artemis.core.server] AMQ221055: There were too
many old replicated folders upon startup, removing /var/lib/artemis/shadowbroker/data/bindings/oldreplica.313
> 07:11:23,421 INFO  [org.apache.activemq.artemis.core.server] AMQ222162: Moving data directory
/var/lib/artemis/shadowbroker/data/bindings to /var/lib/artemis/shadowbroker/data/bindings/oldreplica.315
> 07:11:23,423 INFO  [org.apache.activemq.artemis.core.server] AMQ221055: There were too
many old replicated folders upon startup, removing /var/lib/artemis/shadowbroker/data/journal/oldreplica.313
> 07:11:23,426 INFO  [org.apache.activemq.artemis.core.server] AMQ222162: Moving data directory
/var/lib/artemis/shadowbroker/data/journal to /var/lib/artemis/shadowbroker/data/journal/oldreplica.315
> 07:11:23,427 INFO  [org.apache.activemq.artemis.core.server] AMQ221055: There were too
many old replicated folders upon startup, removing /var/lib/artemis/shadowbroker/data/paging/oldreplica.252
> 07:11:23,427 INFO  [org.apache.activemq.artemis.core.server] AMQ222162: Moving data directory
/var/lib/artemis/shadowbroker/data/paging to /var/lib/artemis/shadowbroker/data/paging/oldreplica.254
> 07:11:23,437 INFO  [org.eclipse.jetty.util.log] Logging initialized @1968ms to org.eclipse.jetty.util.log.Slf4jLog
> 07:11:23,448 INFO  [org.apache.activemq.artemis.core.server] AMQ221012: Using AIO Journal
> 07:11:23,533 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module
found: [artemis-server]. Adding protocol support for: CORE
> 07:11:23,533 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module
found: [artemis-amqp-protocol]. Adding protocol support for: AMQP
> 07:11:23,534 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module
found: [artemis-hornetq-protocol]. Adding protocol support for: HORNETQ
> 07:11:23,534 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module
found: [artemis-mqtt-protocol]. Adding protocol support for: MQTT
> 07:11:23,534 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module
found: [artemis-openwire-protocol]. Adding protocol support for: OPENWIRE
> 07:11:23,535 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module
found: [artemis-stomp-protocol]. Adding protocol support for: STOMP
> 07:11:23,591 INFO  [org.eclipse.jetty.server.Server] jetty-9.4.z-SNAPSHOT
> 07:11:23,643 INFO  [org.apache.activemq.artemis.core.server] AMQ221109: Apache ActiveMQ
Artemis Backup Server version 2.1.0 [null] started, waiting live to fail before it gets active
> 07:11:23,702 INFO  [org.eclipse.jetty.webapp.StandardDescriptorProcessor] NO JSP Support
for /jolokia, did not find org.eclipse.jetty.jsp.JettyJspServlet
> 07:11:23,713 INFO  [org.eclipse.jetty.server.session] DefaultSessionIdManager workerName=node0
> 07:11:23,714 INFO  [org.eclipse.jetty.server.session] No SessionScavenger set, using
defaults
> 07:11:23,715 INFO  [org.eclipse.jetty.server.session] Scavenging every 660000ms
> 07:11:23,764 INFO  [org.eclipse.jetty.ContextHandler.jolokia] jolokia-agent: No access
restrictor found, access to any MBean is allowed
> 07:11:23,910 INFO  [org.eclipse.jetty.server.handler.ContextHandler] Started o.e.j.w.WebAppContext@127a7a2e{/jolokia,file:///var/lib/artemis/shadowbroker/tmp/jetty-localhost-8161-jolokia.war-_jolokia-any-6088373165208569604.dir/webapp/,AVAILABLE}{/opt/artemis/web/jolokia.war}
> 07:11:23,911 INFO  [org.eclipse.jetty.server.handler.ContextHandler] Started o.e.j.s.h.ContextHandler@14008db3{/,file:///opt/artemis/web/,AVAILABLE}
> 07:11:23,925 INFO  [org.eclipse.jetty.server.AbstractConnector] Started ServerConnector@28d18df5{HTTP/1.1,[http/1.1]}{localhost:8161}
> 07:11:23,926 INFO  [org.eclipse.jetty.server.Server] Started @2459ms
> 07:11:23,927 INFO  [org.apache.activemq.artemis] AMQ241001: HTTP Server started at http://localhost:8161
> 07:11:23,927 INFO  [org.apache.activemq.artemis] AMQ241002: Artemis Jolokia REST API
available at http://localhost:8161/jolokia
> 07:11:24,745 INFO  [org.apache.activemq.artemis.core.server] AMQ221024: Backup server
ActiveMQServerImpl::serverUUID=47e70ff8-2b2c-11e7-84d3-525400ec0962 is synchronized with live-server.
> 07:11:24,760 INFO  [org.apache.activemq.artemis.core.server] AMQ221031: backup announced
> 07:12:54,442 INFO  [org.apache.activemq.artemis.core.server] AMQ221037: ActiveMQServerImpl::serverUUID=47e70ff8-2b2c-11e7-84d3-525400ec0962
to become 'live'
> 07:12:54,556 WARN  [org.apache.activemq.artemis.core.server] AMQ222165: No Dead Letter
Address configured for queue test-load in AddressSettings
> 07:12:54,556 WARN  [org.apache.activemq.artemis.core.server] AMQ222166: No Expiry Address
configured for queue test-load in AddressSettings
> 07:12:54,910 INFO  [org.apache.activemq.artemis.core.server] AMQ221007: Server is now
live
> 07:12:54,933 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started EPOLL
Acceptor at 0.0.0.0:61616 for protocols [CORE,MQTT,AMQP,STOMP,HORNETQ,OPENWIRE]
> 07:12:54,936 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started EPOLL
Acceptor at 0.0.0.0:61613 for protocols [STOMP]
> {noformat}
> \* Note the logs: _Backup server ActiveMQServerImpl::serverUUID=xxx is synchronized with
live-server_ / _backup announced_
> and slave reports:
> {noformat}
> 07:11:23,818 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication:
sending AIOSequentialFile:/var/lib/artemis/shadowbroker/data/journal/activemq-data-1759.amq
(size=10,485,760) to replica.
> 07:11:24,352 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication:
sending NIOSequentialFile /var/lib/artemis/shadowbroker/data/bindings/activemq-bindings-1448.bindings
(size=1,048,576) to replica.
> 07:11:24,366 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication:
sending NIOSequentialFile /var/lib/artemis/shadowbroker/data/bindings/activemq-bindings-1456.bindings
(size=1,048,576) to replica.
> 07:11:24,389 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication:
sending NIOSequentialFile /var/lib/artemis/shadowbroker/data/bindings/activemq-bindings-1447.bindings
(size=1,048,576) to replica.
> 07:12:54,399 WARN  [org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager]
AMQ119114: Replication synchronization process timed out after waiting 90,000 milliseconds:
java.lang.IllegalStateException: AMQ119114: Replication synchronization process timed out
after waiting 90,000 milliseconds
>         at org.apache.activemq.artemis.core.replication.ReplicationManager.sendSynchronizationDone(ReplicationManager.java:607)
[artemis-server-2.1.0.jar:2.1.0]
>         at org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager.startReplication(JournalStorageManager.java:579)
[artemis-server-2.1.0.jar:2.1.0]
>         at org.apache.activemq.artemis.core.server.impl.SharedNothingLiveActivation$2.run(SharedNothingLiveActivation.java:171)
[artemis-server-2.1.0.jar:2.1.0]
>         at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_92]
> 07:12:54,401 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure
has been detected: AMQ119014: Did not receive data from /172.23.12.5:37606 within the 60,000ms
connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
> 07:12:54,401 WARN  [org.apache.activemq.artemis.core.server] AMQ222092: Connection to
the backup node failed, removing replication now: ActiveMQConnectionTimedOutException[errorType=CONNECTION_TIMEDOUT
message=AMQ119014: Did not receive data from /172.23.12.5:37606 within the 60,000ms connection
TTL. The connection will now be closed.]
>         at org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$FailureCheckAndFlushThread$2.run(RemotingServiceImpl.java:723)
[artemis-server-2.1.0.jar:2.1.0]
>         at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:101)
[artemis-commons-2.1.0.jar:2.1.0]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[rt.jar:1.8.0_92]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[rt.jar:1.8.0_92]
>         at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_92]
> 07:12:54,406 WARN  [org.apache.activemq.artemis.core.server] AMQ222013: Error when trying
to start replication: java.lang.IllegalStateException: AMQ119114: Replication synchronization
process timed out after waiting 90,000 milliseconds
>         at org.apache.activemq.artemis.core.replication.ReplicationManager.sendSynchronizationDone(ReplicationManager.java:607)
[artemis-server-2.1.0.jar:2.1.0]
>         at org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager.startReplication(JournalStorageManager.java:579)
[artemis-server-2.1.0.jar:2.1.0]
>         at org.apache.activemq.artemis.core.server.impl.SharedNothingLiveActivation$2.run(SharedNothingLiveActivation.java:171)
[artemis-server-2.1.0.jar:2.1.0]
>         at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_92]
> 07:12:54,412 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure
has been detected: AMQ119014: Did not receive data from /172.20.98.64:45224 within the 60,000ms
connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
> 07:12:54,412 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 8431e0ae-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,416 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 8431e0ae-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,416 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 8432a3ff-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,418 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 8432a3ff-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,418 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 84338e60-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,421 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 84338e60-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,421 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 8435b141-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,422 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 8435b141-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,422 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 8436e9c2-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,424 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 8436e9c2-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,424 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 8437fb33-3af1-11e7-ac67-0acf5bbeb077
> 07:12:54,425 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 8437fb33-3af1-11e7-ac67-0acf5bbeb077
> 07:13:31,653 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 8431e090-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,654 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 8431e090-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,654 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 8432a3e1-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,655 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 8432a3e1-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,655 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 84336732-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,655 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 84336732-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,655 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 84340373-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,656 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 84340373-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,656 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 8435b124-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,656 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 8435b124-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,656 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection
failed, clearing up resources for session 84367475-3af1-11e7-b91a-0acf5bbeb077
> 07:13:31,663 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources
for session 84367475-3af1-11e7-b91a-0acf5bbeb077
> {noformat}
> \* Note the logs: _Replication synchronization process timed out after waiting 90,000
milliseconds_ / _Did not receive data from /xxx:37606 within the 60,000ms connection TTL_
> Same as before, master has the following ha-policy:
> {code:xml}
>   <ha-policy>
>      <replication>
>         <master>
>             <check-for-live-server>true</check-for-live-server>
>             <initial-replication-sync-timeout>90000</initial-replication-sync-timeout>
>         </master>
>      </replication>
>   </ha-policy>
> {code}
> and slave has:
> {code:xml}
>   <ha-policy>
>      <replication>
>         <slave>
>            <allow-failback>true</allow-failback>
>            <initial-replication-sync-timeout>90000</initial-replication-sync-timeout>
>         </slave>
>      </replication>
>   </ha-policy>
> {code}
> For running the tests, we send messages using the artemis producer - command:
> {noformat}
> ./artemis producer --url "tcp://xxx:61616?ha=true&retryInterval=1000&retryIntervalMultiplier=1.0&reconnectAttempts=-1"
--destination queue://test-load --message-count 60000 --user xxx --password xxx --sleep 200
--message-size 4000 --threads 5
> {noformat}
> and consume messages with the artemis consumer - command:
> {noformat}
> ./artemis consumer --url "tcp://xxx:61616?ha=true&retryInterval=1000&retryIntervalMultiplier=1.0&reconnectAttempts=-1"
--destination queue://test-load --threads 5 --message-count 60000 --user xxx --password xxx
> {noformat}
> Please let us know if we can provide some other info about this new issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message