Date: Tue, 27 Feb 2018 03:05:00 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@activemq.apache.org
Reply-To: dev@activemq.apache.org
Subject: [jira] [Commented] (ARTEMIS-1700) Server stopped responding and killed itself while exiting paging state

[ https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377931#comment-16377931 ]

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user shoukunhuai commented on the issue:

https://github.com/apache/activemq-artemis/pull/1899

So it is a mistake to use the global thread pool instead of the IO thread pool for the page cursor.
But this does not fix our problem, as you can see:
```
"Thread-274672 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$5@4e91d63f)" Id=274703 TIMED_WAITING on java.util.concurrent.CountDownLatch$Sync@5c416651
    at sun.misc.Unsafe.park(Native Method)
    - waiting on java.util.concurrent.CountDownLatch$Sync@5c416651
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
    at org.apache.activemq.artemis.core.journal.impl.SimpleWaitIOCallback.waitCompletion(SimpleWaitIOCallback.java:73)
    at org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl.waitCompletion(OperationContextImpl.java:313)
    at org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.waitOnOperations(AbstractJournalStorageManager.java:294)
    at org.apache.activemq.artemis.core.paging.cursor.impl.PageCursorProviderImpl.storeBookmark(PageCursorProviderImpl.java:539)
    at org.apache.activemq.artemis.core.paging.cursor.impl.PageCursorProviderImpl.cleanupComplete(PageCursorProviderImpl.java:431)
    at org.apache.activemq.artemis.core.paging.cursor.impl.PageCursorProviderImpl.cleanup(PageCursorProviderImpl.java:383)
    -  locked org.apache.activemq.artemis.core.paging.cursor.impl.PageCursorProviderImpl@4f8d6d9a
    at org.apache.activemq.artemis.core.paging.cursor.impl.PageCursorProviderImpl$1.run(PageCursorProviderImpl.java:291)
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
    at org.apache.activemq.artemis.utils.actors.ProcessorBase$ExecutorTask.run(ProcessorBase.java:53)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

    Number of locked synchronizers = 2
    - java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@613d010
    - java.util.concurrent.ThreadPoolExecutor$Worker@4c03ae59
```
When exiting paging state, we store a bookmark for each page subscription and wait until all callbacks are done.
I believe this may happen even when running in an IO thread, as long as singleThreadExecutor in AbstractJournalStorageManager uses a thread from the global server thread pool.


> Server stopped responding and killed itself while exiting paging state
> ----------------------------------------------------------------------
>
>                 Key: ARTEMIS-1700
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-1700
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.4.0
>            Reporter: Qihong Xu
>            Priority: Major
>         Attachments: artemis.log
>
>
> We are currently experiencing this error while running a stress test on Artemis.
>
> Basic configuration:
> 1 broker, 1 topic, pub-sub mode.
> Journal type = MAPPED.
> Threadpool max size = 60.
>
> In order to test the throughput of Artemis we use 300 producers and 300 consumers. However, we found that sometimes when Artemis exits paging state, it stops responding and kills itself. This situation happened on some specific servers.
>
> Details can be found in the attached dump file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
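
The concern raised in the comment above (a pool task blocking on a latch while the callback that releases the latch is queued on the same bounded pool) can be illustrated with a minimal Java sketch. It uses only `java.util.concurrent` and no Artemis classes; the class name `PoolStarvationSketch`, the method `cleanupCompletes`, and the pool size of 1 are assumptions chosen to make the saturation deterministic, and they are not taken from the Artemis code base or from the reporter's configuration. Whether this is the actual cause of ARTEMIS-1700 is not established here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: not Artemis code.
public class PoolStarvationSketch {

    /**
     * Runs a "cleanup" task on workerPool. The task hands a completion
     * callback to callbackPool and then blocks until the callback fires
     * or the timeout elapses. Returns true if the callback fired in time.
     */
    public static boolean cleanupCompletes(ExecutorService workerPool,
                                           ExecutorService callbackPool,
                                           long timeoutMillis) throws Exception {
        Future<Boolean> result = workerPool.submit(() -> {
            CountDownLatch done = new CountDownLatch(1);
            // The callback can only run if callbackPool has a free thread.
            callbackPool.execute(done::countDown);
            // Timed wait, similar in spirit to the TIMED_WAITING frame in the dump.
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        });
        return result.get();
    }

    public static void main(String[] args) throws Exception {
        // Case 1: the waiting task and its callback share one saturated pool.
        ExecutorService shared = Executors.newFixedThreadPool(1);
        boolean sharedOk = cleanupCompletes(shared, shared, 200);
        shared.shutdown();

        // Case 2: the callback runs on a dedicated pool.
        ExecutorService worker = Executors.newFixedThreadPool(1);
        ExecutorService callbacks = Executors.newFixedThreadPool(1);
        boolean separateOk = cleanupCompletes(worker, callbacks, 5000);
        worker.shutdown();
        callbacks.shutdown();

        System.out.println("shared pool completed in time:    " + sharedOk);
        System.out.println("separate pools completed in time: " + separateOk);
    }
}
```

In the shared case the only worker thread is held by the waiting task, so the callback stays queued until the wait times out. With a dedicated callback pool the latch is normally released promptly. A larger pool (such as the 60 threads mentioned in the issue) would need correspondingly more concurrent waiters to reach the same state.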