From: "Lionel Cons (JIRA)"
To: commits@activemq.apache.org
Reply-To: dev@activemq.apache.org
Date: Tue, 14 Aug 2012 18:05:38 +1100 (NCT)
Message-ID: <1209336942.6385.1344927938173.JavaMail.jiratomcat@arcas>
Subject: [jira] [Created] (APLO-241) Apollo becomes unresponsive under stress

Lionel Cons created APLO-241:
--------------------------------

             Summary: Apollo becomes unresponsive under stress
                 Key: APLO-241
                 URL: https://issues.apache.org/jira/browse/APLO-241
             Project: ActiveMQ Apollo
          Issue Type: Bug
         Environment: apollo-99-trunk-20120813.171747-82
            Reporter: Lionel Cons

When trying to reproduce APLO-238, I found another problem :-(

I ran stomp-benchmark with the attached scenario to simulate one topic consumer with many producers.

As expected, stomp-benchmark reported many errors like:

  java.net.ConnectException: Connection timed out
  java.io.IOException: Connection reset by peer

However, according to netstat, more than 10k connections have been established.

stomp-benchmark eventually stopped, with some results:

c_c1 samples: [ 1450,140,261,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]

p_p1 samples: [ 75052,447310,430373,431496,406670,436637,455825,436305,451396,449173,408920,491221,527663,556973,605508,580931,616605,576667,589194,606230,567179,426264,256924,152997,82039,42061,23405,11416,5965,3874,2660,1834,1740,1298,1026,660,742,639,533,496,400,175,124,79,124,124,123,124,123,87,0,0,0,0,0,0,0,0,0,0 ]

e_p1 samples: [ 1832,76,0,0,0,3,664,1721,1,1,0,3,19,659,1704,0,5,8,10,670,910,785,4,10,8,26,654,1138,555,12,7,12,24,656,1674,13,11,8,17,20,655,1670,11,17,14,14,29,651,1664,15,16,16,16,669,901,772,19,16,15,25 ]

p_p2 samples: [ 68831,398643,391879,389710,365791,393488,406712,382222,398429,407385,370296,439194,465552,496801,507263,412671,328124,216992,123357,77492,43132,19195,10099,6064,4780,2969,1255,766,886,879,849,677,995,932,860,486,551,552,551,415,247,149,110,117,330,330,330,294,419,444,394,441,196,220,158,217,221,148,111,97 ]

e_p2 samples: [ 1704,82,0,0,0,1,767,1613,1,2,0,8,24,767,1601,0,1,1,12,786,782,814,7,3,8,28,770,984,604,7,10,8,24,773,1566,14,11,16,8,27,775,1557,17,8,15,10,32,775,1547,15,10,18,15,786,767,789,19,11,25,35 ]

However, after the end of the test, Apollo does not respond anymore. Its REST API cannot be contacted (read timeout) and it cannot be stopped via the service script, only kill -9 works.

Strangely, it's only using 100% of CPU (on multi-core) and 35% of memory.

--
This message is automatically generated by JIRA.