Return-Path: X-Original-To: apmail-activemq-users-archive@www.apache.org Delivered-To: apmail-activemq-users-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id AC11F4C1E for ; Mon, 11 Jul 2011 08:31:36 +0000 (UTC) Received: (qmail 6188 invoked by uid 500); 11 Jul 2011 08:31:36 -0000 Delivered-To: apmail-activemq-users-archive@activemq.apache.org Received: (qmail 5880 invoked by uid 500); 11 Jul 2011 08:31:24 -0000 Mailing-List: contact users-help@activemq.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: users@activemq.apache.org Delivered-To: mailing list users@activemq.apache.org Received: (qmail 5865 invoked by uid 99); 11 Jul 2011 08:31:21 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 11 Jul 2011 08:31:21 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=FREEMAIL_FROM,HTML_MESSAGE,NORMAL_HTTP_TO_IP,RCVD_IN_DNSWL_LOW,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL,WEIRD_PORT X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of james.mk.green@gmail.com designates 209.85.214.171 as permitted sender) Received: from [209.85.214.171] (HELO mail-iw0-f171.google.com) (209.85.214.171) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 11 Jul 2011 08:31:13 +0000 Received: by iwn34 with SMTP id 34so4440147iwn.2 for ; Mon, 11 Jul 2011 01:30:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=xu/T2sPEhGtnFa9mPrkfQYZ10K7ebRWPf0o0ynTr7vI=; b=VLP8Ryga6Lvk5iR7bv/xEBoeJxWAUO0hbhJkxhQcrGvmiXC9dvtlDxYi1hFyTXs0uB ksY6MnP38HuMDpPNQXTlPr5y2WeznOTNFL3pvopoWgTjPz2x+5r8VvTCV/9lihBnx5j6 EGbakfBFIiCSuNuAegb3ndo4ZZQCteoFoAYMo= MIME-Version: 1.0 Received: by 10.231.61.198 with SMTP id u6mr4133618ibh.160.1310373051568; Mon, 11 Jul 2011 01:30:51 -0700 (PDT) Received: by 10.231.34.205 with HTTP; Mon, 11 Jul 2011 01:30:51 -0700 (PDT) In-Reply-To: References: Date: Mon, 11 Jul 2011 09:30:51 +0100 Message-ID: Subject: Re: EOFException from Stomp - heap space exhausted after enabling GC on queues From: James Green To: users@activemq.apache.org Content-Type: multipart/alternative; boundary=0015177411a84310f304a7c70002 X-Virus-Checked: Checked by ClamAV on apache.org --0015177411a84310f304a7c70002 Content-Type: text/plain; charset=ISO-8859-1 Sorry for adding another reply but this was the console output as I restarted - in case it has any bearing on things (the JMX URL thing we know about): root@quarrel:/opt/activemq/conf# /etc/init.d/activemq restart INFO: Loading '/etc/default/activemq' INFO: Using java '/usr/bin/java' INFO: Loading '/etc/default/activemq' INFO: Using java '/usr/bin/java' INFO: changing to user 'activemq' to invoke java No directory, logging in with HOME=/ INFO: Waiting at least 30 seconds for regular process termination of pid '29986' : Java Runtime: Sun Microsystems Inc. 1.6.0_20 /usr/lib/jvm/java-6-openjdk/jre Heap sizes: current=502464k free=499842k max=502464k JVM args: -Xms512M -Xmx512M -Dorg.apache.activemq.UseDedicatedTaskRunner=true -Djava.util.logging.config.file=logging.properties -Dactivemq.classpath=/opt/activemq/conf; -Dactivemq.home=/opt/activemq -Dactivemq.base=/opt/activemq ACTIVEMQ_HOME: /opt/activemq ACTIVEMQ_BASE: /opt/activemq .Connecting to pid: 29986 ERROR: java.lang.IllegalArgumentException: JMX URL already specified. java.lang.IllegalArgumentException: JMX URL already specified. at org.apache.activemq.console.command.AbstractJmxCommand.handleOption(AbstractJmxCommand.java:282) at org.apache.activemq.console.command.ShutdownCommand.handleOption(ShutdownCommand.java:156) at org.apache.activemq.console.command.AbstractCommand.parseOptions(AbstractCommand.java:73) at org.apache.activemq.console.command.AbstractCommand.execute(AbstractCommand.java:45) at org.apache.activemq.console.command.AbstractJmxCommand.execute(AbstractJmxCommand.java:316) at org.apache.activemq.console.command.ShellCommand.runTask(ShellCommand.java:143) at org.apache.activemq.console.command.AbstractCommand.execute(AbstractCommand.java:57) at org.apache.activemq.console.command.ShellCommand.main(ShellCommand.java:85) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.activemq.console.Main.runTaskClass(Main.java:251) at org.apache.activemq.console.Main.main(Main.java:107) .ERROR: java.lang.RuntimeException: Failed to execute stop task. Reason: java.rmi.UnmarshalException: error unmarshalling return; nested exception is: java.io.EOFException java.lang.RuntimeException: Failed to execute stop task. Reason: java.rmi.UnmarshalException: error unmarshalling return; nested exception is: java.io.EOFException at org.apache.activemq.console.command.ShutdownCommand.runTask(ShutdownCommand.java:107) at org.apache.activemq.console.command.AbstractCommand.execute(AbstractCommand.java:57) at org.apache.activemq.console.command.AbstractJmxCommand.execute(AbstractJmxCommand.java:316) at org.apache.activemq.console.command.ShellCommand.runTask(ShellCommand.java:143) at org.apache.activemq.console.command.AbstractCommand.execute(AbstractCommand.java:57) at org.apache.activemq.console.command.ShellCommand.main(ShellCommand.java:85) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.activemq.console.Main.runTaskClass(Main.java:251) at org.apache.activemq.console.Main.main(Main.java:107) ERROR: java.lang.Exception: java.rmi.UnmarshalException: error unmarshalling return; nested exception is: java.io.EOFException java.lang.Exception: java.rmi.UnmarshalException: error unmarshalling return; nested exception is: java.io.EOFException at org.apache.activemq.console.command.ShutdownCommand.runTask(ShutdownCommand.java:108) at org.apache.activemq.console.command.AbstractCommand.execute(AbstractCommand.java:57) at org.apache.activemq.console.command.AbstractJmxCommand.execute(AbstractJmxCommand.java:316) at org.apache.activemq.console.command.ShellCommand.runTask(ShellCommand.java:143) at org.apache.activemq.console.command.AbstractCommand.execute(AbstractCommand.java:57) at org.apache.activemq.console.command.ShellCommand.main(ShellCommand.java:85) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.activemq.console.Main.runTaskClass(Main.java:251) at org.apache.activemq.console.Main.main(Main.java:107) Caused by: java.rmi.UnmarshalException: error unmarshalling return; nested exception is: java.io.EOFException at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:191) at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl_Stub.getConnectionId(Unknown Source) at javax.management.remote.rmi.RMIConnector.getConnectionId(RMIConnector.java:357) at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:319) at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:267) at org.apache.activemq.console.command.AbstractJmxCommand.createJmxConnector(AbstractJmxCommand.java:212) at org.apache.activemq.console.command.AbstractJmxCommand.createJmxConnection(AbstractJmxCommand.java:237) at org.apache.activemq.console.command.ShutdownCommand.runTask(ShutdownCommand.java:71) ... 11 more Caused by: java.io.EOFException at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2570) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368) at sun.rmi.server.UnicastRef.unmarshalValue(UnicastRef.java:324) at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:173) ... 19 more ........................... INFO: Regular shutdown not successful, sending SIGKILL to process with pid '29986' INFO: Loading '/etc/default/activemq' INFO: Using java '/usr/bin/java' ActiveMQ not running INFO: Loading '/etc/default/activemq' INFO: Using java '/usr/bin/java' INFO: Starting - inspect logfiles specified in logging.properties and log4j.properties to get details INFO: changing to user 'activemq' to invoke java No directory, logging in with HOME=/ INFO: pidfile created : '/opt/activemq/data/activemq.pid' (pid '26049') INFO: Loading '/etc/default/activemq' INFO: Using java '/usr/bin/java' ActiveMQ is running (pid '26049') On 11 July 2011 09:28, James Green wrote: > Gary, > > This occurred over the weekend again. Given we didn't have customers at > desks there was more time to gain logs. > > Netstat shows an awful lot of: > > tcp 0 0 127.0.0.1:61612 127.0.0.1:58428 > SYN_RECV > tcp 0 0 127.0.0.1:61612 127.0.0.1:57881 > SYN_RECV > tcp 0 0 127.0.0.1:61612 127.0.0.1:56600 > SYN_RECV > tcp 0 0 127.0.0.1:61612 127.0.0.1:59161 > SYN_RECV > tcp 0 40 127.0.0.1:58048 127.0.0.1:61612 > FIN_WAIT1 > tcp 0 40 127.0.0.1:57805 127.0.0.1:61612 > FIN_WAIT1 > tcp 0 40 127.0.0.1:58907 127.0.0.1:61612 > FIN_WAIT1 > tcp 0 40 127.0.0.1:58964 127.0.0.1:61612 > FIN_WAIT1 > > 61612 is our STOMP connector, which at this point is no longer accepting > connections yet the web console is alive and kicking. All our current > clients use STOMP. > > According to the logs several hours beforehand the following occurred: > > 2011-07-10 00:59:36,673 | INFO | Outbound.Account.22373 Inactive for > longer than 60000 ms - removing ... | > org.apache.activemq.broker.region.Queue | ActiveMQ Broker[zorin] Scheduler > 2011-07-10 01:00:01,590 | INFO | Outbound.Account.20821 Inactive for > longer than 60000 ms - removing ... | > org.apache.activemq.broker.region.Queue | ActiveMQ Broker[zorin] Scheduler > 2011-07-10 01:00:01,687 | INFO | Outbound.Account.22686 Inactive for > longer than 60000 ms - removing ... | > org.apache.activemq.broker.region.Queue | ActiveMQ Broker[zorin] Scheduler > 2011-07-10 01:00:14,411 | INFO | Transport failed: java.io.EOFException | > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > Transport: tcp:///127.0.0.1:53766 > 2011-07-10 01:00:14,411 | INFO | Transport failed: java.io.EOFException | > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > Transport: tcp:///127.0.0.1:53768 > 2011-07-10 01:00:14,411 | INFO | Transport failed: java.io.EOFException | > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > Transport: tcp:///127.0.0.1:53769 > > Many more of these are logged then: > > 2011-07-10 01:00:59,451 | WARN | Async error occurred: > java.lang.OutOfMemoryError: Java heap space | > org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: > tcp:///127.0.0.1:54121 > > java.lang.OutOfMemoryError: Java heap space > 2011-07-10 01:00:59,451 | WARN | Exception occurred processing: > null: java.lang.OutOfMemoryError: Java heap space | > org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ Connection > Dispatcher: /127.0.0.1:54121 > 2011-07-10 01:00:59,451 | WARN | Async error occurred: > java.lang.IllegalStateException: Cannot remove a consumer that had not been > registered: ID:quarrel-46206-1309767925003-2:522192:-1:1 | > org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: > tcp:///127.0.0.1:54121 > java.lang.IllegalStateException: Cannot remove a consumer that had not been > registered: ID:quarrel-46206-1309767925003-2:522192:-1:1 > at > org.apache.activemq.broker.TransportConnection.processRemoveConsumer(TransportConnection.java:571) > at org.apache.activemq.command.RemoveInfo.visit(RemoveInfo.java:76) > at > org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:306) > at > org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:179) > at > org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.java:69) > at > org.apache.activemq.transport.stomp.StompTransportFilter.sendToActiveMQ(StompTransportFilter.java:81) > at > org.apache.activemq.transport.stomp.ProtocolConverter.sendToActiveMQ(ProtocolConverter.java:140) > at > org.apache.activemq.transport.stomp.ProtocolConverter.onStompUnsubscribe(ProtocolConverter.java:468) > at > org.apache.activemq.transport.stomp.ProtocolConverter.onStompCommand(ProtocolConverter.java:190) > at > org.apache.activemq.transport.stomp.StompTransportFilter.onCommand(StompTransportFilter.java:70) > at > org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83) > at > org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:220) > at > org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202) > at java.lang.Thread.run(Thread.java:636) > 2011-07-10 01:00:59,451 | WARN | Exception occurred processing: > null: java.lang.IllegalStateException: Cannot remove a consumer that had > not been registered: ID:quarrel-46206-1309767925003-2:522192:-1:1 | > org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ Connection > Dispatcher: /127.0.0.1:54121 > 2011-07-10 01:00:59,453 | INFO | Transport failed: > java.net.SocketException: Broken pipe | > org.apache.activemq.broker.TransportConnection.Transport | Async Exception > Handler > > Repeatedly. Then: > > 2011-07-10 01:08:38,431 | WARN | Exception occurred processing: > null: java.lang.IllegalStateException: Timer already cancelled. | > org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ Connection > Dispatcher: /127.0.0.1:53858 > 2011-07-10 01:08:38,432 | INFO | Transport failed: java.io.EOFException | > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > Transport: tcp:///127.0.0.1:54143 > 2011-07-10 01:08:38,432 | INFO | Transport failed: java.io.EOFException | > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > Transport: tcp:///127.0.0.1:54139 > > Repeatedly, then my ssl:// bridge to our hub machine restarted itself. At > this point STOMP is essentially dead. > > I seem to remember the last time that the logs mentioned this stuff > occuring around 0100. May be co-incidence of course. Unfortunately I have no > idea what could be causing this - the only two recent changes where: > > 1. Skip DLQ for expiring non-persistent messages > 2. Enable GC for queues. > > Given the severity I have disabled the GC based on your thoughts last time > - over the weekend we would expect the GC to kick in more then "usual". > Would appreciate any further insight as to whether this deserved a new issue > on it's own. > > James > > > On 4 July 2011 12:31, Gary Tully wrote: > >> It is just a hunch based on the root cause of that issue, a removed >> destination reference remaining in the connection contex. >> >> Feel free to open a separate issue, to be convinced we need a test case. >> >> On 4 July 2011 11:02, James Green wrote: >> > How convinced are you? >> > >> > I'm minded to file this as a seperate bug given the heap exceptions >> being >> > thrown and the Stomp connectivity that dies completely. What is not >> > explained in the bug report you list are the "other effects" of the bug >> > Timothy has fixed and whether there may be any overlap in the behaviours >> > being observed. >> > >> > James >> > >> > On 4 July 2011 10:44, Gary Tully wrote: >> > >> >> this looks like a bug, but I think it may be an instance of >> >> https://issues.apache.org/jira/browse/AMQ-2524 which has just been >> >> resolved so it is available in the most current 5.6-SNAPSHOT >> >> >> >> On 4 July 2011 09:31, James Green wrote: >> >> > Getting these: >> >> > >> >> > 2011-07-03 07:41:54,330 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:53546 >> >> > 2011-07-03 07:49:05,893 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:42055 >> >> > 2011-07-03 07:49:05,894 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:42045 >> >> > 2011-07-03 08:06:55,985 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:43304 >> >> > 2011-07-03 08:06:55,987 | INFO | Outbound.Account.22556 Inactive for >> >> longer >> >> > than 60000 ms - removing ... | >> org.apache.activemq.broker.region.Queue | >> >> > ActiveMQ Broker[zorin] Scheduler >> >> > 2011-07-03 08:06:59,566 | WARN | Async error occurred: >> >> > java.lang.OutOfMemoryError: Java heap space | >> >> > org.apache.activemq.broker.TransportConnection.Service | ActiveMQ >> >> Transport: >> >> > tcp:///127.0.0.1:42987 >> >> > 2011-07-03 08:06:59,566 | INFO | Outbound.Account.21011 Inactive for >> >> longer >> >> > than 60000 ms - removing ... | >> org.apache.activemq.broker.region.Queue | >> >> > ActiveMQ Broker[zorin] Scheduler >> >> > 2011-07-03 08:06:59,567 | WARN | Exception occurred processing: >> >> > null: java.lang.OutOfMemoryError: Java heap space | >> >> > org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ >> >> Connection >> >> > Dispatcher: /127.0.0.1:42987 >> >> > >> >> > 2011-07-03 09:00:51,867 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:57612 >> >> > 2011-07-03 09:00:51,868 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:57634 >> >> > 2011-07-03 09:00:51,868 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:57610 >> >> > 2011-07-03 09:00:54,148 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:57643 >> >> > 2011-07-03 09:00:56,406 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:57644 >> >> > 2011-07-03 09:00:57,535 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:57646 >> >> > 2011-07-03 09:00:57,536 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:57659 >> >> > 2011-07-03 09:00:57,536 | INFO | Transport failed: >> java.io.EOFException >> >> | >> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ >> >> > Transport: tcp:///127.0.0.1:57676 >> >> > 2011-07-03 13:32:36,995 | INFO | Outbound.Account.200237 Inactive >> for >> >> > longer than 60000 ms - removing ... | >> >> > org.apache.activemq.broker.region.Queue | ActiveMQ Broker[zorin] >> >> Scheduler >> >> > 2011-07-03 23:58:46,994 | INFO | account_events Inactive for longer >> than >> >> > 60000 ms - removing ... | org.apache.activemq.broker.region.Queue | >> >> ActiveMQ >> >> > Broker[zorin] Scheduler >> >> > >> >> > The web console responds fine, the Stomp connector throws the >> exception. >> >> The >> >> > AMQ config shows -Xms512M -Xmx512M >> >> > -Dorg.apache.activemq.UseDedicatedTaskRunner=true >> >> > >> >> > This system has been running without incident for several months. Two >> >> weeks >> >> > ago I changed the config to enable gcInactiveDestinations="true" on >> >> queues. >> >> > Since then we've had two Stomp "crashes" inside one week. >> >> > >> >> > Does this GC process require much more RAM or something? Or am I >> likely >> >> > looking at a bug? >> >> > >> >> > Thanks, >> >> > >> >> > James >> >> > >> >> >> >> >> >> >> >> -- >> >> http://fusesource.com >> >> http://blog.garytully.com >> >> >> > >> >> >> >> -- >> http://fusesource.com >> http://blog.garytully.com >> > > --0015177411a84310f304a7c70002--