activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chad Kelly (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (AMQ-5155) Heartbeat fails in STOMP over WebSockets
Date Sat, 18 Apr 2015 18:32:00 GMT

    [ https://issues.apache.org/jira/browse/AMQ-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501051#comment-14501051
] 

Chad Kelly edited comment on AMQ-5155 at 4/18/15 6:31 PM:
----------------------------------------------------------

REVISION
Can someone confirm this bug occurs in their environment? If so, this bug relates to AMQ-5155
and AMQ-4674. I recommend that the ActiveMQ WebSocket example should work as is, regardless
of using parameters like transport.hbGracePeriodMultiplier.

Testing up to 12 continuous hours suggest that Fedora 21 with Chrome 39.0.2171.95 (x64) and
Chrome 41.0.2272.118 (x32) works correctly for all test cases. Wireshark (via linux on localhost)
shows that WebSocket/ActiveMQ always sends its “\n” (PONG) with a 0.1+ second gap between
Chrome sending its “\n” (PING). 

Fedora 21 with Firefox 34 (x64) and Firefox 37 (x32) ONLY works ~25% of the time. ~75% of
the time Firefox FAILS after 50 to 500 PINGs. Wireshark shows correct functionality when the
gap between WebSocket/ActiveMQ's “\n” (PONG) and Firefox's “\n” (PING) is larger than
+0.1 second. However, when the time gap between the PONG and PING is small, InactivityIOException
exception can occur ~75% of the time. For example, in the attached PCAP file titled “firefox_bug_example1.pcapng,
frame number 97 and 99 crosses over in PING/PONG timing.

Opening Web Socket...
Web Socket Opened...
>>> CONNECT
accept-version:1.1,1.0
heart-beat:10000,10000
login:guesttgg
passcode:guest

<<< CONNECTED
server:ActiveMQ/5.12-SNAPSHOT
heart-beat:10000,10000
session:ID:c1-39862-1429234022608-4:2
version:1.1

connected to server ActiveMQ/5.12-SNAPSHOT
send PING every 10000ms
check PONG every 10000ms
connected to Stomp
>>> SUBSCRIBE
id:sub-0
destination:/queue/test (same behavor for /topic/test)

<<< PONG
>>> PING
... 395 times this case, but 50-500 times in general
<<< PONG
>>> PING
<<< PONG
>>> PING
>>> PING
>>> PING
did not receive server activity for the last 20003ms
Whoops! Lost connection to ws://localhost:61614/stomp


In data/activemq.log
2015-04-17 15:49:12,750 | WARN  | Transport Connection to: StompSocket_919027643 failed: org.apache.activemq.transport.InactivityIOException:
Channel was inactive for too (>10000) long: StompSocket_919027643 | org.apache.activemq.broker.TransportConnection.Transport
| ActiveMQ InactivityMonitor Worker



was (Author: chadkelly):
Can someone confirm this bug occurs in their environment? If so, this bug relates to AMQ-5155
and AMQ-4674. I recommend that the ActiveMQ STOMP example should work as is, regardless of
using parameters like transport.hbGracePeriodMultiplier.

Testing up to 12 continuous hours suggest that Fedora 21 with Chrome 39.0.2171.95 (x64) and
Chrome 41.0.2272.118 (x32) works correctly for all test cases. Wireshark (via linux on localhost)
shows that WebSocket/ActiveMQ always sends its “\n” (PONG) followed by Chrome sending
its “\n” (PING). Chrome's 10 second timer tends to increase very slightly over time by
~+10.00453 per PING.

Fedora 21 with Firefox 34 (x64) and Firefox 37 (x32) ONLY works ~25% of the time. ~75% of
the time Firefox FAILS after 50 to 500 PINGs. Firefox will transmit at ~10 second interval
like WebSocket/ActiveMQ. Wireshark shows that WebSocket/ActiveMQ continuously works when it
sends its “\n” (PONG) followed by Firefox sending its “\n” (PING). However, Wireshark
show that when Firefox sends its “\n” (PING) before WebSocket/ActiveMQ sends it “\n”
(PONG), the InactivityIOException exception will occur. For example, in the attached PCAP
file titled “firefox_bug_example1.pcapng, frame number 97 has Firefox's “\n”(PING) first
followed by frame number 99 with WebSocket/ActiveMQ “\n” (PONG), and then within 10 seconds,
an InactivityIOException exception occurs.

Opening Web Socket...
Web Socket Opened...
>>> CONNECT
accept-version:1.1,1.0
heart-beat:10000,10000
login:guesttgg
passcode:guest

<<< CONNECTED
server:ActiveMQ/5.12-SNAPSHOT
heart-beat:10000,10000
session:ID:c1-39862-1429234022608-4:2
version:1.1

connected to server ActiveMQ/5.12-SNAPSHOT
send PING every 10000ms
check PONG every 10000ms
connected to Stomp
>>> SUBSCRIBE
id:sub-0
destination:/queue/test (same behavor for /topic/test)

<<< PONG
>>> PING
... 395 times this case, but 50-500 times in general
<<< PONG
>>> PING
<<< PONG
>>> PING
>>> PING
>>> PING
did not receive server activity for the last 20003ms
Whoops! Lost connection to ws://localhost:61614/stomp


In data/activemq.log
2015-04-17 15:49:12,750 | WARN  | Transport Connection to: StompSocket_919027643 failed: org.apache.activemq.transport.InactivityIOException:
Channel was inactive for too (>10000) long: StompSocket_919027643 | org.apache.activemq.broker.TransportConnection.Transport
| ActiveMQ InactivityMonitor Worker


> Heartbeat fails in STOMP over WebSockets
> ----------------------------------------
>
>                 Key: AMQ-5155
>                 URL: https://issues.apache.org/jira/browse/AMQ-5155
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.9.1, 5.10.0
>            Reporter: Arjan van den Berg
>            Assignee: Timothy Bish
>            Priority: Minor
>             Fix For: 5.12.0
>
>         Attachments: AMQ-5155-jetty8.diff, AMQ-5155-jetty9.diff, firefox_bug_example1.pcapng
>
>
> From AMQ-4740:
> I receive the following error after establishing a connection and heartbeat through stomp.js.
This seems to occur after the 'PING' is sent.
> ---------- stomp.js output
> <<< CONNECTED
> heart-beat:10000,10000
> session:ID:localhost.localdomain-45596-1396530920609-2:2
> server:ActiveMQ/5.10-SNAPSHOT
> version:1.1
> send PING every 10000ms 
> check PONG every 10000ms 
> <<< PONG 
> >>> PING 
> did not receive server activity for the last 20005ms 
> Whoops! Lost connection to ws://172.16.99.73:61614/stomp
> --------------------- activemq console ---------------
> WARN | Transport Connection to: StompSocket_19548821 failed: java.io.IOException
> Exception in thread "ActiveMQ InactivityMonitor Worker" java.lang.NullPointerException
> at org.apache.activemq.transport.AbstractInactivityMonitor.onException(AbstractInactivityMonitor.java:314)
> at org.apache.activemq.transport.AbstractInactivityMonitor$4.run(AbstractInactivityMonitor.java:215)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> WARN | Transport Connection to: StompSocket_19548821 failed: java.io.IOException
> Exception in thread "ActiveMQ InactivityMonitor Worker" java.lang.NullPointerException
> at org.apache.activemq.transport.AbstractInactivityMonitor.onException(AbstractInactivityMonitor.java:314)
> at org.apache.activemq.transport.AbstractInactivityMonitor$4.run(AbstractInactivityMonitor.java:215)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> For me it looks as if the StompInactivityMonitor is delivering its events to the wrong
Transport, i.e. it needs a "narrow()" when setting it up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message