qpid-users mailing list archives

From Tom M <td.hom...@gmail.com>
Subject Re: problem with qpid heartbeats when sending msgs with size over 1KB
Date Wed, 07 Dec 2011 01:55:11 GMT

We are having a problem with our MRG (qpid) system:

* When sending messages of 1600 bytes, the connection the client uses for
sending does not detect, via heartbeat timeout, that the connection to the
broker host has been lost.

+ We are using the C++ qpid client 0.7 and qpidd 0.7 (Linux 2.6 x86_64 on
both client and broker hosts), with an Ethernet (TCP/IP) connection between
the hosts.

    + For this connection we set the heartbeat (in seconds) on our
ConnectionSettings object: connectionSettings.heartbeat = 8

    + We simulate a system failure by pulling the Ethernet cable to the
broker host.

    + The client catches the connection-closed exception only after many
minutes (6 to 20 min). I'm guessing this is due to the TCP timeout and not
to the missed heartbeats.

    + With exactly the same client application, when sending messages of
200 bytes, we do get the qpid exception indicating that the connection has
closed (catch TransportFailure Exception: connection closed) within 16
seconds. There were no other changes between the two test cases: we only
expanded the string in the body of the message, and the client sent 1
message per second in both cases.
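
For reference, the 16 seconds seen in the 200-byte case matches two missed
heartbeat intervals of 8 seconds. Below is a minimal sketch of that
arithmetic as a toy watchdog. The class HeartbeatWatchdog and its methods are
hypothetical names made up for illustration, and the "two silent intervals"
rule is an assumption inferred from the 16-second observation, not something
taken from the qpid sources.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a heartbeat watchdog. Times are plain seconds so the
// arithmetic is easy to follow. This is NOT qpid code.
class HeartbeatWatchdog {
public:
    explicit HeartbeatWatchdog(std::uint32_t intervalSecs)
        : interval_(intervalSecs), lastSeen_(0) {
        assert(intervalSecs > 0);
    }

    // Record that a frame (heartbeat or data) arrived from the peer.
    void onTrafficReceived(std::uint32_t nowSecs) { lastSeen_ = nowSecs; }

    // ASSUMPTION: the peer is declared dead after two silent intervals.
    std::uint32_t deadline() const { return lastSeen_ + 2 * interval_; }

    // True once the deadline has been reached without any traffic.
    bool expired(std::uint32_t nowSecs) const { return nowSecs >= deadline(); }

private:
    std::uint32_t interval_;  // heartbeat interval in seconds
    std::uint32_t lastSeen_;  // time of the last frame from the peer
};
```

With an interval of 8 and the last frame seen at t = 0, expired() first
returns true at t = 16, which is the behaviour we see with small messages.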

* Is this a known problem with qpid 0.7?

* Is there a patch to fix this for qpid 0.7?

* Has this problem already been fixed in later releases?

NOTE: We have already deployed qpid 0.7 in our system, and we will not be
able to upgrade to a newer full release for many months.

I'm wondering whether the connection gets blocked on the first TCP packet
of a multi-packet message, such that heartbeat detection is disabled until
the full message has been sent. If the multi-packet message can never
complete (because the socket is broken), the heartbeat logic would then stay
disabled indefinitely.
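
To make the guess concrete, here is a toy model of the suspected failure
mode. Everything in it is hypothetical: the struct SuspectedConnection, its
fields, and the rule that the timeout check is skipped while a write is
pending are only my description of the behaviour I suspect, not code or
logic taken from qpid.

```cpp
#include <cstdint>

// Toy model of the suspected failure mode. NOT taken from qpid sources.
struct SuspectedConnection {
    std::uint32_t heartbeatSecs;  // e.g. 8, as in connectionSettings.heartbeat
    std::uint32_t lastSeenSecs;   // time of the last frame from the broker
    bool writeInProgress;         // true while a multi-packet message is unsent

    // HYPOTHESIS: the timeout check is skipped while a write is pending,
    // so a write that can never finish hides the missed heartbeats.
    bool detectsLoss(std::uint32_t nowSecs) const {
        if (writeInProgress) {
            return false;
        }
        // ASSUMPTION: loss is declared after two silent heartbeat intervals.
        return nowSecs - lastSeenSecs >= 2 * heartbeatSecs;
    }
};
```

In this model a connection with writeInProgress set never reports the loss,
however long the broker stays silent, while one with no pending write
reports it after 16 seconds. That would match what we see with 1600-byte
and 200-byte messages respectively.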


Tom Maggio
