qpid-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Adel Boutros <adelbout...@live.com>
Subject RE: [Java Broker- 6.0.1] AMQP random errors when sending messages using Proton-c 0.12.2
Date Wed, 10 Aug 2016 15:03:22 GMT
Done under https://issues.apache.org/jira/browse/PROTON-1278.
I have attached a reproducer to the issue.

Do you think reducing the MSL could be a temporary solution?

If I am interested in also trying to fix the issue, could you please point to where in the
code I should look?

Regards,
Adel

> Subject: Re: [Java Broker- 6.0.1] AMQP random errors when sending messages using Proton-c
0.12.2
> From: aconway@redhat.com
> To: users@qpid.apache.org
> Date: Wed, 10 Aug 2016 15:50:10 +0100
> 
> On Wed, 2016-08-10 at 16:06 +0200, Adel Boutros wrote:
> > Hello guys,
> > 
> > We found the issue. It is actually related to the TCP protocol. 
> > Using Wireshark, I can confirm:
> > * At AMQP level, all required messages are being sent including the
> > "detach" and "close.
> > * At TCP level, all required messages are being sent including the
> > "FIN" and "ACK" in both ways.
> > 
> > At that point, the connection is closed server-side but is in
> > "TIME_WAIT" on client-side.
> 
> That sounds to me like a bug on the client side, possibly in the
> library as your code is closing the connection. A connection stuck in
> TIME_WAIT is definitely not properly closed. Can you attach a simple
> reproducer to a JIRA for further investigation?
> 
> > What do you think about having a connection pool so that when we call
> > e.container().connect() we actually reuse an available connection
> > instead of opening a new one.
> 
> Re-using connections and senders is a good idea for performance - you
> won't get good throughput with the overhead of making a connection for
> every message. However, creating lots of short-lived connections should
> work without leaking so I think there is a bug to fix here.
>  
> > Regards,
> > Adel
> > 
> > [1] http://www.tcpipguide.com/free/t_TCPConnectionTermination-2.htm
> > [2] http://hea-www.harvard.edu/~fine/Tech/addrinuse.html
> > 
> > > 
> > > Subject: Re: [Java Broker- 6.0.1] AMQP random errors when sending
> > > messages using Proton-c 0.12.2
> > > From: aconway@redhat.com
> > > To: users@qpid.apache.org
> > > Date: Tue, 9 Aug 2016 17:44:42 +0100
> > > 
> > > On Tue, 2016-08-09 at 17:41 +0200, Adel Boutros wrote:
> > > > 
> > > > You were right. Using "netstat -t", I found I had more than 17
> > > > 000
> > > > ports opened by the test and the count was rising with time.
> > > > 
> > > > I will try to find out why. For your info, we set the timeout to
> > > > 0 so
> > > > I don't believe it is the issue.
> > > 
> > > The timeout was a pretty wild guess :)
> > > 
> > > IMO your close() code *should* close the connection so some
> > > speculations to investigate:
> > > 
> > > 1. it is never called - are you sure you are getting credit and no
> > > exceptions are thrown from sender::send() or sender::close()?
> > > 2. it is called but the close frame is never sent due to some
> > > proton
> > > bug, so the broker never closes.
> > > 3. the closed frame is sent, the broker closes but somehow the
> > > client
> > > end is keeping the connection open due to some proton bug.
> > > 4. something else...
> > > 
> > > PN_TRACE_FRM=1 should answer 1, if the right events are generated
> > > it
> > > was called.
> > > wireshark or broker-side logs should answer 2.
> > > Netstat should be able to answer 3. 
> > > Good luck with 4.
> > > 
> > > 
> > > 
> > > > 
> > > > Adel
> > > > 
> > > > > 
> > > > > 
> > > > > Subject: Re: [Java Broker- 6.0.1] AMQP random errors when
> > > > > sending
> > > > > messages using Proton-c 0.12.2
> > > > > From: aconway@redhat.com
> > > > > To: users@qpid.apache.org
> > > > > Date: Tue, 9 Aug 2016 14:22:33 +0100
> > > > > 
> > > > > On Tue, 2016-08-09 at 11:59 +0200, Adel Boutros wrote:
> > > > > > 
> > > > > > 
> > > > > > Hello Keith,
> > > > > > 
> > > > > > What's weird is I am closing the connection after each
> > > > > > message
> > > > > > send. 
> > > > > > 
> > > > > > Here is the main code:
> > > > > > try
> > > > > > {
> > > > > >    SimpleSenderHandler* simpleSenderHandler = new
> > > > > > SimpleSenderHandler(m_url, m_destination, m_message,
> > > > > > m_timeout);
> > > > > >    proton::container c(simpleSenderHandler);
> > > > > >    c.run();
> > > > > > }  catch (const proton::error& e)
> > > > > > {
> > > > > >    m_handler->signalErrorReceived(e.what());
> > > > > > }
> > > > > > 
> > > > > > Here is my proton::handler implementation:
> > > > > > 
> > > > > > SimpleSenderHandler::SimpleSenderHandler(const proton::url&
> > > > > > url,
> > > > > > const std::string& destination, const proton::message&
> > > > > > message,
> > > > > > int
> > > > > > timeout)
> > > > > > : m_url(url), m_destination(destination),
> > > > > > m_message(message),m_timeout(timeout)
> > > > > > {}
> > > > > > 
> > > > > > void SimpleSenderHandler::on_start(proton::event &e)
> > > > > > {
> > > > > >    proton::duration duration(m_timeout);
> > > > > >    proton::connection conn = e.container().connect(m_url,
> > > > > > proton::connection_options().idle_timeout(duration));
> > > > > >    conn.open_sender(m_destination);
> > > > > > }
> > > > > > 
> > > > > > void SimpleSenderHandler::on_sendable(proton::event &e)
> > > > > > {
> > > > > >    e.sender().send(m_message);
> > > > > >    e.sender().close();
> > > > > >    e.connection().close();
> > > > > > }
> > > > > > 
> > > > > > Isn't the above code enough to make sure the connection is
> > > > > > closed
> > > > > > after each send?
> > > > > 
> > > > > It should be, but your symptoms sound a lot like they are not
> > > > > fully
> > > > > closed and you are running out of ephemeral ports (similar to
> > > > > http:
> > > > > //st
> > > > > ackoverflow.com/questions/26019164/too-many-time-wait-
> > > > > connections-
> > > > > getting-cannot-assign-requested-address)
> > > > > 
> > > > > Some things to try:
> > > > > 
> > > > > - try dropping the idle_timeout setting. It shouldn't cause a
> > > > > problem
> > > > > but since connections seem to be hanging around, I speculate
> > > > > wildly
> > > > > that a client or broker bug related to the idle timeout might
> > > > > be
> > > > > delaying the close.
> > > > > 
> > > > > - Follow Keith's suggestions on checking broker management
> > > > > stats
> > > > > and
> > > > > run `netstat -t` to see if the connections really are closed at
> > > > > both
> > > > > ends or are stuck open, possibly on the broker end in
> > > > > TIME_WAIT.
> > > > > 
> > > > > - run wireshark to see if your client is really sending the
> > > > > AMQP
> > > > > close
> > > > > frames in case a client-side bug is preventing the close from
> > > > > completing even though your code is correct. The PN_TRACE_FRM=1
> > > > > output
> > > > > only proves that proton is processing the close event, it is
> > > > > still
> > > > > possible that it didn't make it to the wire.
> > > > > 
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > Adel
> > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > From: keith.wall@gmail.com
> > > > > > > Date: Tue, 9 Aug 2016 08:40:48 +0100
> > > > > > > Subject: Re: [Java Broker- 6.0.1] AMQP random errors when
> > > > > > > sending
> > > > > > > messages using Proton-c 0.12.2
> > > > > > > To: users@qpid.apache.org
> > > > > > > 
> > > > > > > Hi Adel
> > > > > > > 
> > > > > > > On 8 August 2016 at 19:11, Adel Boutros <adelboutros@live.c
> > > > > > > om>
> > > > > > > wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Hello Alain, Keith,
> > > > > > > > 
> > > > > > > > I updated the Java Broker to 6.0.4 and re-run my test
> > > > > > > > case 5
> > > > > > > > times.
> > > > > > > > The good news is that the "AMQP header mismatch" error
> > > > > > > > has
> > > > > > > > disappeared.
> > > > > > > 
> > > > > > > Great.
> > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > The bad news is every time I ran my test, I got the
> > > > > > > > "on_transport_error: proton:io: connect: Cannot assign
> > > > > > > > requested
> > > > > > > > address" error.
> > > > > > > > 
> > > > > > > > I get the error after sending about 30 000 msgs. At
each
> > > > > > > > send, I
> > > > > > > > am creating a new container which means new connection
> > > > > > > > every
> > > > > > > > time.
> > > > > > > 
> > > > > > > That sounds suspiciously like TCP/IP connections are not
> > > > > > > being
> > > > > > > closed.
> > > > > > >   Check your client coding/configuration to ensure that
the
> > > > > > > connections are being closed.  You can confirm that
> > > > > > > connections
> > > > > > > are
> > > > > > > being closed by checking the connections associated with
> > > > > > > the
> > > > > > > Virtualhost in the Web Management Console (you will see
> > > > > > > them
> > > > > > > disappear
> > > > > > > from the table as they are closed, and check confirm all
is
> > > > > > > well
> > > > > > > under
> > > > > > > the bonnet with netstat/lsof.
> > > > > > > 
> > > > > > > Hope this helps.
> > > > > > > 
> > > > > > > Keith.
> > > > > > > 
> > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Regards,
> > > > > > > > Adel
> > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Subject: Re: [Java Broker- 6.0.1] AMQP random
errors
> > > > > > > > > when
> > > > > > > > > sending messages using Proton-c 0.12.2
> > > > > > > > > From: aconway@redhat.com
> > > > > > > > > To: users@qpid.apache.org
> > > > > > > > > Date: Mon, 8 Aug 2016 17:57:01 +0100
> > > > > > > > > 
> > > > > > > > > On Fri, 2016-08-05 at 19:16 +0200, Adel Boutros
wrote:
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Hello,
> > > > > > > > > > 
> > > > > > > > > > I have a Proton-c C++ sender which sends
messages
> > > > > > > > > > using
> > > > > > > > > > the
> > > > > > > > > > container
> > > > > > > > > > API. For each message to send, I open a
new container
> > > > > > > > > > which
> > > > > > > > > > means
> > > > > > > > > > that for each message, I open and close
a new
> > > > > > > > > > connection.
> > > > > > > > > > I
> > > > > > > > > > am
> > > > > > > > > > getting 2 random exceptions on Linux in
the
> > > > > > > > > > "proton::handler::on_transport_error" method:
> > > > > > > > > > 
> > > > > > > > > > amqp:connection:framing-error: AMQP header
mismatch:
> > > > > > > > > > Insufficient
> > > > > > > > > > data to determine protocol [''] (connection
aborted).
> > > > > > > > > > and
> > > > > > > > > > proton:io: connect: Cannot assign requested
address.
> > > > > > > > > > 
> > > > > > > > > > I have used PN_TRACE_FRM=1 on the sender
and you can
> > > > > > > > > > find
> > > > > > > > > > below the
> > > > > > > > > > output.
> > > > > > > > > > 
> > > > > > > > > > What can the issue be in your opinion?
> > > > > > > > > 
> > > > > > > > > Not a very well considered opinion but: you will
see
> > > > > > > > > this
> > > > > > > > > kind
> > > > > > > > > of trace
> > > > > > > > > if something opens a TCP connection to your AMQP
> > > > > > > > > listener
> > > > > > > > > and
> > > > > > > > > closes it
> > > > > > > > > again without sending anything. Wireshark might
help
> > > > > > > > > you
> > > > > > > > > see if
> > > > > > > > > this is
> > > > > > > > > the problem or if there actually is data arriving
and
> > > > > > > > > proton is
> > > > > > > > > not
> > > > > > > > > understanding it.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > AMQP header mismatch
> > > > > > > > > > 
> > > > > > > > > > [0xe18730]:  -> AMQP
> > > > > > > > > > [0xe18730]:0 -> @open(16) [container-id="a9335275-
> > > > > > > > > > bb53-
> > > > > > > > > > 4565-
> > > > > > > > > > 8fdc-
> > > > > > > > > > c802702b4933", hostname="machine_name:10455",
> > > > > > > > > > channel-
> > > > > > > > > > max=32767]
> > > > > > > > > > [0xe18730]:0 -> @begin(17) [next-outgoing-id=0,
> > > > > > > > > > incoming-
> > > > > > > > > > window=2147483647, outgoing-window=2147483647]
> > > > > > > > > > [0xe18730]:0 -> @attach(18) [name="1/1",
handle=0,
> > > > > > > > > > role=false, snd-
> > > > > > > > > > settle-mode=2, rcv-settle-mode=0, source=@source(40)
> > > > > > > > > > [durable=0,
> > > > > > > > > > timeout=0, dynamic=false], target=@target(41)
> > > > > > > > > > [address="perf.topic",
> > > > > > > > > > durable=0, timeout=0, dynamic=false], initial-
> > > > > > > > > > delivery-
> > > > > > > > > > count=0]
> > > > > > > > > > [0xe18730]:0 -> @close(24) [error=@error(29)
> > > > > > > > > > [condition=:"amqp:connection:framing-error",
> > > > > > > > > > description="AMQP header
> > > > > > > > > > mismatch: Insufficient data to determine
protocol
> > > > > > > > > > ['']
> > > > > > > > > > (connection
> > > > > > > > > > aborted)"]]
> > > > > > > > > > [0xe18730]:  <- EOS
> > > > > > > > > >  amqp:connection:framing-error: AMQP header
mismatch:
> > > > > > > > > > Insufficient
> > > > > > > > > > data to determine protocol [''] (connection
aborted)
> > > > > > > > > > 
> > > > > > > > > > Cannot assign requested address
> > > > > > > > > > 
> > > > > > > > > > [0x160e730]:  -> AMQP
> > > > > > > > > > [0x160e730]:0 -> @open(16) [container-id=""]
> > > > > > > > > > [0x160e730]:0 -> @close(24) [error=@error(29)
> > > > > > > > > > [condition=:"proton:io", description="connect:
Cannot
> > > > > > > > > > assign
> > > > > > > > > > requested address"]]
> > > > > > > > > > proton:io: connect: Cannot assign requested
address
> > > > > > > > > > 
> > > > > > > > > > Successful message sent
> > > > > > > > > > 
> > > > > > > > > > [0xe18730]:  -> AMQP
> > > > > > > > > > [0xe18730]:0 -> @open(16) [container-id="95a81f15-
> > > > > > > > > > 6bb1-
> > > > > > > > > > 475f-
> > > > > > > > > > b7a3-
> > > > > > > > > > 1d7af71f6911", hostname="machine_name:10455",
> > > > > > > > > > channel-
> > > > > > > > > > max=32767]
> > > > > > > > > > [0xe18730]:0 -> @begin(17) [next-outgoing-id=0,
> > > > > > > > > > incoming-
> > > > > > > > > > window=2147483647, outgoing-window=2147483647]
> > > > > > > > > > [0xe18730]:0 -> @attach(18) [name="1/1",
handle=0,
> > > > > > > > > > role=false, snd-
> > > > > > > > > > settle-mode=2, rcv-settle-mode=0, source=@source(40)
> > > > > > > > > > [durable=0,
> > > > > > > > > > timeout=0, dynamic=false], target=@target(41)
> > > > > > > > > > [address="perf.topic",
> > > > > > > > > > durable=0, timeout=0, dynamic=false], initial-
> > > > > > > > > > delivery-
> > > > > > > > > > count=0]
> > > > > > > > > > [0xe18730]:  <- AMQP
> > > > > > > > > > [0xe18730]:0 <- @open(16) [container-id="5c480e45-
> > > > > > > > > > a289-
> > > > > > > > > > 4c16-
> > > > > > > > > > 947b-
> > > > > > > > > > f352419370af", max-frame-size=32768, channel-max=255,
> > > > > > > > > > idle-
> > > > > > > > > > time-
> > > > > > > > > > out=0, properties={:product="qpid", :version="6.0.1",
> > > > > > > > > > :"qpid.build"="1731621",
> > > > > > > > > > :"qpid.instance_name"="Broker"}]
> > > > > > > > > > [0xe18730]:0 <- @begin(17) [remote-channel=0,
next-
> > > > > > > > > > outgoing-
> > > > > > > > > > id=0,
> > > > > > > > > > incoming-window=2048, outgoing-window=2048]
> > > > > > > > > > [0xe18730]:0 <- @attach(18) [name="1/1",
handle=0,
> > > > > > > > > > role=true,
> > > > > > > > > > snd-
> > > > > > > > > > settle-mode=2, rcv-settle-mode=0, source=@source(40)
> > > > > > > > > > [durable=0,
> > > > > > > > > > timeout=0, dynamic=false], target=@target(41)
> > > > > > > > > > [address="perf.topic",
> > > > > > > > > > durable=0, timeout=0, dynamic=false]]
> > > > > > > > > > [0xe18730]:0 <- @flow(19) [next-incoming-id=0,
> > > > > > > > > > incoming-
> > > > > > > > > > window=2048,
> > > > > > > > > > next-outgoing-id=0, outgoing-window=2048,
handle=0,
> > > > > > > > > > delivery-
> > > > > > > > > > count=0,
> > > > > > > > > > link-credit=20000, echo=false]
> > > > > > > > > > [0xe18730]:0 -> @transfer(20) [handle=0,
delivery-
> > > > > > > > > > id=0,
> > > > > > > > > > delivery-
> > > > > > > > > > tag=b"\x0b\x13\x00\x00\x00\x00\x00\x00",
message-
> > > > > > > > > > format=0,
> > > > > > > > > > settled=false, more=false] (254)
> > > > > > > > > > "\x00Sp\xd0\x00\x00\x00\x0b\x00\x00\x00\x05BP\x04@BR\
> > > > > > > > > > x00\
> > > > > > > > > > x00S
> > > > > > > > > > s\xd0\x0
> > > > > > > > > > 0\x00\x00-
> > > > > > > > > > \x00\x00\x00\x0d@@@\xa1\x0atest@@@@\x83\x00\x00\x00\x
> > > > > > > > > > 00\x
> > > > > > > > > > 00\x
> > > > > > > > > > 00\x00\x
> > > > > > > > > > 00\x83\x00\x00\x00\x00\x00\x00\x00\x00@R\x00@\x00St\x
> > > > > > > > > > d1\x
> > > > > > > > > > 00\x
> > > > > > > > > > 00\x00E\
> > > > > > > > > > x00\x00\x00\x06\xa1\x0ctest\xa1\x02test\xa1\x09test\x
> > > > > > > > > > 81\x
> > > > > > > > > > 00\x
> > > > > > > > > > 00\x01V[
> > > > > > > > > > \x7f\x914\xa1\x10test\xa1\x07test\x00Sw\xa0d\x00\x00\
> > > > > > > > > > x00\
> > > > > > > > > > x00\
> > > > > > > > > > x00\x00\
> > > > > > > > > > x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x
> > > > > > > > > > 00\x
> > > > > > > > > > 00\x
> > > > > > > > > > 00\x00\x
> > > > > > > > > > 00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0
> > > > > > > > > > 0\x0
> > > > > > > > > > 0\x0
> > > > > > > > > > 0\x00\x0
> > > > > > > > > > 0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
> > > > > > > > > > \x00
> > > > > > > > > > \x00
> > > > > > > > > > \x00\x00
> > > > > > > > > > \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
> > > > > > > > > > x00\
> > > > > > > > > > x00\
> > > > > > > > > > x00\x00\
> > > > > > > > > > x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x
> > > > > > > > > > 00\x
> > > > > > > > > > 00\x
> > > > > > > > > > 00\x00\x
> > > > > > > > > > 00\x00\x00\x00\x00\x00\x00\x00"
> > > > > > > > > > [0xe18730]:0 -> @detach(22) [handle=0,
closed=true]
> > > > > > > > > > [0xe18730]:0 -> @close(24) []
> > > > > > > > > > [0xe18730]:  -> EOS
> > > > > > > > > > [0xe18730]:0 <- @disposition(21) [role=true,
first=0,
> > > > > > > > > > last=0,
> > > > > > > > > > settled=true, state=@accepted(36) []]
> > > > > > > > > > [0xe18730]:0 <- @detach(22) [handle=0,
closed=true]
> > > > > > > > > > [0xe18730]:0 <- @close(24) []
> > > > > > > > > > [0xe18730]:  <- EOS
> > > > > > > > > > 
> > > > > > > > > > Sender code
> > > > > > > > > > void SimpleSenderHandler::on_start(proton::event
&e)
> > > > > > > > > > {
> > > > > > > > > >    proton::duration duration(9999999);
> > > > > > > > > >    proton::connection conn =
> > > > > > > > > > e.container().connect(m_url,
> > > > > > > > > > proton::connection_options().idle_timeout(duration));
> > > > > > > > > >    conn.open_sender(m_destination);
> > > > > > > > > > }
> > > > > > > > > > 
> > > > > > > > > > void SimpleSenderHandler::on_sendable(proton::event
> > > > > > > > > > &e)
> > > > > > > > > > {
> > > > > > > > > >    e.sender().send(m_message);
> > > > > > > > > >    e.sender().close();
> > > > > > > > > >    e.connection().close();
> > > > > > > > > > }
> > > > > > > > > > 
> > > > > > > > > > Regards,
> > > > > > > > > > Adel
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > -----------------------------------------------------
> > > > > > > > > ----
> > > > > > > > > ----
> > > > > > > > > --------
> > > > > > > > > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.o
> > > > > > > > > rg
> > > > > > > > > For additional commands, e-mail: users-help@qpid.apache
> > > > > > > > > .org
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > > ---------------------------------------------------------
> > > > > > > ----
> > > > > > > ----
> > > > > > > ----
> > > > > > > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > > > > > > For additional commands, e-mail: users-help@qpid.apache.org
> > > > > > > 
> > > > > >  		 	   		  
> > > > > 
> > > > > 
> > > > > -------------------------------------------------------------
> > > > > ----
> > > > > ----
> > > > > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > > > > For additional commands, e-mail: users-help@qpid.apache.org
> > > > > 
> > > >  		 	   		  
> > > 
> > > 
> > > -----------------------------------------------------------------
> > > ----
> > > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > > For additional commands, e-mail: users-help@qpid.apache.org
> > > 
> >  		 	   		  
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> For additional commands, e-mail: users-help@qpid.apache.org
> 
 		 	   		  
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message