qpid-users mailing list archives

From Jason Barto <jason.p.ba...@gmail.com>
Subject Re: Performance degradation over time during amqp 1.0 stress test
Date Fri, 05 Apr 2013 23:46:45 GMT
Rob,
I've had a chance to review the latest updates to QPid.  While the
memory leak has been lessened, it has not been removed (to be expected, I
think).  On the downside, however, the performance of the AMQP 1.0 code
seems to have declined significantly.  I'm attaching the results from 2
performance tests.  In each, a single consumer and a single producer
communicate with the QPid-J broker, using messages of roughly 1KB in
size.  (I say roughly because each is a timestamp plus garbage data.)  The 2
brokers compared were Qpid-J v0.20 and the latest pull from the Git
repository.  Both tests were stopped after the host system (in this case a
MacBook Pro, Intel Core i7, 64-bit, 4 cores, 8 GB RAM) reported (via top)
that the JVM's resident private address space had exceeded 900 MB
with an mx (maximum heap) of 1024 MB.

The attachments show the number of messages sent and received per second,
followed by the latency of the messages.  My concern is the
throughput: QPid 0.20 demonstrates an AMQP 1.0 throughput on the order of
12K messages per second but with a large leak into RAM.  Your recent work
on optimizing the 1.0 Java code has significantly reduced this leak, allowing
the tests to run for longer; however, throughput has dropped sharply, to
about 2000 messages per second.
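
For readers without the attachments, here is a minimal sketch of how a
roughly 1KB "timestamp plus garbage data" message could be built and how
latency could be recovered from it on the consumer side.  The class and
method names are hypothetical and are not taken from the actual test
client; the layout (8-byte send time followed by random filler) is an
assumption.

```java
import java.nio.ByteBuffer;
import java.util.Random;

// Hypothetical helper for illustration only; not the real test client code.
final class TimestampedPayload {
    // Assumed payload size, matching the "roughly 1KB" messages in the test.
    static final int PAYLOAD_SIZE = 1024;

    // Fill the buffer with random filler, then overwrite the first 8 bytes
    // with the send time in milliseconds (big-endian, ByteBuffer's default).
    static byte[] build(long sendTimeMillis, Random random) {
        byte[] payload = new byte[PAYLOAD_SIZE];
        random.nextBytes(payload);
        ByteBuffer.wrap(payload).putLong(sendTimeMillis);
        return payload;
    }

    // Read the send time back from the first 8 bytes and subtract it from
    // the receive time to get the one-way latency in milliseconds.
    static long latencyMillis(byte[] payload, long receiveTimeMillis) {
        long sentAt = ByteBuffer.wrap(payload).getLong();
        return receiveTimeMillis - sentAt;
    }

    public static void main(String[] args) {
        byte[] message = build(System.currentTimeMillis(), new Random());
        long latency = latencyMillis(message, System.currentTimeMillis());
        System.out.println("payload bytes: " + message.length
                + ", latency ms: " + latency);
    }
}
```

Latency measured this way is only meaningful when producer and consumer
share a clock, as they do when both run on the same host.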

Profiling the QPid process with VisualVM on OSX (I'm new to VisualVM)
shows that the largest class allocation, in
both cases, is of type 'byte[]'.  Not having dug into the code yet, I'm not
sure what this necessarily means or even if it's of any help.  However, I'm
attaching my test results in addition.  I'll also say that while I've
gotten similar results using both client code bases, the tests attached to
this email were collected using the SwiftMQ Java client library.

Sincerely,
Jason


On Wed, Mar 27, 2013 at 8:45 PM, Rob Godfrey <rob.j.godfrey@gmail.com> wrote:

> Hi Jason,
>
> OK... so that's embarrassing :-) The reason that the broker stopped sending
> out messages in my test case was due to a deadlock in the broker side AMQP
> 1.0 codepath [1].  I've applied a fix to the trunk version of the broker,
> which I'll presently be asking for inclusion into the 0.22 release.
>
> Can you see if this also fixes the issue you have been seeing?
>
> Cheers,
> Rob
>
> [1] http://svn.apache.org/viewvc?view=revision&revision=r1461844
>
> On 27 March 2013 20:41, Rob Godfrey <rob.j.godfrey@gmail.com> wrote:
>
> > Hi Jason,
> >
> > just to let you know I am investigating the issue where the broker
> > suddenly stops receiving messages... I can see this too but I haven't yet
> > got to the bottom of why.  Apologies for any inconvenience.
> >
> > Cheers,
> > Rob
> >
> >
> > On 27 March 2013 17:51, Jason Barto <jason.p.barto@gmail.com> wrote:
> >
> >> Rob,
> >> first let me say thank you so much for your insight and quick responses.
> >> I'm sorry to say that throttling the producers has not completely solved
> >> the problem.  I'm back in the office today and have been doing additional
> >> testing and I think I may have detected a memory leak in the code that
> >> handles AMQP 1.0 communications.
> >>
> >> My experience thus far:
> >> I downloaded and installed the latest QPid revision using the git
> >> directions on the website.  With the Java QPid up and running I ran my
> >> first test client (AMQP 0-9-1).  I sent 60k messages/sec, each with a 1KB
> >> payload, for 180 seconds.  Using top I monitored the JVM's memory usage,
> >> which held steady at 233MB.  The code for this client can be found on
> >> pastebin at http://pastebin.com/fwDmhxG9.  It uses the rabbitmq-client JAR
> >> for AMQP 0-9-1.
> >>
> >> Again using 'top' and the latest Java QPid I ran my second test client at
> >> 8k messages/sec, each with 1 KB payload, for 180 seconds.  The memory
> >> utilization of the JVM reported by 'top' rose steadily until the test
> >> ended, at which time the JVM was using 1.1GB of memory.  The code for this
> >> second test client is at http://pastebin.com/DNTLCj0y and uses v0.23 of the
> >> qpid-amqp-1-0-client JAR.  It may also be worth noting that at about second
> >> 147 the consumer just stopped receiving messages, no exceptions were thrown
> >> on either the client or broker, it simply stopped receiving messages.
> >>
> >> Both tests were throttled to ensure that the consumer would consume in a
> >> timely fashion roughly all messages sent, to avoid any buildup of messages
> >> in the broker's queue.
> >>
> >> I will admit that I have far more experience with the RabbitMQ client
> >> library than the QPid library so my hope is that there is something someone
> >> can spot in my test code that is not being done correctly that will explain
> >> why the JVM is consuming so much memory.  Does acknowledging a message with
> >> Receiver.acknowledge not sufficiently inform the broker that the message
> >> has been received and can be forgotten?
> >>
> >> On longer test runs, 500+ seconds, the broker eventually consumed all of
> >> its allocated ram and as you mentioned the GC began taking over trying to
> >> keep things running.
> >>
> >> Sincerely,
> >> Jason
> >>
> >>
> >> On Tue, Mar 26, 2013 at 12:43 PM, Rob Godfrey <rob.j.godfrey@gmail.com> wrote:
> >>
> >> > Hi Jason,
> >> >
> >> > On 26 March 2013 13:26, Jason Barto <jason.p.barto@gmail.com> wrote:
> >> >
> >> > > Rob,
> >> > > thanks for your quick response - I've consolidated the code into a
> >> > > single java file and would gladly publish it - frankly being new to
> >> > > AMQP 1.0 and its client libraries I'd value the feedback.  I think I
> >> > > may have determined the reason for the slow down - not sure why it
> >> > > didn't occur to me earlier to be honest.
> >> > >
> >> > > I have 2 threads, a Sender and Receiver.  Both are trying to produce /
> >> > > consume as rapidly as possible.  Testing on my laptop the Sender is
> >> > > publishing on average 20k messages / sec while the receiver seems to
> >> > > top out at 15K msg/sec, leading to an inevitable backlog of messages
> >> > > building up on the broker.  When the queue on the broker is holding
> >> > > around 500k messages (after about 90 seconds of testing) the
> >> > > performance of the broker drops dramatically until both producer /
> >> > > consumer can only send / receive about 200 msg/s.
> >> > >
> >> > >
> >> >
> >> > OK - great (sort of) - we can understand the problem - basically the
> >> > broker is not sufficiently pushing back on the producing client when the
> >> > queue becomes over full.
> >> >
> >> > The broker can be used to enforce flow control when using earlier
> >> > versions of the protocol (0-8, 0-9, 0-9-1, 0-10) but it's entirely
> >> > possible that this is not yet implemented in the 1-0 codepath (I shall
> >> > have to go check... if it is not I'll attempt to implement it shortly).
> >> >
> >> > In general the broker performance shouldn't degrade until it starts
> >> > running out of memory, whereupon the garbage collector will start
> >> > dominating the performance.
> >> >
> >> >
> >> > > If I put a limit on the number of messages / sec sent by the producer
> >> > > the problem goes away, ie the receiver can consume the messages at a
> >> > > rate to prevent a backlog of messages.  I'd be happy to create a JIRA
> >> > > request if you feel one is necessary; is there perhaps a datastore I
> >> > > could configure QPid to use that would help the system to better deal
> >> > > with the growing backlog mentioned?  I ask as a mitigation technique.
> >> > > I'll be becoming a sysadmin for an enterprise-level message broker
> >> > > that will be expected to handle 1.75B messages per day (roughly
> >> > > 20k/s).  If for some reason the messages are not being consumed at the
> >> > > same rate they're being produced - what options do I have to ensure
> >> > > that QPid's performance is unaffected?
> >> > >
> >> > >
> >> > I guess the question is how large do you want to let the queue grow
> >> > before you have to enforce some sort of restriction on the sender?
> >> > Adding flow-to-disk style capabilities often just makes the problem
> >> > worse because the consumer starts to become limited by the speed at
> >> > which messages can be read from disk, which may be lower than its
> >> > optimal rate... so you start seeing even more queue growth.  I've seen
> >> > people who want to be able to handle spikes in queue growth of up to
> >> > several million messages... and the answer to their problem was just to
> >> > stuff a lot more RAM into their machine :)
> >> >
> >> > As an aside are you expecting these 20K/s messages to be AMQP 1.0?  As I
> >> > mentioned in my last mail, I've not yet done *any* perf tuning on the
> >> > AMQP1.0 code (JMS client or broker)... Though your question yesterday
> >> > has forced me to dig up my old copy of JProfiler and run it on the
> >> > broker for the first time today :-)
> >> >
> >> > Cheers,
> >> > Rob
> >> >
> >> >
> >> > > Sincerely,
> >> > > Jason
> >> > >
> >> > >
> >> > > On Mon, Mar 25, 2013 at 7:08 PM, Rob Godfrey <rob.j.godfrey@gmail.com> wrote:
> >> > >
> >> > > > Hi Jason,
> >> > > >
> >> > > > the AMQP 1-0 support in the java broker is somewhat experimental at
> >> > > > this stage (and no perf tuning has been carried out), however it
> >> > > > definitely seems like you've found a bug. Can you share some more
> >> > > > details of your test (perhaps raise a JIRA and attach the test
> >> > > > code?) and I'll try to look into it ASAP - hopefully that way we can
> >> > > > get a fix in for 0.22,
> >> > > >
> >> > > > Thanks,
> >> > > > Rob
> >> > > >
> >> > > > On 25 March 2013 19:20, Jason Barto <jason.p.barto@gmail.com> wrote:
> >> > > > > Using amqp 0.9.1 I'm getting an average performance of 70k messages
> >> > > > > per second throughput during an hour long test.
> >> > > > >
> >> > > > > When using amqp 1.0 I receive similar but smaller throughput
> >> > > > > numbers however after about 90 seconds of testing this quickly
> >> > > > > begins to drop until it gets to 10s or 100s of messages per second
> >> > > > > and a significant backlog of messages begins building.
> >> > > > >
> >> > > > > Can anyone explain this drastic difference in performance?  I am
> >> > > > > using v0.20 of qpid-j for the brokering.
> >> > > > >
> >> > > > > Sincerely,
> >> > > > > Jason
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
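
As a rough cross-check of the figures quoted in the thread above (a
producer at about 20k msg/s, a consumer topping out at about 15K msg/s,
and a target of 1.75B messages per day), the arithmetic can be sketched
as follows.  The class and method names are hypothetical, and the model
assumes constant rates, which a real broker under load will not show.

```java
// Hypothetical back-of-the-envelope helper; not part of any Qpid API.
final class BacklogEstimate {
    // Messages left on the queue after the given number of seconds when
    // the producer outpaces the consumer at constant rates.
    static long backlogAfter(long producedPerSec, long consumedPerSec, long seconds) {
        return Math.max(0L, producedPerSec - consumedPerSec) * seconds;
    }

    // Average per-second rate implied by a daily message volume
    // (integer division, 86,400 seconds per day).
    static long perSecond(long messagesPerDay) {
        return messagesPerDay / 86_400L;
    }

    public static void main(String[] args) {
        // 20k in, 15k out: the queue grows by 5,000 messages every second.
        System.out.println(backlogAfter(20_000, 15_000, 90));   // 450000
        System.out.println(backlogAfter(20_000, 15_000, 100));  // 500000
        // 1.75B messages per day averages out to roughly 20k/s.
        System.out.println(perSecond(1_750_000_000L));          // 20254
    }
}
```

A backlog of 450,000 to 500,000 messages after 90 to 100 seconds is in
line with the "around 500k messages (after about 90 seconds of testing)"
reported in the quoted message, and 20,254 msg/s matches the "roughly
20k/s" estimate.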
