qpid-users mailing list archives

From Keith W <keith.w...@gmail.com>
Subject Re: Qpid broker 6.0.4 performance issues
Date Fri, 28 Oct 2016 12:37:06 GMT
Hi Ramayan

QPID-7462 is a new (experimental) feature, so we don't consider this
appropriate for inclusion in the 6.0.5 defect release.  We follow a
Semantic Versioning [1] strategy.

The underlying issue your testing has uncovered is poor performance
with large numbers of consumers.  QPID-7462 effectively sidesteps the
problem (by introducing alternative consumer behaviour) but does not
address the root cause. We continue to consider how best to resolve
the problem completely, but don't yet have timelines for this change.
It is something that will be getting attention in what remains of this
year.  We will keep you posted.

In the meantime, I understand this causes you a problem.  If you
cannot adopt 6.1 (there should be another RC out soon), you could
consider applying the patch (attached to the JIRA) to the 6.0.x branch
and building it yourself.

Kind regards, Keith.


[1] http://semver.org


On 27 October 2016 at 23:19, Ramayan Tiwari <ramayan.tiwari@gmail.com> wrote:
> Hi Rob,
>
> I have the trunk code which I am testing with; I haven't finished the test
> runs yet. I was hoping that once I validate the change, I can simply
> release 6.0.5.
>
> Thanks
> Ramayan
>
> On Thu, Oct 27, 2016 at 12:41 PM, Rob Godfrey <rob.j.godfrey@gmail.com>
> wrote:
>
>> Hi Ramayan,
>>
>> did you verify that the change works for you?  You said you were going to
>> test with the trunk code...
>>
>> I'll discuss with the other developers tomorrow about whether we can put
>> this change into 6.0.5.
>>
>> Cheers,
>> Rob
>>
>> On 27 October 2016 at 20:30, Ramayan Tiwari <ramayan.tiwari@gmail.com>
>> wrote:
>>
>> > Hi Rob,
>> >
>> > I looked at the release notes for 6.0.5 and it doesn't include the fix for
>> > the large consumers issue [1]. The fix is marked for 6.1, which will not
>> > have JMX, and for us to use that version would require major changes in our
>> > monitoring framework. Could you please include the fix in the 6.0.5 release?
>> >
>> > Thanks
>> > Ramayan
>> >
>> > [1]. https://issues.apache.org/jira/browse/QPID-7462
>> >
>> > On Wed, Oct 19, 2016 at 4:49 PM, Helen Kwong <helenkwong@gmail.com>
>> > wrote:
>> >
>> > > Hi Rob,
>> > >
>> > > Again, thank you so much for answering our questions and providing a
>> > > patch so quickly :) One more question I have: would it be possible to
>> > > include test cases involving many queues and listeners (in the order of
>> > > thousands of queues) for future Qpid releases, as part of standard perf
>> > > testing of the broker?
>> > >
>> > > Thanks,
>> > > Helen
>> > >
>> > > On Tue, Oct 18, 2016 at 10:40 AM, Ramayan Tiwari <ramayan.tiwari@gmail.com>
>> > > wrote:
>> > >
>> > >> Thanks so much Rob, I will test the patch against trunk and will update
>> > >> you with the outcome.
>> > >>
>> > >> - Ramayan
>> > >>
>> > >> On Tue, Oct 18, 2016 at 2:37 AM, Rob Godfrey <rob.j.godfrey@gmail.com>
>> > >> wrote:
>> > >>
>> > >>> On 17 October 2016 at 21:50, Rob Godfrey <rob.j.godfrey@gmail.com>
>> > >>> wrote:
>> > >>>
>> > >>> >
>> > >>> >
>> > >>> > On 17 October 2016 at 21:24, Ramayan Tiwari <ramayan.tiwari@gmail.com>
>> > >>> > wrote:
>> > >>> >
>> > >>> >> Hi Rob,
>> > >>> >>
>> > >>> >> We are certainly interested in testing the "multi queue consumers"
>> > >>> >> behavior with your patch in the new broker. We would like to know:
>> > >>> >>
>> > >>> >> 1. What will the scope of changes be, client or broker or both? We are
>> > >>> >> currently running the 0.16 client, so we would like to make sure that we
>> > >>> >> will be able to use these changes with the 0.16 client.
>> > >>> >>
>> > >>> >>
>> > >>> > There's no change to the client.  I can't remember what was in the 0.16
>> > >>> > client... the only issue would be if there are any bugs in the parsing of
>> > >>> > address arguments.  I can try to test that out tomorrow.
>> > >>> >
>> > >>>
>> > >>>
>> > >>> OK - with a little bit of care to get round the address parsing issues in
>> > >>> the 0.16 client... I think we can get this to work.  I've created the
>> > >>> following JIRA:
>> > >>>
>> > >>> https://issues.apache.org/jira/browse/QPID-7462
>> > >>>
>> > >>> and attached to it are a patch which applies against trunk, and a separate
>> > >>> patch which applies against the 6.0.x branch
>> > >>> (https://svn.apache.org/repos/asf/qpid/java/branches/6.0.x - this is 6.0.4
>> > >>> plus a few other fixes which we will soon be releasing as 6.0.5).
>> > >>>
>> > >>> To create a consumer which uses this feature (and multi queue consumption)
>> > >>> for the 0.16 client you need to use something like the following as the
>> > >>> address:
>> > >>>
>> > >>> queue_01 ; {node : { type : queue }, link : { x-subscribes : {
>> > >>> arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],
>> > >>> x-pull-only : true }}}}
>> > >>>
>> > >>> Note that the initial queue_01 has to be the name of an actual queue on
>> > >>> the virtual host, but otherwise it is not actually used (if you were
>> > >>> using a 0.32 or later client you could just use '' here).  The actual
>> > >>> queues that are consumed from are in the list value associated with
>> > >>> x-multiqueue.  For my testing I created a list with 3000 queues here
>> > >>> and this worked fine.
>> > >>>
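For illustration only (this sketch is not from the thread): one way such an address might be wired up from JMS code with the legacy 0.16-era Qpid client. The connection URL, class usage and ADDR handling below are assumptions and may need adjusting for your environment, e.g. the string may need an "ADDR:" prefix or the client may need to run with -Dqpid.dest_syntax=ADDR.

import javax.jms.Connection;
import javax.jms.Destination;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageListener;
import javax.jms.Session;

import org.apache.qpid.client.AMQConnectionFactory;

public class MultiQueueConsumerSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection URL; adjust credentials, virtual host and broker address.
        Connection connection = new AMQConnectionFactory(
                "amqp://guest:guest@clientid/default?brokerlist='tcp://localhost:5672'")
                .createConnection();
        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);

        // Address from the mail above: queue_01 must be a real queue, but the queues
        // actually consumed from are the ones listed under x-multiqueue.
        String address = "queue_01 ; {node : { type : queue }, link : { x-subscribes : {"
                + " arguments : { x-multiqueue : [ queue_01, queue_02, queue_03 ],"
                + " x-pull-only : true }}}}";

        Destination destination = session.createQueue(address);
        MessageConsumer consumer = session.createConsumer(destination);
        consumer.setMessageListener(new MessageListener() {
            @Override
            public void onMessage(Message message) {
                // process the message, then commit the transacted session
            }
        });
        connection.start();
    }
}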
>> > >>> Let me know if you have any questions / issues,
>> > >>>
>> > >>> Hope this helps,
>> > >>> Rob
>> > >>>
>> > >>>
>> > >>> >
>> > >>> >
>> > >>> >> 2. My understanding is that the "pull vs push" change is only with
>> > >>> >> respect to the broker and it does not change our architecture, where we
>> > >>> >> use MessageListener to receive messages asynchronously.
>> > >>> >>
>> > >>> >
>> > >>> > Exactly - this is only a change within the internal broker threading
>> > >>> > model.  The external behaviour of the broker remains essentially
>> > >>> > unchanged.
>> > >>> >
>> > >>> >
>> > >>> >>
>> > >>> >> 3. Once the I/O refactoring is complete, we would be able to go back to
>> > >>> >> using the standard JMS consumer (Destination). What is the timeline and
>> > >>> >> broker release version for the completion of this work?
>> > >>> >>
>> > >>> >
>> > >>> > You might wish to continue to use the "multi queue" model, depending on
>> > >>> > your actual use case, but yeah, once the I/O work is complete I would hope
>> > >>> > that you could use the thousands of consumers model should you wish.  We
>> > >>> > don't have a schedule for the next phase of I/O rework right now - about
>> > >>> > all I can say is that it is unlikely to be complete this year.  I'd need to
>> > >>> > talk with Keith (who is currently on vacation) as to when we think we may
>> > >>> > be able to schedule it.
>> > >>> >
>> > >>> >
>> > >>> >>
>> > >>> >> Let me know once you have integrated the patch and I will re-run our
>> > >>> >> performance tests to validate it.
>> > >>> >>
>> > >>> >
>> > >>> > I'll make a patch for 6.0.x presently (I've been working on a change
>> > >>> > against trunk - the patch will probably have to change a bit to apply to
>> > >>> > 6.0.x).
>> > >>> >
>> > >>> > Cheers,
>> > >>> > Rob
>> > >>> >
>> > >>> >> Thanks
>> > >>> >> Ramayan
>> > >>> >>
>> > >>> >> On Sun, Oct 16, 2016 at 3:30 PM, Rob Godfrey <rob.j.godfrey@gmail.com>
>> > >>> >> wrote:
>> > >>> >>
>> > >>> >> > OK - so having pondered / hacked around a bit this weekend, I think to
>> > >>> >> > get decent performance from the IO model in 6.0 for your use case we're
>> > >>> >> > going to have to change things around a bit.
>> > >>> >> >
>> > >>> >> > Basically 6.0 is an intermediate step on our IO / threading model
>> > >>> >> > journey.  In earlier versions we used 2 threads per connection for IO
>> > >>> >> > (one read, one write) and then extra threads from a pool to "push"
>> > >>> >> > messages from queues to connections.
>> > >>> >> >
>> > >>> >> > In 6.0 we moved to using a pool for the IO threads, and also stopped
>> > >>> >> > queues from "pushing" to connections while the IO threads were acting on
>> > >>> >> > the connection.  It's this latter fact which is screwing up performance
>> > >>> >> > for your use case here, because what happens is that on each network
>> > >>> >> > read we tell each consumer to stop accepting pushes from the queue until
>> > >>> >> > the IO interaction has completed.  This is causing lots of loops over
>> > >>> >> > your 3000 consumers on each session, which is eating up a lot of CPU on
>> > >>> >> > every network interaction.
>> > >>> >> >
>> > >>> >> > In the final version of our IO refactoring we want to remove the
>> > >>> >> > "pushing" from the queue, and instead have the consumers "pull" - so that
>> > >>> >> > the only threads that operate on the queues (outside of housekeeping
>> > >>> >> > tasks like expiry) will be the IO threads.
>> > >>> >> >
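Purely as an illustration of the cost being described (this is not Broker-J source, and the class and method names below are invented), a sketch contrasting per-read work that touches every consumer with a pull-style drain of the subscribed queues:

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative only; these are not the broker's real classes.
public class PushVsPullSketch {

    static class Consumer {
        boolean pushSuspended;
    }

    static class SubscribedQueue {
        final Queue<String> messages = new ArrayDeque<>();
    }

    // 6.0-style push model: every network read suspends and later resumes each
    // consumer on the connection, so the cost per read grows with the number of
    // consumers (~3000 per session in the tests discussed above).
    static void onNetworkReadPushModel(List<Consumer> consumersOnConnection) {
        for (Consumer consumer : consumersOnConnection) {
            consumer.pushSuspended = true;
        }
        // ... decode frames, do protocol work ...
        for (Consumer consumer : consumersOnConnection) {
            consumer.pushSuspended = false;
        }
    }

    // Pull-style model: the IO thread drains the queues a single multi-queue
    // consumer subscribes to; no per-consumer bookkeeping on every read.
    static void pullFromQueues(List<SubscribedQueue> subscribedQueues) {
        for (SubscribedQueue queue : subscribedQueues) {
            String message = queue.messages.poll();
            if (message != null) {
                // deliver the message to the (single) consumer
            }
        }
    }

    public static void main(String[] args) {
        List<Consumer> consumers = new ArrayList<>();
        for (int i = 0; i < 3000; i++) {
            consumers.add(new Consumer());
        }
        onNetworkReadPushModel(consumers); // 2 x 3000 consumer touches per network read
    }
}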
>> > >>> >> > So, what we could do (and I have a patch sitting on my laptop for this)
>> > >>> >> > is to look at using the "multi queue consumers" work I did for you guys
>> > >>> >> > before, but augmenting this so that the consumers work using a "pull"
>> > >>> >> > model rather than the push model.  This will guarantee strict fairness
>> > >>> >> > between the queues associated with the consumer (which was the issue you
>> > >>> >> > had with this functionality before, I believe).  Using this model you'd
>> > >>> >> > only need a small number (one?) of consumers per session.  The patch I
>> > >>> >> > have is to add this "pull" mode for these consumers (essentially this is
>> > >>> >> > a preview of how all consumers will work in the future).
>> > >>> >> >
>> > >>> >> > Does this seem like something you would be interested in pursuing?
>> > >>> >> >
>> > >>> >> > Cheers,
>> > >>> >> > Rob
>> > >>> >> >
>> > >>> >> > On 15 October 2016 at 17:30, Ramayan Tiwari <ramayan.tiwari@gmail.com>
>> > >>> >> > wrote:
>> > >>> >> >
>> > >>> >> > > Thanks Rob. Apologies for sending this over the weekend :(
>> > >>> >> > >
>> > >>> >> > > Are there any docs on the new threading model? I found this on
>> > >>> >> > > Confluence:
>> > >>> >> > >
>> > >>> >> > > https://cwiki.apache.org/confluence/display/qpid/IO+Transport+Refactoring
>> > >>> >> > >
>> > >>> >> > > We are also interested in understanding the threading model a little
>> > >>> >> > > better, to help us figure out its impact for our usage patterns. It would
>> > >>> >> > > be very helpful if there are more docs/JIRAs/email threads with some
>> > >>> >> > > details.
>> > >>> >> > > Thanks
>> > >>> >> > >
>> > >>> >> > > On Sat, Oct 15, 2016 at 9:21 AM, Rob Godfrey <rob.j.godfrey@gmail.com>
>> > >>> >> > > wrote:
>> > >>> >> > >
>> > >>> >> > > > So I *think* this is an issue because of the extremely large number of
>> > >>> >> > > > consumers.  The threading model in v6 means that whenever a network read
>> > >>> >> > > > occurs for a connection, it iterates over the consumers on that
>> > >>> >> > > > connection - obviously where there are a large number of consumers this
>> > >>> >> > > > is burdensome.  I fear addressing this may not be a trivial change... I
>> > >>> >> > > > shall spend the rest of my afternoon pondering this...
>> > >>> >> > > >
>> > >>> >> > > > - Rob
>> > >>> >> > > >
>> > >>> >> > > > On 15 October 2016 at 17:14, Ramayan Tiwari <ramayan.tiwari@gmail.com>
>> > >>> >> > > > wrote:
>> > >>> >> > > >
>> > >>> >> > > > > Hi Rob,
>> > >>> >> > > > >
>> > >>> >> > > > > Thanks so much for your response.  We use transacted sessions with
>> > >>> >> > > > > non-persistent delivery. Prefetch size is 1 and every message is the
>> > >>> >> > > > > same size (200 bytes).
>> > >>> >> > > > >
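As a side note, a rough sketch (assumptions, not taken from the thread) of how this configuration is commonly expressed with the legacy client: a prefetch of 1 via the maxprefetch connection-URL option, a transacted session, and non-persistent ~200-byte messages; the URL and queue name are illustrative.

import javax.jms.Connection;
import javax.jms.DeliveryMode;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;

import org.apache.qpid.client.AMQConnectionFactory;

public class EnqueueSketch {
    public static void main(String[] args) throws Exception {
        // maxprefetch='1' limits the client to one unacknowledged message per consumer.
        Connection connection = new AMQConnectionFactory(
                "amqp://guest:guest@clientid/default?maxprefetch='1'"
                + "&brokerlist='tcp://localhost:5672'").createConnection();
        connection.start();

        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        MessageProducer producer = session.createProducer(session.createQueue("queue_01"));
        producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);

        // Roughly 200-byte payload, matching the test description above.
        TextMessage message = session.createTextMessage(
                new String(new char[200]).replace('\0', 'x'));
        producer.send(message);
        session.commit();
        connection.close();
    }
}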
>> > >>> >> > > > > Thanks
>> > >>> >> > > > > Ramayan
>> > >>> >> > > > >
>> > >>> >> > > > > On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey <rob.j.godfrey@gmail.com>
>> > >>> >> > > > > wrote:
>> > >>> >> > > > >
>> > >>> >> > > > > > Hi Ramayan,
>> > >>> >> > > > > >
>> > >>> >> > > > > > this is interesting... in our testing (which admittedly didn't cover
>> > >>> >> > > > > > the case of this many queues / listeners) we saw the 6.0.x broker
>> > >>> >> > > > > > using less CPU on average than the 0.32 broker.  I'll have a look this
>> > >>> >> > > > > > weekend as to why creating the listeners is slower.  On the dequeuing,
>> > >>> >> > > > > > can you give a little more information on the usage pattern - are you
>> > >>> >> > > > > > using transactions, auto-ack or client ack?  What prefetch size are you
>> > >>> >> > > > > > using?  How large are your messages?
>> > >>> >> > > > > >
>> > >>> >> > > > > > Thanks,
>> > >>> >> > > > > > Rob
>> > >>> >> > > > > >
>> > >>> >> > > > > > On 14 October 2016 at 23:46, Ramayan Tiwari <ramayan.tiwari@gmail.com>
>> > >>> >> > > > > > wrote:
>> > >>> >> > > > > >
>> > >>> >> > > > > > > Hi All,
>> > >>> >> > > > > > >
>> > >>> >> > > > > > > We have been validating the new Qpid broker (version 6.0.4) and have
>> > >>> >> > > > > > > compared it against broker version 0.32, and we are seeing major
>> > >>> >> > > > > > > regressions. Following is a summary of our test setup and results:
>> > >>> >> > > > > > >
>> > >>> >> > > > > > > *1. Test Setup *
>> > >>> >> > > > > > >   *a).* The Qpid broker runs on a dedicated host (12 cores, 32 GB RAM).
>> > >>> >> > > > > > >   *b).* For 0.32, we allocated a 16 GB heap. For the 6.0.4 broker, we
>> > >>> >> > > > > > > use an 8 GB heap and 8 GB direct memory.
>> > >>> >> > > > > > >   *c).* For 6.0.4, flow to disk has been configured at 60%.
>> > >>> >> > > > > > >   *d).* Both brokers use the BDB host type.
>> > >>> >> > > > > > >   *e).* The brokers have around 6000 queues and we create 16 listener
>> > >>> >> > > > > > > sessions/threads spread over 3 connections, where each session is
>> > >>> >> > > > > > > listening to 3000 queues. However, messages are only enqueued and
>> > >>> >> > > > > > > processed from 10 queues.
>> > >>> >> > > > > > >   *f).* We enqueue 1 million messages across 10 different queues (evenly
>> > >>> >> > > > > > > divided) at the start of the test. Dequeue only starts once all the
>> > >>> >> > > > > > > messages have been enqueued. We run the test for 2 hours and process as
>> > >>> >> > > > > > > many messages as we can. Each message runs for around 200 milliseconds.
>> > >>> >> > > > > > >   *g).* We have used both the 0.16 and 6.0.4 clients for these tests
>> > >>> >> > > > > > > (the 6.0.4 client only with the 6.0.4 broker).
>> > >>> >> > > > > > >
>> > >>> >> > > > > > > *2. Test Results *
>> > >>> >> > > > > > >   *a).* System Load Average (read notes below on how we compute it) for
>> > >>> >> > > > > > > the 6.0.4 broker is 5x compared to the 0.32 broker. During the start of
>> > >>> >> > > > > > > the test (when we are not doing any dequeue), load average is normal
>> > >>> >> > > > > > > (0.05 for the 0.32 broker and 0.1 for the new broker); however, while
>> > >>> >> > > > > > > we are dequeuing messages, the load average is very high (around 0.5
>> > >>> >> > > > > > > consistently).
>> > >>> >> > > > > > >
>> > >>> >> > > > > > >   *b).* Time to create listeners in the new broker has gone up by 220%
>> > >>> >> > > > > > > compared to the 0.32 broker (when using the 0.16 client). For the old
>> > >>> >> > > > > > > broker, creating 16 sessions each listening to 3000 queues takes 142
>> > >>> >> > > > > > > seconds, and in the new broker it took 456 seconds. If we use the 6.0.4
>> > >>> >> > > > > > > client, it takes even longer, a 524% increase (887 seconds).
>> > >>> >> > > > > > >      *I).* The time to create consumers increases as we create more
>> > >>> >> > > > > > > listeners on the same connections. We have 20 sessions (but end up
>> > >>> >> > > > > > > using around 5 of them) on each connection, and we create about 3000
>> > >>> >> > > > > > > consumers and attach a MessageListener to each. Each successive session
>> > >>> >> > > > > > > takes longer (approximately linear increase) to set up the same number
>> > >>> >> > > > > > > of consumers and listeners.
>> > >>> >> > > > > > >
>> > >>> >> > > > > > > *3). How we compute System Load Average *
>> > >>> >> > > > > > > We query the MBean attribute SystemLoadAverage and divide it by the
>> > >>> >> > > > > > > value of the attribute AvailableProcessors. Both of these attributes
>> > >>> >> > > > > > > are available on the java.lang:type=OperatingSystem MBean.
>> > >>> >> > > > > > >
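For concreteness, a small sketch of one way to read those two values remotely over JMX and normalise them as described; the JMX service URL and port are assumptions and must match how the broker JVM exposes remote JMX in your setup.

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class LoadAverageProbe {
    public static void main(String[] args) throws Exception {
        // Hypothetical JMX URL for the broker host; adjust host and port.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            ObjectName os = new ObjectName("java.lang:type=OperatingSystem");
            double loadAverage = (Double) mbsc.getAttribute(os, "SystemLoadAverage");
            int processors = (Integer) mbsc.getAttribute(os, "AvailableProcessors");
            // Normalised load as used in the test results above.
            System.out.printf("Normalised load: %.2f%n", loadAverage / processors);
        }
    }
}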
>> > >>> >> > > > > > > I am not sure what is causing these regressions and would like your
>> > >>> >> > > > > > > help in understanding them. We are aware of the changes with respect to
>> > >>> >> > > > > > > the threading model in the new broker; are there any design docs that
>> > >>> >> > > > > > > we can refer to in order to understand these changes at a high level?
>> > >>> >> > > > > > > Can we tune some parameters to address these issues?
>> > >>> >> > > > > > >
>> > >>> >> > > > > > > Thanks
>> > >>> >> > > > > > > Ramayan
>> > >>> >> > > > > > >
>> > >>> >> > > > > >
>> > >>> >> > > > >
>> > >>> >> > > >
>> > >>> >> > >
>> > >>> >> >
>> > >>> >>
>> > >>> >
>> > >>> >
>> > >>>
>> > >>
>> > >>
>> > >
>> >
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org

