qpid-users mailing list archives

From Praveen M <lefthandma...@gmail.com>
Subject Re: Batch/Bulk receive messages using java client?
Date Tue, 17 Jul 2012 14:05:00 GMT
Hi Robbie,

Thanks for writing back so soon. Please see inline.

On Mon, Jul 16, 2012 at 3:32 PM, Robbie Gemmell <robbie.gemmell@gmail.com> wrote:

> Ok, so to check I understand correctly, and seek clarification on some
> points...
>
> You have potentially 30 application instances that have 5 connections, 20
> sessions per connection, and are each creating 2 consumers on all 6000
> priority queues (using 600 consumers per session), thus giving up to 150
> (30x5) connections, 3000 (30x5x20) sessions, and 360000 (30x2x6000)
> consumers?
>
Yes, that is correct.


> The consumers would only require 600 (360000/600) sessions, so can I assume
> the other 2400 sessions would be used for publishers, or have I
> misinterpreted something? (I am unclear on the '20-30' vs '15')
>
Yes, you are correct again. However, I forgot to mention that we have
dedicated connections for consumers (2 connections) vs. publishers (5
connections). Thus it'd be 600 sessions for consumers and 3000 sessions for
publishers.


> How are the sessions for the consumers spread across the connections: all
> on 1 connection, 4 on each of the 5 connections, something else?
>

I have 2 connections dedicated to consumers (publishers won't use these
connections; I try to isolate publisher connections from consumer
connections). The 5 connections I mentioned above are used only by
publishers (sorry for not being clearer earlier).

Since we have 2 connections for consumers, that works out to 10 consumer
sessions per connection per server.
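As a quick cross-check of the fan-out arithmetic discussed in this thread (the figures are the ones quoted above; the class and method names are purely illustrative, and nothing here talks to a broker):

```java
// Sanity check of the consumer fan-out described in this thread.
public class FanOut {
    static final int CLIENT_SERVERS = 30;          // max client servers
    static final int CONSUMER_CONNECTIONS = 2;     // per client server
    static final int SESSIONS_PER_CONNECTION = 10; // consumer sessions
    static final int QUEUES = 6000;
    static final int BUCKETS = 10;                 // 600 queues per bucket
    static final int CONSUMERS_PER_QUEUE = 2;      // per client server

    // Each session holds one consumer per queue in its bucket.
    static int consumersPerSession() {
        return QUEUES / BUCKETS;                   // 600
    }

    static int consumerSessionsPerServer() {
        return CONSUMER_CONNECTIONS * SESSIONS_PER_CONNECTION; // 20
    }

    static long totalConsumers() {
        // Robbie's 30 x 2 x 6000 figure.
        return (long) CLIENT_SERVERS * CONSUMERS_PER_QUEUE * QUEUES; // 360000
    }

    public static void main(String[] args) {
        System.out.println(consumersPerSession() + " consumers per session");
        System.out.println(consumerSessionsPerServer() + " consumer sessions per server");
        System.out.println(totalConsumers() + " consumers in total");
    }
}
```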


> Although you are ultimately looking to increase performance by batching, it
> is actually more the application processing steps you are looking to speed
> up by supplying more data at once, rather than explicitly decreasing the
> actual messaging overhead (which, if it bounds performance due to round
> trips to the broker, can mean larger batches increase message throughput).
>
Yes, that is correct.


> Although you would like processing across the queues to be fair, you don't
> actually have any explicit ordering requirements such as 'after processing
> messages from Queue X we must process Queue Foo'.
>
Yes, there are no such ordering requirements.


> If each queue currently has up to 60 (30x2) consumers competing for the
> messages, does this mean you have no real ordering requirements
> (discounting priorities) when processing the messages on each queue, i.e it
> doesn't matter which application instances get a particular message, and
> say particular consumers could get and process the first and third messages
> whilst a slower consumer actually got and then later finished processing
> the second message? I ask because if you try to batch the messages on
> queues with multiple consumers and no prefetch (or even with prefetch) it
> isn't likely you would find consumers getting a sequential batch-sized
> group of messages (without introducing message grouping to the mix, that
> is) but rather instead get a message followed by other messages with one or
> more intermediate 'gaps' where competing consumers received those messages.
> Is that acceptable to whatever batched processing it is you are likely to
> be doing?
>
Yes, we do not have any ordering requirements, and we're OK with exactly
what you describe. Each message is independent of the others, and we do not
process messages in a workflow order anyway. We do not use any message
grouping (and do not plan to), and gaps are OK.


> You mentioned possibly only 100 queues servicing batch messages. Did you
> mean that you could know/decide in advance which those queues are, i.e they
> are readily identifiable in advance, or could it just be any 100 queues
> based on some condition at a given point in time?
>
Yes, we could decide in advance and identify the batch queues if required.
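For what it's worth, the core of the pseudo-batch idea from my original mail can be sketched broker-free like this. The Supplier stands in for a synchronous pull, e.g. MessageConsumer.receiveNoWait() on a second session's consumer (mixing a MessageListener and synchronous receives on one session isn't allowed in JMS), which returns null when nothing more is immediately available. All names here are illustrative, not actual Qpid API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Rough sketch of the "pseudo batch": onMessage() delivers the first
// message, then we drain up to (maxBatch - 1) more from the same queue
// synchronously and hand the whole chunk to the application in one go.
public class BatchDrain {

    static <T> List<T> drainBatch(T first, Supplier<T> receiveNoWait, int maxBatch) {
        List<T> batch = new ArrayList<>();
        batch.add(first);
        while (batch.size() < maxBatch) {
            T next = receiveNoWait.get();   // null => queue currently empty
            if (next == null) {
                break;                      // take what we have; gaps are fine
            }
            batch.add(next);
        }
        // In the real consumer: process the whole batch (e.g. one bulk DB
        // query), then commit/acknowledge once per batch on the session.
        return batch;
    }
}
```

Note that with prefetch 1 a receiveNoWait() would see at most the one prefetched message, so a real implementation would likely need receive(smallTimeout) or a larger capacity on the draining consumer; that trade-off is exactly the sort of subtlety being discussed here.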

Thanks Robbie.


> Robbie
>
> On 16 July 2012 16:54, Praveen M <lefthandmagic@gmail.com> wrote:
>
> > Hi Robbie. Thank you for writing back. Please see inline for answers to
> > some of the questions you had.
> >
> > On Mon, Jul 16, 2012 at 4:40 AM, Robbie Gemmell <robbie.gemmell@gmail.com> wrote:
> >
> > > Hi Praveen,
> > >
> > > I have talked this over with some of the others here, and tend to agree
> > > with Gordon and Rajith that mixing asynchronous and synchronous
> consumers
> > > in that fashion isn't a route I would really suggest; using two
> sessions
> > > makes for complication around transactionality and ordering, and I don't
> > > think it will work on a single session.
> > >
> > > We do have some ideas you could potentially use to implement batching
> in
> > > the application to improve performance, but there are various
> subtleties
> > to
> > > consider that might heavily influence our suggestions. As such we
> really
> > > need a good bit more detail around the use case to actually give a
> > reasoned
> > > answer. For example:
> > >
> > > - How many connections/sessions/consumers/queues are actually in use?
> > >
> >
> > In our current system, we have 20-30 client servers talking to our Qpid
> > messaging server.
> > We have 5 connections, 20 sessions/connection, 2 consumers/queue from a
> > single client server's standpoint (so all the numbers should be multiplied
> > by a max factor of 30, since we could have up to 30 client servers).
> > We create overall 6000 queues in our Qpid messaging server.
> >
> >
> > > - Are there multiple consumers on each/any of the queues at the same
> > time?
> > >
> > Yes. To explain this a little bit,
> >
> > We have about 15 client servers consuming messages.
> > We have 20 sessions (threads) consuming messages per client server. We
> > have broken the 6000 queues into 10 buckets, and have 2 sessions (threads)
> > listening/consuming on every 600 queues. Hence, an individual session
> > might try to listen and consume from up to 600 queues on the same thread.
> >
> >
> > - What if any ordering requirements are there on the message processing
> > > (either within each queue or across all the queues)?
> > >
> > Across all queues, we'd like to process in a round-robin fashion to
> > ensure fairness across the queues. We achieve this now by turning off
> > prefetching (we're using prefetch 1, which works well).
> > Within the queue, all our queues are priority queues, so we process based
> > upon priority order.
> >
> >
> > > - What is the typical variation of message volumes across the queues
> that
> > > you are looking to balance?
> >
> > Volumes vary quite a bit between queues (based upon the service the queue
> > is tied to). Some queues have relatively low traffic, some have bursty
> > traffic, some have consistently high traffic, and some have slow
> > consumers.
> > Our numbers peak at around a million messages per day for a busy queue.
> >
> >
> > > - What are the typical message sizes?
> > >
> > Message sizes are typically around 1KB-2KB.
> >
> >
> > > - How many messages might you potentially be looking to batch?
> > >
> > The batch sizes are typically provided by our client applications, and
> > are typically on the order of 10-50 messages.
> >
> >
> > > - What is the typical processing time in onMessage() now? Would this
> > > vary as a direct multiple of the number of messages batched, or by some
> > > other scaling?
> >
> >
> > The onMessage() callback invokes an application service, so I can't say
> > exactly... but with batching, the processing time is typically well below
> > the direct multiple of the number of messages batched.
> >
> > The most typical use case where batching helps us is when a database
> > query is invoked with the batched messages, performing a bulk operation.
> > This can be very expensive for us if done one-by-one instead of as a
> > batched database query.
> > Also, batch message traffic is typically bursty, and our processing times
> > are quite high. From our current data, even though we have a multiple
> > consumer setup, batching helps us process efficiently for applications
> > which process messages in bulk.
> >
> > Also, out of all our queues, I would say only about 100 of them would be
> > servicing batch messages.
> >
> > Our current messaging infrastructure supports batch messages, and hence
> > we have a lot of dependent code written which expects batching. Getting
> > away from it now would be quite tough, hence I'd like to implement a
> > pseudo batch on top of Qpid. My original thought was around using 2
> > sessions, onMessage() and a synchronous consumer. I don't think we have
> > much concern with transactionality, as we keep our own reference to each
> > message in our database to guarantee it.
> >
> > Do let me know what you think, and I'd love to hear if you can think of
> > alternate approaches to this problem.
> >
> > Hope to hear from you soon.
> >
> > Thanks,
> > Praveen
> >
> > Regards,
> > > Robbie
> > >
> > > On 12 July 2012 17:53, Praveen M <lefthandmagic@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm trying to explore if there are ways to batch message processing.
> > > > Batching message processing would help us improve performance for
> some
> > of
> > > > our use cases,
> > > > where we could chunk messages and process them in a single callback.
> > > >
> > > > Has anyone here explored building a layer to batch messages?
> > > >
> > > > I am using the Java Broker and the Java client.
> > > >
> > > > I would like to stick to the JMS api as much as possible.
> > > >
> > > > This is what I currently have, still wondering if it'd work.
> > > >
> > > > 1) When the onMessage() callback is triggered, create a consumer and
> > > > pull more messages to process from the queue the message was
> > > > delivered from.
> > > > 2) Pull messages up to my max chunk size, or up to the number of
> > > > messages available in the queue.
> > > > 3) Process all the messages together and commit on the session.
> > > >
> > > > I'd like to hear ideas on how to go about this.
> > > >
> > > > Thanks,
> > > > --
> > > > -Praveen
> > > >
> > >
> >
> >
> >
> > --
> > -Praveen
> >
>



-- 
-Praveen
