activemq-dev mailing list archives

From Donnchadh Ó Donnabháin (JIRA) <j...@apache.org>
Subject [jira] [Commented] (AMQ-873) Add dynamic 'prefetch window' management ...
Date Tue, 22 May 2012 15:44:42 GMT

    [ https://issues.apache.org/jira/browse/AMQ-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281046#comment-13281046 ]

Donnchadh Ó Donnabháin commented on AMQ-873:
--------------------------------------------

See https://queue.acm.org/detail.cfm?id=2209336 and http://www.rabbitmq.com/blog/2012/05/11/some-queuing-theory-throughput-latency-and-bandwidth/.
                
> Add  dynamic 'prefetch window' management ...
> ---------------------------------------------
>
>                 Key: AMQ-873
>                 URL: https://issues.apache.org/jira/browse/AMQ-873
>             Project: ActiveMQ
>          Issue Type: Improvement
>            Reporter: Sridhar Komandur
>
> Creating this item to track as a separate item, per discussions with James. I am capturing
> the latest discussion related to this (on AMQ-850) below:
> ------
> James, Thanks for taking time to discuss this issue. Please see below:
> On 8/10/06, James Strachan <james.strachan@...> wrote:
> >
> > On 8/10/06, Komandur <sridharkomandur@...> wrote:
> > >
> > > >> 1. can we use an 'elastic prefetch' buffer based on a sliding window
> > > >> (like in TCP) - this reacts to client (mis)behavior
> > >
> > > >We could start with a prefetch of 1 and increase it over time for well
> > > >behaving clients. However it doesn't fix the problem as a mis-behaving
> > > >consumer could still hog at least one message - though this would
> > > >reduce the impact from 1000 or so to 1.
> > >
> > > Note that the prefetch window needs to follow the standard TCP approach
> > > of multiplicative decrease during a problem period and additive increase
> > > upon positive ack (IMHO, there isn't much to be gained in reinventing the
> > > TCP flow control wheel, which has been honed for over a decade).
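
The additive-increase / multiplicative-decrease rule described above can be sketched as follows. This is a minimal illustration, not ActiveMQ code: the class name, method names, and default parameters are all assumptions (the cap of 1000 only echoes the "1000 or so" figure mentioned earlier in the thread).

```python
class AimdPrefetchWindow:
    """Hypothetical AIMD controller for one consumer's prefetch window."""

    def __init__(self, initial=1, minimum=1, maximum=1000,
                 increase=1, decrease_factor=0.5):
        # All defaults are illustrative assumptions, not ActiveMQ settings.
        self.minimum = minimum
        self.maximum = maximum
        self.increase = increase
        self.decrease_factor = decrease_factor
        self.size = max(minimum, min(initial, maximum))

    def on_ack(self):
        # Additive increase: grow by a fixed step per positive ack.
        self.size = min(self.maximum, self.size + self.increase)
        return self.size

    def on_problem(self):
        # Multiplicative decrease: cut the window by a constant factor,
        # never going below the configured minimum.
        self.size = max(self.minimum, int(self.size * self.decrease_factor))
        return self.size
```

What counts as a "problem period" (a late ack, a timeout, something else) is left open here, as it is in the discussion.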
> >
> > The problem is - once a message has been sent to a consumer it's too
> > late - the consumer is now hogging it. This differs considerably with
> > TCP - in TCP it doesn't affect other connections if you send a little
> > too much data to a socket.
> TCP takes an end-to-end perspective - in a way we can think of it as a
> messaging layer spanning both the sender and the receiver.
> We can take a similar approach: the broker and the client-side ActiveMQ
> subsystem can work together to achieve our flow control goals. As long as
> a prefetched message has not actually been delivered to the consumer, the
> ActiveMQ subsystem on the consumer side can always reclaim it from the
> prefetch buffer (when the window is shrunk). In effect, we have a 'proxy'
> flow control system on the consumer side which is in tune with the
> broker side.
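
A rough sketch of the consumer-side 'proxy' idea: messages that sit in the prefetch buffer but have not yet been handed to the application can be given back when the window shrinks. The class and method names are hypothetical, and the sketch assumes the window only counts buffered, undelivered messages.

```python
from collections import deque


class PrefetchBuffer:
    """Hypothetical client-side prefetch buffer with reclaim-on-shrink."""

    def __init__(self, window):
        self.window = window
        self._pending = deque()  # prefetched, not yet handed to the application

    def offer(self, message):
        """Accept a message from the broker if the window has room."""
        if len(self._pending) >= self.window:
            return False
        self._pending.append(message)
        return True

    def deliver(self):
        """Hand the oldest message to the application; it is no longer reclaimable."""
        return self._pending.popleft() if self._pending else None

    def shrink(self, new_window):
        """Shrink the window and return the undelivered overflow for redispatch."""
        self.window = new_window
        reclaimed = []
        while len(self._pending) > new_window:
            reclaimed.append(self._pending.pop())  # take the newest first
        reclaimed.reverse()  # restore original arrival order
        return reclaimed
```

Reclaiming from the tail keeps the oldest buffered messages with the consumer, which is one possible choice; the thread does not say which messages should be returned.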
> > This helps in several ways:
> > >
> > > - Messages are dispatched as soon as possible, as a slow consumer will
> > > automatically have a smaller 'prefetch window'. In fact by decaying the
> > > 'prefetch window' (like in the latest implementations of TCP flow
> > > control), a new slow consumer's window automatically shrinks.
> >
> > Growing and shrinking the prefetch windows based on the amount of time
> > it takes to get acknowledgements back is certainly possible - though
> > it's a different discussion and is for different reasons as it purely
> > tunes the prefetch sizes to their optimal level. This also assumes that
> > you can actually grow and shrink them accurately. e.g. the prefetch
> > buffer sizes may need to be large for performance reasons when some
> > messages take a long time to process or when networks are slow. So
> > adding automatically sized prefetch windows could result in windows
> > being too small.
> James, you have a valid concern above with respect to slow response. This
> is another instance where TCP flow control works effectively. It always
> strives to keep 'bandwidth * delay' worth of data outstanding, to keep the
> receiver from starving due to slow response (refer to the IETF RFC on
> long thin networks). Note that consumer-side proxy logic allows us to
> take advantage of asymmetry (the proxy is able to track the consumer
> activity, without the variance introduced by the network) to suit our needs.
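
As a worked illustration of the 'bandwidth * delay' rule applied to prefetch sizing: the number of messages that should be outstanding is roughly the consumer's processing rate multiplied by the broker-to-consumer round-trip time. The function name and parameters below are assumptions made for this sketch.

```python
import math


def prefetch_for(messages_per_sec, round_trip_sec, minimum=1):
    """Hypothetical sizing rule: keep rate * delay messages in flight.

    messages_per_sec: how fast the consumer processes messages.
    round_trip_sec:   time for an ack to reach the broker and the next
                      message to arrive back at the consumer.
    """
    return max(minimum, math.ceil(messages_per_sec * round_trip_sec))
```

For example, a consumer handling 200 messages per second over a 0.25 s round trip would need about 200 * 0.25 = 50 messages outstanding to avoid starving, while a slow consumer on a fast link would settle at the minimum.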
> > However AMQ-850 is about a completely different problem to sizing the
> > prefetch buffer - it's what to do about a badly behaving consumer.
> >
> >
> > > - I am not sure I understand the  'one message hog' case.
> >
> > Start with a prefetch of 1. Give a consumer a message then if the
> > consumer doesn't do anything with it - or locks up while processing
> > it - then that message is now 'hogged' - no other consumer can get the
> > message until the consumer is closed or the client killed.
> >
> >
> > > Most of the consumers are idempotent (there are too many failure cases
> > > to count on 'once and only once' delivery). So there is no harm in
> > > redelivering this one message for which no ack has been received yet.
> >
> > That 1 message will not be delivered to anyone else - which is a real
> > problem. There's the added effect on ordering too.
> >
> >
> > > >> 2. When the broker detects a misbehaving client, reclaim the unAcked
> > > >> messages for other active consumers (and make the window size 0 or 1
> > > >> in step 1 above)
> > >
> > > >If a client/connection misbehaves (e.g. becomes inactive) then the
> > > >connection is closed and all consumers are closed too causing all
> > > >their unacked messages to be redelivered.
> > >
> > > This sounds good. However, please note that misbehavior is not
> > > necessarily a binary state.
> > > Sometimes an ACK could be delayed for many reasons (either transient
> > > consumer (mis)behavior or other network related issues). It is in the
> > > gray areas that the TCP flow control works really well.
> >
> > Agreed - which is why AMQ-850 is introduced to allow people to set an
> > inactivity timer on specific consumers. It could just be 1 thread
> > which is blocked on some lock - while the other threads and the rest
> > of the connection is working fine.
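
A per-consumer inactivity timer of the kind mentioned above might look roughly like this. It is a sketch under assumptions, not the AMQ-850 design: the class, its methods, and the choice to start the clock when a consumer first has outstanding work are all hypothetical, and times are passed in explicitly to keep the example deterministic.

```python
class InactivityTracker:
    """Hypothetical per-consumer inactivity timer with message reclaim."""

    def __init__(self, timeout_sec):
        self.timeout_sec = timeout_sec
        self._unacked = {}    # consumer id -> list of unacked messages
        self._last_seen = {}  # consumer id -> time of last ack (or first dispatch)

    def dispatched(self, consumer, message, now):
        pending = self._unacked.setdefault(consumer, [])
        if not pending:
            # Start the clock when the consumer first has outstanding work.
            self._last_seen[consumer] = now
        pending.append(message)

    def acked(self, consumer, message, now):
        self._unacked[consumer].remove(message)
        self._last_seen[consumer] = now

    def reclaim_expired(self, now):
        """Return unacked messages of consumers idle longer than the timeout."""
        reclaimed = []
        for consumer, pending in self._unacked.items():
            if pending and now - self._last_seen[consumer] > self.timeout_sec:
                reclaimed.extend(pending)
                pending.clear()
        return reclaimed
```

Because the timer is kept per consumer, one blocked thread can have its messages redelivered while other consumers on the same connection carry on.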
> >
> > --
> >
> > James
> > ------- 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
