activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rob Davies <>
Subject Re: ActiveMQ 6.0 Broker Core Prototype -- Flow Control / Memory Management
Date Thu, 11 Jun 2009 08:11:45 GMT
Hi Colin,

In 5.x flow control behaves as if its binary - off or on. When its off  
- messages can be offlined (for non-persistent messages this means  
being dumped to temporary storage) - but when its on - the producers  
slow and stop.
Also - there can be cases when you get a temporary slow consumer (the  
consuming app may be doing a big gc) - which means with flow control  
off - messages get dumped to disk - and then the producers may never  
slow down enough again for the consumer to catch up. Flow control is  
difficult to implement for all cases - but we should allow for  
configuration of the following:

* maximum overall broker memory
* maximum memory allocation per destination
* maximum storage allocation
* maximum storage allocation per destination
* maximum temporary storage allocation
* maximum temporary storage allocation per destination

when we start to hit a resource limit - we should aggressively gc  
messages that have expired, then either offline (an flow control when  
that limit is hit) or flow control.
It would be great to have a combined policy where we can block a  
producer for a short time (seconds) then offline
For non-persistent messages - we still need a policy where we can  
remove messages based on a selector (which would be in addition to  
expiring messages).



On 10 Jun 2009, at 17:29, Colin MacNaughton wrote:

> Hi Everyone,
> As a follow on to my e-mail last week introducing the core broker
> prototype that Hiram and I have been working on, I wanted to spin up a
> thread on the flow control model that we're using.
> I'd be interested to hear in your thoughts on current shortcomings
> associated with flow control / memory management in 5.3 so we can make
> sure that the use cases are covered. Beyond that any additional  
> input on
> the design or implementation would be great ... are we on the right
> track?
> Cheers,
> Colin
> The text below is taken straight from the webgen in the project, my
> apologies if it's a little verbose!
> As a reminder the bits can be found at:
> The activemq-flow package is meant to be a standalone module that  
> deals
> generically with Resources and Flow's of elements that flow through  
> and
> between them. The current implementation is designed with the  
> following
> goals in mind:
>    * SIMPLE: Want a fairly simple and consistent model for controlling
> flow of messages and other data in the system to control memory and  
> disk
> space. The module must be able to handle fan-in/fan-out as well as
> simpler 1 to 1 cases.
>    * PERFORMANT: The flow control mechanism must be performant and
> should not introduce much overhead in cases where downstream resources
> are able to keep up.
>    * MODULARIZED: The module should be independent generic and
> reusable.
>    * FAIRNESS: We should be able to provide better fairness. If I've
> got several producers putting messages on a queue, the flow controller
> should not prefer one source over the other (unless configured to do  
> so)
>    * VISIBILITY: With a unified model in place we can instrument it to
> provide visibility in the product (e.g. a visual graph of flows in the
> system). When a customer says that they are not using PERSISTENT
> messages yet we see 1000msgs/sec flowing through the recovery log....
>    * ADMINISTRATION: We can explore the possibility of  
> administratively
> limiting message flows. E.g. I've done my production stress testing  
> and
> can successfully handle my anticipated load of 4000 msgs/sec on topic1
> ... I'd prefer to avoid the case where publishers go berserk and
> overload my backend with messages).
>    * POLICIES: We should be able to better instrument general flow
> control policies. E.g. I want to tune for latency or throughput. If a
> subscriber gets behind, I'd like the policy for messages on topic1  
> to be
> that I drop the oldest messages instead of initiating flow control.
> The Basics:
> Each resource creates a FlowController for each of it's Flows which is
> assigned a corresponding FlowLimiter. As elements (e.g. messages) pass
> from one resource to another they are passed through the downstream
> resource's FlowController which updates its Limiter. If propagation of
> an element from one resource to another causes the downstream  
> limiter to
> become throttled the associated FlowController will block the source  
> of
> the element. The flow module is used heavily by the rest of the core  
> for
> memory and disk management.
>    * Memory Management: Memory is managed based on the resources in
> play -- the usage is computed by summing of the space allocated to  
> each
> of the resources' limiters. This strategy intentionally avoids a
> centralized memory limit which leads to complicated logic to track  
> when
> a centralized limiter needs to be decremented and avoids contention
> between multiple resources/threads accessing the limiter and also
> reduces the potential for memory limiter related deadlocks. However,  
> it
> should be noted that this approach doesn't preclude implementing
> centralized limiters in the future.
>    * Flow Control: As messages propagate from one resource A to  
> another
> B, then if A overflows B's limit, B will block A and A can't release
> it's limiter space until B unblocks it. This allowance for overflow  
> into
> downstream resources is a key concept in flow control performance and
> ease of use. Provided that the upstream resource has already accounted
> for the message's memory it can freely overflow any downstream limiter
> providing it reserves space from elements that caused overflow.
>    * Threading Model: Note that as a message propagates from A to B,
> that the general contract is that A won't release it's memory if B
> blocks it during the course of dispatch. This means that it is not  
> safe
> to perform a thread handoff during dispatch between two resources  
> since
> the thread dispatching A relies on the message making it to B (so  
> that B
> can block it) prior to A completing dispatch.
>    * Management/Visibility: Another intended use of the activemq-flow
> module is to assist in visibility e.g. provide an underlying map of
> resources that can be exposed via tooling to see the relationships
> between sources and sinks of messages and to find bottlenecks ... this
> aspect has been downplayed for now as we have been focusing more on  
> the
> queueing/memory management model in the prototype, but eventually the
> flow package itself will provide a handy way of providing visibility  
> in
> the system particularly in terms of finding performance bottlenecks.
> FlowResource (FlowSink and FlowSource): A container for  
> FlowControllers
> providing some lifecycle related logic. The base resource class  
> handles
> interaction/registration with the FlowManager (below).
> FlowManager: Registry for Flow's and FlowResources. The manager will
> provide some hooks into system visibility. As mentioned above this
> aspect has been downplayed somewhat for the present time.
> FlowController: Wraps a FlowLimiter and actually implements common  
> basic
> block/resume logic between FlowControllers.
> FlowLimiter: Defines the limits enforced by a FlowController.  
> Currently
> the package has size based limiter implementations, but eventually
> should also support other common limiter types such as rate based
> limiters. The limiter's are also extended at other points in the  
> broker
> (for example implementing a protocol based WindowLimiter). It is also
> likely that we would want to introduce CompositeLimiters to combine
> various limiter types.
> Flow: The concept of a flow is not used very heavily right now. But a
> Flow defines the stream of elements that can be blocked. In general  
> the
> prototype creates a single flow per resource, but in the future a  
> source
> may break it's elements down into more granular flows on which
> downstream sinks may block it. One case where this is anticipated as
> being useful is in networks of brokers where-in it may be desirable to
> partition messages into more granular flows (e.g based on producer or
> destination) to avoid blocking the broker-broker connection
> uncessarily).

View raw message