activemq-dev mailing list archives

From Rob Davies <rajdav...@gmail.com>
Subject Re: ActiveMQ 6.0 Broker Core Prototype -- Flow Control / Memory Management
Date Thu, 11 Jun 2009 20:00:18 GMT

On 11 Jun 2009, at 17:06, Colin MacNaughton wrote:

> Hi Rob,
>
> In terms of configurable maximums with respect to destinations:
> Maximum memory allocation per PTP queue is supported, but for topics the
> limits are actually tied to the Subscriptions receiving the message. This
> is because these are the objects that actually map to the underlying
> cursored queues that hold the messages and do the paging/limiting. Does
> that make sense?
Makes a lot of sense from an implementation point of view - but  
doesn't translate too well for users to understand.
>
>
> In terms of overall disk/memory limits: the approach we're taking would
> not explicitly define a single overall limit (at least not initially) --
> rather the total maximum is based on the resources you create. E.g. as a
> user you need to know how many queues and subscriptions you create and
> plan for memory/disk accordingly. This doesn't preclude trying to enforce
> global limits later, but in my opinion doing so complicates the
> implementation a fair amount, in terms of trying to intelligently balance
> the available space across the queues, and also leads to additional
> contention on the shared limiter -- and worse, can lead to
> resource-related deadlocks if we get it wrong. We can still do things like
> limit the maximum number/size of subscriptions/queues/connections etc.
Again - from an implementation point of view makes a lot of sense -  
but makes it difficult for users - who will have to pre-dimension the  
broker.
>
>
> Colin
>
> -----Original Message-----
> From: Rob Davies [mailto:rajdavies@gmail.com]
> Sent: Thursday, June 11, 2009 1:12 AM
> To: dev@activemq.apache.org
> Subject: Re: ActiveMQ 6.0 Broker Core Prototype -- Flow Control / Memory Management
>
> Hi Colin,
>
> In 5.x flow control behaves as if it's binary - off or on. When it's off
> - messages can be offlined (for non-persistent messages this means
> being dumped to temporary storage) - but when it's on - the producers
> slow and stop.
> Also - there can be cases when you get a temporary slow consumer (the
> consuming app may be doing a big gc) - which means with flow control
> off - messages get dumped to disk - and then the producers may never
> slow down enough again for the consumer to catch up. Flow control is
> difficult to implement for all cases - but we should allow for
> configuration of the following:
>
> * maximum overall broker memory
> * maximum memory allocation per destination
> * maximum storage allocation
> * maximum storage allocation per destination
> * maximum temporary storage allocation
> * maximum temporary storage allocation per destination
>
> When we start to hit a resource limit - we should aggressively gc
> messages that have expired, then either offline (and flow control when
> that limit is hit) or flow control.
> It would be great to have a combined policy where we can block a
> producer for a short time (seconds) then offline.
> For non-persistent messages - we still need a policy where we can
> remove messages based on a selector (which would be in addition to
> expiring messages).
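
A minimal sketch of the combined "block for a short time, then offline" policy suggested above. Everything here is hypothetical and is not ActiveMQ API: the class name BlockThenOfflinePolicy, the byte-per-permit accounting, and the in-memory list standing in for temporary storage are assumptions made only to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in ActiveMQ.
class BlockThenOfflinePolicy {
    enum Outcome { IN_MEMORY, OFFLINED }

    private final Semaphore memoryPermits;                    // one permit per byte of budget
    private final long maxBlockMillis;                        // how long a producer may be blocked
    private final List<byte[]> inMemory = new ArrayList<>();
    private final List<byte[]> tempStore = new ArrayList<>(); // stand-in for temporary storage

    BlockThenOfflinePolicy(int memoryLimitBytes, long maxBlockMillis) {
        this.memoryPermits = new Semaphore(memoryLimitBytes);
        this.maxBlockMillis = maxBlockMillis;
    }

    // Blocks the calling producer for at most maxBlockMillis waiting for
    // memory; if none frees up in time, the message is offlined instead.
    Outcome offer(byte[] message) {
        boolean gotMemory = false;
        try {
            gotMemory = memoryPermits.tryAcquire(
                    message.length, maxBlockMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag, then offline
        }
        synchronized (this) {
            (gotMemory ? inMemory : tempStore).add(message);
        }
        return gotMemory ? Outcome.IN_MEMORY : Outcome.OFFLINED;
    }

    // Consumer side: removing an in-memory message frees its budget.
    synchronized byte[] poll() {
        if (inMemory.isEmpty()) {
            return null;
        }
        byte[] message = inMemory.remove(0);
        memoryPermits.release(message.length);
        return message;
    }

    synchronized int offlinedCount() {
        return tempStore.size();
    }
}
```

With a short blocking window a temporarily slow consumer (e.g. one in a long gc pause) only delays the producer, while a persistently slow one causes messages to be offlined. The sketch does not preserve ordering between in-memory and offlined messages, which a real policy would have to address.
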
>
> cheers,
>
> Rob
>
> On 10 Jun 2009, at 17:29, Colin MacNaughton wrote:
>
>> Hi Everyone,
>>
>> As a follow on to my e-mail last week introducing the core broker
>> prototype that Hiram and I have been working on, I wanted to spin  
>> up a
>> thread on the flow control model that we're using.
>>
>> I'd be interested to hear your thoughts on current shortcomings
>> associated with flow control / memory management in 5.3 so we can make
>> sure that the use cases are covered. Beyond that, any additional input
>> on the design or implementation would be great ... are we on the right
>> track?
>>
>> Cheers,
>> Colin
>>
>>
>> The text below is taken straight from the webgen in the project, my
>> apologies if it's a little verbose!
>>
>> As a reminder the bits can be found at:
>> https://svn.apache.org/repos/asf/activemq/sandbox/activemq-flow
>>
>> The activemq-flow package is meant to be a standalone module that deals
>> generically with Resources and Flows of elements that flow through and
>> between them. The current implementation is designed with the following
>> goals in mind:
>>
>>   * SIMPLE: Want a fairly simple and consistent model for controlling
>> flow of messages and other data in the system to control memory and
>> disk
>> space. The module must be able to handle fan-in/fan-out as well as
>> simpler 1 to 1 cases.
>>   * PERFORMANT: The flow control mechanism must be performant and
>> should not introduce much overhead in cases where downstream  
>> resources
>> are able to keep up.
>>   * MODULARIZED: The module should be independent, generic and
>> reusable.
>>   * FAIRNESS: We should be able to provide better fairness. If I've
>> got several producers putting messages on a queue, the flow controller
>> should not prefer one source over the other (unless configured to do
>> so).
>>   * VISIBILITY: With a unified model in place we can instrument it to
>> provide visibility in the product (e.g. a visual graph of flows in the
>> system), for example when a customer says that they are not using
>> PERSISTENT messages yet we see 1000 msgs/sec flowing through the
>> recovery log.
>>   * ADMINISTRATION: We can explore the possibility of administratively
>> limiting message flows. E.g. I've done my production stress testing and
>> can successfully handle my anticipated load of 4000 msgs/sec on topic1
>> ... I'd prefer to avoid the case where publishers go berserk and
>> overload my backend with messages.
>>   * POLICIES: We should be able to better instrument general flow
>> control policies. E.g. I want to tune for latency or throughput. If a
>> subscriber gets behind, I'd like the policy for messages on topic1
>> to be
>> that I drop the oldest messages instead of initiating flow control.
>>
>> The Basics:
>>
>> Each resource creates a FlowController for each of its Flows, and each
>> controller is assigned a corresponding FlowLimiter. As elements (e.g.
>> messages) pass
>> from one resource to another they are passed through the downstream
>> resource's FlowController which updates its Limiter. If propagation  
>> of
>> an element from one resource to another causes the downstream
>> limiter to
>> become throttled the associated FlowController will block the source
>> of
>> the element. The flow module is used heavily by the rest of the core
>> for
>> memory and disk management.
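
The block/resume behaviour described above can be sketched roughly as follows. This is not the code from the activemq-flow sandbox: the class names SketchSizeLimiter and SketchFlowController, their method signatures, and the use of a Runnable as the "resume this source" callback are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical size-based limiter: throttled once the tracked size
// reaches its capacity.
class SketchSizeLimiter {
    private final long capacity;
    private long size;

    SketchSizeLimiter(long capacity) {
        this.capacity = capacity;
    }

    void add(long bytes) { size += bytes; }
    void remove(long bytes) { size -= bytes; }
    boolean isThrottled() { return size >= capacity; }
    long size() { return size; }
}

// Hypothetical controller: wraps a limiter and implements block/resume.
class SketchFlowController {
    private final SketchSizeLimiter limiter;
    private final Queue<Runnable> blockedSources = new ArrayDeque<>();

    SketchFlowController(SketchSizeLimiter limiter) {
        this.limiter = limiter;
    }

    // Accepts the element (overflow is allowed) and updates the limiter.
    // Returns false if the limiter became throttled, in which case the
    // source is considered blocked until onResume is invoked.
    synchronized boolean add(long bytes, Runnable onResume) {
        limiter.add(bytes);
        if (limiter.isThrottled()) {
            blockedSources.add(onResume);
            return false;
        }
        return true;
    }

    // Called when an element leaves this resource; resumes any blocked
    // sources once the limiter is no longer throttled.
    void elementDispatched(long bytes) {
        Queue<Runnable> toResume = new ArrayDeque<>();
        synchronized (this) {
            limiter.remove(bytes);
            if (!limiter.isThrottled()) {
                toResume.addAll(blockedSources);
                blockedSources.clear();
            }
        }
        toResume.forEach(Runnable::run); // run callbacks outside the lock
    }

    synchronized long size() {
        return limiter.size();
    }
}
```

A source would call add(...) for each element it passes downstream and stop dispatching when it returns false, picking up again from its onResume callback.
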
>>
>>   * Memory Management: Memory is managed based on the resources in
>> play -- the usage is computed by summing the space allocated to each of
>> the resources' limiters. This strategy intentionally avoids a
>> centralized memory limit, which leads to complicated logic to track when
>> a centralized limiter needs to be decremented. It also avoids contention
>> between multiple resources/threads accessing the limiter and reduces the
>> potential for memory limiter related deadlocks. However, it should be
>> noted that this approach doesn't preclude implementing centralized
>> limiters in the future.
>>   * Flow Control: As messages propagate from one resource A to another
>> resource B, if A overflows B's limit, B will block A and A can't release
>> its limiter space until B unblocks it. This allowance for overflow into
>> downstream resources is a key concept in flow control performance and
>> ease of use. Provided that the upstream resource has already accounted
>> for the message's memory, it can freely overflow any downstream limiter,
>> provided it keeps space reserved for the elements that caused the
>> overflow.
>>   * Threading Model: Note that as a message propagates from A to B,
>> the general contract is that A won't release its memory if B blocks it
>> during the course of dispatch. This means that it is not safe to perform
>> a thread handoff during dispatch between two resources, since the thread
>> dispatching A relies on the message making it to B (so that B can block
>> it) prior to A completing dispatch.
>>   * Management/Visibility: Another intended use of the activemq-flow
>> module is to assist in visibility e.g. provide an underlying map of
>> resources that can be exposed via tooling to see the relationships
>> between sources and sinks of messages and to find bottlenecks ...  
>> this
>> aspect has been downplayed for now as we have been focusing more on
>> the
>> queueing/memory management model in the prototype, but eventually the
>> flow package itself will provide a handy way of providing visibility
>> in
>> the system particularly in terms of finding performance bottlenecks.
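
To make the overflow contract in the Flow Control and Threading Model items concrete, here is a single-threaded sketch of a resource A dispatching into a resource B. The class SketchResource and its methods are hypothetical and are not taken from the prototype; the sketch assumes that dispatch from A to B happens on one thread, as the threading contract above requires.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource with a size-based limit; not the prototype's API.
class SketchResource {
    private final long capacity;
    private long reserved; // bytes currently accounted to this resource
    // Sizes of elements already passed downstream whose space this
    // resource may not release until the downstream resource unblocks it.
    private final List<Long> heldForDownstream = new ArrayList<>();

    SketchResource(long capacity) {
        this.capacity = capacity;
    }

    boolean isThrottled() { return reserved >= capacity; }
    long reserved() { return reserved; }

    // Accepts an element even if that overflows the limit.
    void accept(long bytes) { reserved += bytes; }

    // Passes an element that is already accounted here to downstream.
    // If downstream becomes throttled, this resource is blocked and keeps
    // its own reservation for the element.
    void dispatchTo(SketchResource downstream, long bytes) {
        downstream.accept(bytes);
        if (downstream.isThrottled()) {
            heldForDownstream.add(bytes);
        } else {
            reserved -= bytes;
        }
    }

    // An element leaves this resource; if that ends the throttled state,
    // the upstream resource is unblocked and may release what it held.
    void drain(long bytes, SketchResource upstream) {
        reserved -= bytes;
        if (!isThrottled()) {
            upstream.releaseHeld();
        }
    }

    void releaseHeld() {
        for (long bytes : heldForDownstream) {
            reserved -= bytes;
        }
        heldForDownstream.clear();
    }
}
```

Because A keeps its reservation while blocked, A's own limiter fills up in turn and blocks whatever feeds A, so back-pressure propagates upstream without a shared, centralized limiter.
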
>>
>> FlowResource (FlowSink and FlowSource): A container for
>> FlowControllers
>> providing some lifecycle related logic. The base resource class
>> handles
>> interaction/registration with the FlowManager (below).
>>
>> FlowManager: Registry for Flows and FlowResources. The manager will
>> provide some hooks into system visibility. As mentioned above this
>> aspect has been downplayed somewhat for the present time.
>>
>> FlowController: Wraps a FlowLimiter and actually implements common
>> basic
>> block/resume logic between FlowControllers.
>>
>> FlowLimiter: Defines the limits enforced by a FlowController. Currently
>> the package has size based limiter implementations, but eventually
>> should also support other common limiter types such as rate based
>> limiters. The limiters are also extended at other points in the broker
>> (for example implementing a protocol based WindowLimiter). It is also
>> likely that we would want to introduce CompositeLimiters to combine
>> various limiter types.
>>
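
One way the rate based and composite limiters mentioned above might look is sketched below. The interface SketchLimiter and the three implementations are hypothetical names and shapes, not the prototype's FlowLimiter API. The fixed one-second window and the injected millisecond clock are further assumptions, chosen to keep the example small and deterministic.

```java
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical limiter contract; not the prototype's FlowLimiter.
interface SketchLimiter {
    void add(long bytes);
    boolean isThrottled();
}

// Size based: throttled once the accumulated bytes reach the capacity.
class SketchByteLimiter implements SketchLimiter {
    private final long capacity;
    private long size;

    SketchByteLimiter(long capacity) { this.capacity = capacity; }

    public void add(long bytes) { size += bytes; }
    public boolean isThrottled() { return size >= capacity; }
}

// Rate based: throttled once maxPerSecond elements have been added in
// the current one-second window.
class SketchRateLimiter implements SketchLimiter {
    private final long maxPerSecond;
    private final LongSupplier clockMillis;
    private long windowStart;
    private long countInWindow;

    SketchRateLimiter(long maxPerSecond, LongSupplier clockMillis) {
        this.maxPerSecond = maxPerSecond;
        this.clockMillis = clockMillis;
        this.windowStart = clockMillis.getAsLong();
    }

    public void add(long bytes) {
        rollWindow();
        countInWindow++;
    }

    public boolean isThrottled() {
        rollWindow();
        return countInWindow >= maxPerSecond;
    }

    // Starts a fresh window once a full second has elapsed.
    private void rollWindow() {
        long now = clockMillis.getAsLong();
        if (now - windowStart >= 1000) {
            windowStart = now;
            countInWindow = 0;
        }
    }
}

// Composite: throttled if any of the combined limiters is throttled.
class SketchCompositeLimiter implements SketchLimiter {
    private final List<SketchLimiter> delegates;

    SketchCompositeLimiter(List<SketchLimiter> delegates) {
        this.delegates = delegates;
    }

    public void add(long bytes) {
        delegates.forEach(d -> d.add(bytes));
    }

    public boolean isThrottled() {
        return delegates.stream().anyMatch(SketchLimiter::isThrottled);
    }
}
```

In production code the clock would typically be System::currentTimeMillis, and a sliding window or token bucket would smooth out bursts at window boundaries.
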
>> Flow: The concept of a flow is not used very heavily right now, but a
>> Flow defines the stream of elements that can be blocked. In general the
>> prototype creates a single flow per resource, but in the future a source
>> may break its elements down into more granular flows on which
>> downstream sinks may block it. One case where this is anticipated as
>> being useful is in networks of brokers, where it may be desirable to
>> partition messages into more granular flows (e.g. based on producer or
>> destination) to avoid blocking the broker-broker connection
>> unnecessarily.
>>
>>
>
>

