brooklyn-dev mailing list archives

From Svetoslav Neykov <svetoslav.ney...@cloudsoftcorp.com>
Subject [PROPOSAL] Controlling effectors concurrency
Date Wed, 11 Jan 2017 12:08:21 GMT
## Problem

Brooklyn currently executes effectors in parallel, without regard for already running instances
of the same effector. This makes certain classes of YAML blueprints harder to write, namely those
that need to limit the number of concurrent executions. Currently this is worked around on a
per-blueprint basis, shifting the burden of synchronizing/locking to the blueprint, which has
limited means to do it.

Some concrete examples:
  * A haproxy blueprint which needs to have at most one "update configuration" effector running - solved in bash by using flock
    https://github.com/brooklyncentral/clocker/blob/9d3487198f426e8ebc6efeee94af3dc50383fa71/common/catalog/common/haproxy.bom
  * Some clusters have a limit on how many members can join at a time (Cassandra notably)
  * A DNS blueprint needs to make sure that updates to the records happen sequentially so no records get lost
  * To avoid API rate limits in certain services we need to limit how many operations we do at any moment - say we want to limit provisioning of entities, but not installing/launching them.

A first step in solving the above has been made in https://github.com/apache/brooklyn-server/pull/443
which adds "maxConcurrentChildCommands" to the DynamicCluster operations (start, resize, stop).
This allows us to limit how many entities get created/destroyed by the cluster in parallel.
The goal of this proposal is to extend it by making it possible to apply finer-grained limits
(say just on the launch step of the start effector) and to make it more general (not just
start/stop in cluster but any effector).

## Proposed solution

Add functionality which allows external code (e.g. adjuncts) to plug into the lifecycle of
entities **synchronously** and influence their behaviour. This will allow us to influence
the execution of effectors on entities and, for this particular proposal, to block execution
until some condition is met.

## Possible approaches (alternatives)

### Effector execution notifications

Provide the functionality to subscribe callbacks to be called when an effector is about to
execute on an entity. The callback has the ability to mutate the effector, for example by
adding a wrapper task to ensure certain concurrency limits. A simpler alternative would be
to add pre and post execution callbacks. For this to be useful we need to split big effectors
into smaller pieces. For example, the start effector would be a composition of provision, install,
customize and launch effectors.
The reason not to work at the task level is that tasks are anonymous so we can't really subscribe
to them. To do that we'd need to add identifiers to them which essentially turns them into
effectors.
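
To make the wrapping idea a bit more concrete, here is a minimal sketch in plain Java. It does not
use any Brooklyn API: `EffectorInterceptor` and `ConcurrencyLimitInterceptor` are hypothetical names,
and the effector body is modelled as a simple `Supplier`. It only shows the shape of a callback that
is handed the effector before it runs and returns a wrapped version which respects a concurrency limit.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical callback, invoked just before an effector runs; it may return a wrapped body. */
interface EffectorInterceptor {
    <T> Supplier<T> beforeExecute(String effectorName, Supplier<T> body);
}

/** Wraps one named effector so that at most `permits` bodies run at the same time. */
final class ConcurrencyLimitInterceptor implements EffectorInterceptor {
    private final String targetEffector;
    private final Semaphore semaphore;

    ConcurrencyLimitInterceptor(String targetEffector, int permits) {
        this.targetEffector = targetEffector;
        this.semaphore = new Semaphore(permits);
    }

    @Override
    public <T> Supplier<T> beforeExecute(String effectorName, Supplier<T> body) {
        if (!targetEffector.equals(effectorName)) {
            return body; // other effectors are left untouched
        }
        return () -> {
            semaphore.acquireUninterruptibly(); // block until a slot is free
            try {
                return body.get();
            } finally {
                semaphore.release(); // always free the slot, even if the body throws
            }
        };
    }

    /** Number of free slots; exposed only so the sketch can be inspected. */
    int availablePermits() {
        return semaphore.availablePermits();
    }
}
```

With `new ConcurrencyLimitInterceptor("launch", 1)` the launch effector would run one at a time,
while provision, install and customize would stay unrestricted.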

### Add hooks to the existing effectors

We could add fixed pre and post hooks to the start/stop effectors which execute callbacks
synchronously at key points around tasks.
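
A rough sketch of what fixed hook points could look like, again in plain Java rather than against
the real start/stop effectors. `LifecycleHook` and `StartEffectorSketch` are made-up names, and the
stage list is an assumption based on the provision/install/customize/launch split. The point is only
that hooks run synchronously, so a `pre` hook that blocks holds up the stage it guards.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hook contract: both methods are called synchronously by the effector. */
interface LifecycleHook {
    void pre(String stage);
    void post(String stage);
}

/** Toy stand-in for a start effector with fixed hook points around each stage. */
final class StartEffectorSketch {
    // Assumed stage names, following the provision/install/customize/launch split.
    private static final List<String> STAGES =
            List.of("provision", "install", "customize", "launch");

    private final List<LifecycleHook> hooks = new ArrayList<>();
    private final List<String> trace = new ArrayList<>();

    void addHook(LifecycleHook hook) {
        hooks.add(hook);
    }

    /** Runs every stage in order, calling the hooks before and after each one. */
    List<String> start() {
        for (String stage : STAGES) {
            for (LifecycleHook hook : hooks) hook.pre(stage);  // a hook may block here
            trace.add("run:" + stage);                         // the real work would go here
            for (LifecycleHook hook : hooks) hook.post(stage);
        }
        return trace;
    }

    /** Registers a hook that only reacts to "launch" and returns the resulting trace. */
    static List<String> demo() {
        StartEffectorSketch effector = new StartEffectorSketch();
        effector.addHook(new LifecycleHook() {
            @Override public void pre(String stage) {
                if (stage.equals("launch")) effector.trace.add("pre:launch");
            }
            @Override public void post(String stage) {
                if (stage.equals("launch")) effector.trace.add("post:launch");
            }
        });
        return effector.start();
    }
}
```

This sketch does not call `post` when a stage fails; a real implementation would need to decide
whether post hooks run on failure.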

--

Both of the above will allow us to plug additional logic into the lifecycle of entities, making
it possible to block execution. For clusters we'd plug into the members' lifecycle and provide
cluster-wide limits (say a semaphore shared by the members). For more complex scenarios we
could name the synchronising entity explicitly, for example to block execution until a step
in a separate entity is complete (say registering DNS records after provisioning but before
launch application-wide).
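
To illustrate the cluster-wide limit, here is a small self-contained simulation, assuming nothing
about Brooklyn itself: members are plain threads, `ClusterLaunchLimitSketch` is a hypothetical name,
and the "launch" step is a short sleep. All members share one `java.util.concurrent.Semaphore`, so
provisioning could proceed in parallel while only a bounded number launch at once.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ClusterLaunchLimitSketch {

    /** Starts `members` simulated members and returns the peak number launching at once. */
    static int peakConcurrentLaunches(int members, int permits) {
        Semaphore clusterWide = new Semaphore(permits); // shared by all members
        AtomicInteger launching = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(members);

        for (int i = 0; i < members; i++) {
            pool.submit(() -> {
                // Provisioning would happen here, without any limit.
                clusterWide.acquireUninterruptibly();
                try {
                    int now = launching.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max); // remember the highest value seen
                    sleepQuietly(10);                      // stands in for the launch step
                } finally {
                    launching.decrementAndGet();
                    clusterWide.release();
                }
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Calling `peakConcurrentLaunches(8, 2)` never reports more than 2, whatever the thread scheduling,
because the counter is only incremented while a permit is held.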

## Examples

Here are some concrete examples which give you a taste of what it would look like (thanks
Geoff for sharing these)


### Limit the number of entities starting at any moment in the cluster (but provision them in parallel)
services:
- type: cluster
  brooklyn.enrichers:
  # plugs into the callbacks provided by the lifecycle and limits how many tasks can execute
  # in parallel after provisioning the machines
  # by convention concurrency is counted down at the last stage if not explicitly defined
  - type: org.apache.brooklyn.enricher.stock.LimitGroupTasksSemaphore
    brooklyn.config:
      stage: post.provisioning
      # auto means the whole cluster; could also be an integer, e.g. 10 for 10-at-a-time
      parallel.operation.size: auto
  brooklyn.config:
    initialSize: 50
    memberSpec:
      $brooklyn:entitySpec:
        type: cluster-member



---


### Use a third entity to control the concurrency
brooklyn.catalog:
  items:
  - id: provisionBeforeInstallCluster
    version: 1.0.0
    item:
      type: cluster
      id: cluster
      brooklyn.parameters:
      - name: initial.cluster.size
        description: Initial Cluster Size
        default: 50
      brooklyn.config:
        initialSize: $brooklyn:config("initial.cluster.size")
        memberSpec:
          $brooklyn:entitySpec:
            type: cluster-member
            brooklyn.enrichers:
            - type: org.apache.brooklyn.enricher.stock.AquirePermissionToProceed
              brooklyn.config:
                stage: post.provisioning
                # Delegate the concurrency decisions to the referee entity
                authorisor: $brooklyn:entity("referee")
      brooklyn.children:
      - type: org.apache.brooklyn.entity.TaskRegulationSemaphore
        id: referee
        brooklyn.config:
          initial.value: $brooklyn:entity("cluster").config("initial.cluster.size") # or 1 for sequential execution


---

Some thoughts from Alex from previous discussions on what it would look like in YOML with initd-style
effectors:

I’d like to have a semaphore on normal nodes cluster and for the "launch"
step each node acquires that semaphore, releasing when confirmed joined.  I could see a task
you set in yaml eg if using the initdish idea

035-pre-launch-get-semaphore: { acquire-semaphore: { scope: $brooklyn:parent(), name: "node-launch" } }
040-launch: { ssh: "service cassandra start" }
045-confirm-service-up: { wait: { sensor: service.inCluster, timeout: 20m } }
050-finish-release-semaphore: semaphore-release

tasks of type "acquire-semaphore" would use (create if needed)
a named semaphore against the given entity … but somehow we need to say when it should automatically
be released (eg on failure) in addition to explicit release (the "050"
which assumes some scope, not sure how/if to implement that)
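
One possible way to think about the release-on-failure question, sketched in plain Java and not
tied to YOML or to any existing Brooklyn class: treat acquire and release as a scope around the
guarded steps, so the permit is given back in a `finally` block whether the steps succeed or throw.
`NamedSemaphores`, `withPermit` and the `scope/name` key format are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical registry of named semaphores, created on first use. */
final class NamedSemaphores {
    private final Map<String, Semaphore> byKey = new ConcurrentHashMap<>();
    private final int permits;

    NamedSemaphores(int permits) {
        this.permits = permits;
    }

    /** Looks up the semaphore for the given scope and name, creating it if needed. */
    private Semaphore get(String scope, String name) {
        return byKey.computeIfAbsent(scope + "/" + name, key -> new Semaphore(permits));
    }

    /** Runs the guarded steps while holding a permit; the permit is released on success or failure. */
    <T> T withPermit(String scope, String name, Supplier<T> steps) {
        Semaphore semaphore = get(scope, name);
        semaphore.acquireUninterruptibly();
        try {
            return steps.get();
        } finally {
            semaphore.release();
        }
    }

    /** Free permits for a given semaphore; exposed only so the sketch can be inspected. */
    int available(String scope, String name) {
        return get(scope, name).availablePermits();
    }
}
```

In terms of the steps above, 040 and 045 would be the guarded steps, and 035/050 would collapse into
the scope itself. Whether that scoping can be expressed in the initd-style numbering is still open.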

---

Thanks to Geoff who shared his thoughts on the subject, with this post based on them.

Svet.




