stratos-dev mailing list archives

From "Michael Hall (michaha2)" <micha...@cisco.com>
Subject Re: autoscale architecture
Date Fri, 13 Feb 2015 10:44:07 GMT
Hi Imesh,

So 'transition compensated' refers to cartridges that are transitioning between states:
SPAWNED to ACTIVE, or TERMINATING to TERMINATED.

What it really means is that the 'aggregated average' (referred to as <metric>PredictedValue
in scaling.drl) is compensated:

  1.  As if the 'spawning' cartridges are already providing resource (although they aren't yet)
  2.  As if the 'terminating' cartridges have already removed resource (although they haven't
yet)

As a result, the 'transition compensated aggregated average' is approximately what the
actual aggregated average would be if those cartridges had already become fully 'active' or
'terminated'. This means the 'transition compensated aggregated average' is always
in a sensible state for making a scaling decision.

This then allows us to make a scaling decision as often as we'd like (at intervals much shorter
than 90 seconds, possibly even every 1 second). Take the example where we've just scaled
up from N to N+1 cartridges: the 'transition compensated aggregated average' instantly adjusts
to N/(N+1) of its raw value (formula copied from my previous email below for reference), so another
scaling decision will only occur if the underlying load (aggregated average) increases even further.

transition-compensated-agg-ave = agg-ave * ( cluster-size / ( cluster-size + cluster-spawned-size
- cluster-terminating-size ) )
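
As a rough illustration of the formula above, here is a minimal Python sketch. It reads the denominator as the whole sum (cluster-size + cluster-spawned-size - cluster-terminating-size), which is what gives the N/(N+1) factor mentioned earlier. The function and parameter names are hypothetical and are not taken from the Stratos code base; cluster_size is assumed to count only the cartridges currently contributing to the average.

```python
def transition_compensated_average(agg_ave, cluster_size, spawned=0, terminating=0):
    """Scale a raw aggregated average as if all in-flight transitions had completed.

    All names here are illustrative assumptions, not Stratos identifiers.
    """
    # Size the cluster will have once spawning members are active
    # and terminating members are gone.
    effective_size = cluster_size + spawned - terminating
    if effective_size <= 0:
        raise ValueError("effective cluster size must be positive")
    return agg_ave * cluster_size / effective_size


# Four members at 80% average load, one more just spawned:
# the compensated value drops to 4/5 of the raw value.
print(transition_compensated_average(80.0, 4, spawned=1))      # 64.0

# Four members at 60% average load, one terminating:
# the compensated value rises to 4/3 of the raw value.
print(transition_compensated_average(60.0, 4, terminating=1))  # 80.0
```

With no members in transition the function returns the raw average unchanged, so it could be applied on every metric update.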

I'd be more than happy to set up a WebEx meeting to try to explain this better, or to use
another avenue of communication if you prefer.

Kind regards,

Mike

From: Imesh Gunaratne <imesh@apache.org<mailto:imesh@apache.org>>
Reply-To: "dev@stratos.apache.org<mailto:dev@stratos.apache.org>" <dev@stratos.apache.org<mailto:dev@stratos.apache.org>>
Date: Friday, 13 February 2015 01:09
To: dev <dev@stratos.apache.org<mailto:dev@stratos.apache.org>>
Subject: Re: autoscale architecture

Hi Mike,

Thanks for the detailed explanation of your question. Currently we do not have the capability
to do this at runtime for a specific cartridge. However, we could reduce the global scaling
decision interval. This needs to be configured in three locations:

1. Cartridge agent statistics publishing interval (default: 15 seconds)
2. CEP execution plan/faulty member detection interval (default: 1 min)
3. Autoscaler cluster monitor interval (default: 90 seconds)

I did not quite understand what you mean by 'transition compensated'. Could you explain
it further?

Thanks


On Fri, Feb 13, 2015 at 12:26 AM, Michael Hall (michaha2) <michaha2@cisco.com<mailto:michaha2@cisco.com>>
wrote:
Hi Dev,

Thanks for your response, Imesh. If it's OK, I'd like to skip straight to my (rather lengthy)
question:

Does the autoscaler currently have, or are there plans to introduce, a means of receiving an
asynchronous event signalling that a cartridge has gone from 'SPAWNED' to 'ACTIVE' after being
launched by a 'scale-up' decision? With such an event, the scaling decision interval could decrease
to approximately the metric update interval without multiple cartridges being spawned when only
one is needed.

In more depth:

The reason for my question is that, by knowing a cartridge is in the 'SPAWNED' or 'TERMINATING'
state, the aggregated metric averages can be 'transition compensated', i.e.
transition-compensated-agg-ave = agg-ave * ( cluster-size / ( cluster-size + cluster-spawned-size
- cluster-terminating-size ) )
This would allow scaling decisions to occur on a continuous basis, throttled only by the metric
update frequency.

It appears that scaling decisions currently occur on the order of minutes. If this became seconds,
it would vastly improve the maximum rate at which a cluster can scale up against a sudden increase
in load.

It also appears that there is no awareness of the spawning state, which means several 'redundant'
instances get spawned when instance startup time is greater than the scaling decision interval.
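
To make the redundant-spawning effect above concrete, here is a small toy simulation in Python. It is only a sketch under simplifying assumptions (constant total load, one scaling decision per tick, at most one instance spawned per decision, a fixed startup time); the function name and parameters are hypothetical and do not correspond to anything in Stratos.

```python
def count_spawns(total_load, active, threshold, startup_ticks, ticks, compensate):
    """Count instances spawned over a number of decision ticks (toy model)."""
    pending = []  # remaining startup ticks for each instance still spawning
    spawned_total = 0
    for _ in range(ticks):
        # Advance startup timers; instances that finish become active.
        pending = [t - 1 for t in pending]
        active += sum(1 for t in pending if t == 0)
        pending = [t for t in pending if t > 0]

        # Raw average only reflects members that are already active.
        average = total_load / active
        if compensate:
            # Treat spawning instances as if they were already sharing the load.
            average = average * active / (active + len(pending))

        if average > threshold:
            pending.append(startup_ticks)
            spawned_total += 1
    return spawned_total


# Load of 300 over 4 members (average 75) with a threshold of 70:
# one extra member is enough (300 / 5 = 60), but startup takes 5 ticks.
print(count_spawns(300, 4, 70, 5, 10, compensate=False))  # 5
print(count_spawns(300, 4, 70, 5, 10, compensate=True))   # 1
```

In this toy model the uncompensated loop keeps spawning until the first new instance becomes active, while the compensated loop spawns exactly the one instance that is needed.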

Finally:

Are there difficulties in tracking the 'SPAWNED' to 'ACTIVE' transition on a per-cartridge
basis? And how does this align (if it's a valid enhancement) with other potential improvements
that could be made to the autoscaler?

Regards,

Mike

From: Imesh Gunaratne <imesh@apache.org<mailto:imesh@apache.org>>
Reply-To: "dev@stratos.apache.org<mailto:dev@stratos.apache.org>" <dev@stratos.apache.org<mailto:dev@stratos.apache.org>>
Date: Thursday, 12 February 2015 18:16
To: dev <dev@stratos.apache.org<mailto:dev@stratos.apache.org>>
Subject: Re: autoscale architecture

Hi Michael,

Yes, you can ask any questions you have on autoscaling here.

I don't think we have documented the autoscaling feature in 4.1.0 at the moment. However, you can
find some information here [1]. Autoscaling has changed slightly with the Composite Application
Model.

[1] https://cwiki.apache.org/confluence/display/STRATOS/4.1.0+Autoscaler

Thanks

On Thu, Feb 12, 2015 at 9:33 PM, Michael Hall (michaha2) <michaha2@cisco.com<mailto:michaha2@cisco.com>>
wrote:
Hi Devs,

Is there a resource or contact that can help me understand the current and planned architecture
of the autoscaling feature within Stratos?

Best Regards,

Mike



--
Imesh Gunaratne

Technical Lead, WSO2
Committer & PMC Member, Apache Stratos


