openwhisk-dev mailing list archives

From Tyson Norris <tnor...@adobe.com.INVALID>
Subject Re: Enablement of controller clustering
Date Mon, 25 Sep 2017 19:12:39 GMT
FWIW I tested the current impl for #2531 and can reproduce a cluster failure by:
- starting the controllers with defaults (2 instances, clustering enabled; the cluster is created,
good so far)
- killing 1 controller instance (e.g. controller1); controller0 starts to report, indefinitely:
"Association with remote system [akka.tcp://controller-actor-system@192.168.99.100:8001] has
failed, address is now gated for [5000] ms."
- updating the ansible scripts so that 2 new controller instances will start, using different ports
- starting the controllers again (launching controller2 and controller3); controller0 reports:
"Leader can currently not perform its duties, reachability status: [akka.tcp://controller-actor-system@192.168.99.100:8000
-> akka.tcp://controller-actor-system@192.168.99.100:8001: Unreachable [Unreachable] (1)]"
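
As an aside, the [5000] ms gating interval in that log line is akka remoting's
retry-gate-closed-for setting. A minimal Scala sketch (not OpenWhisk code, just to show
where the number comes from and how it could be overridden):

    import akka.actor.ActorSystem
    import com.typesafe.config.ConfigFactory

    object GateConfigSketch extends App {
      // Raise the gating interval reported in the log above (the default is 5 s).
      val config = ConfigFactory.parseString(
        "akka.remote.retry-gate-closed-for = 10 s"
      ).withFallback(ConfigFactory.load())

      val system = ActorSystem("controller-actor-system", config)
      println(system.settings.config.getString("akka.remote.retry-gate-closed-for"))
      system.terminate()
    }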


As far as I know this is expected behavior for akka clustering. Per the docs: "When a member
is considered by the failure detector to be unreachable the leader is not allowed to perform
its duties, such as changing status of new joining members to 'Up'."
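
To make that concrete, a minimal Scala sketch (not OpenWhisk code) of inspecting the
cluster state via akka's Cluster extension; as long as unreachable is non-empty, the
leader is stuck:

    import akka.actor.ActorSystem
    import akka.cluster.Cluster
    import com.typesafe.config.ConfigFactory

    object ClusterStateSketch extends App {
      // Self-contained single-node cluster config (addresses are illustrative).
      val config = ConfigFactory.parseString("""
        akka.actor.provider = cluster
        akka.remote.netty.tcp.hostname = "127.0.0.1"
        akka.remote.netty.tcp.port = 2551
        akka.cluster.seed-nodes = ["akka.tcp://controller-actor-system@127.0.0.1:2551"]
      """)
      val system = ActorSystem("controller-actor-system", config)
      val state = Cluster(system).state

      // While `unreachable` is non-empty, the leader cannot move joining
      // members to Up, which is exactly the condition in the log line above.
      println(s"leader:      ${state.leader}")
      state.members.foreach(m => println(s"member:      ${m.address} ${m.status}"))
      state.unreachable.foreach(m => println(s"unreachable: ${m.address}"))
      system.terminate()
    }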

There is no argument that NGINX will stop sending traffic there; the problem is that the state
of the cluster at this point is unrecoverable if (and only if) the new controller instances
are not started with the same IP/PORT.

Ideally, this is a good reason to make the behavior in these cases extensible, so that "dynamic
clusters" can attempt to do the right thing by a) leveraging some external source of truth
about the state of cluster nodes and b) transitioning the cluster nodes that are deemed "down"
to the proper state so that akka clustering can resume. (Or, alternatively, use a separate
dedicated set of seed nodes, as Brendan mentioned.)
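
For illustration, a hedged sketch of what a) plus b) could look like; the external source
of truth is a made-up placeholder here, not an existing API, but Cluster.down() is the
real akka call that moves an unreachable member to Down:

    import akka.actor.{ActorSystem, Address}
    import akka.cluster.Cluster

    object ExternalDowningSketch {
      // `externallyKnownDead` would come from whatever actually knows which
      // containers are gone (mesos, kube, ansible inventory, ...).
      def reconcile(system: ActorSystem, externallyKnownDead: Set[Address]): Unit = {
        val cluster = Cluster(system)
        cluster.state.unreachable
          .map(_.address)
          .filter(externallyKnownDead.contains)
          .foreach(cluster.down) // mark Down so the leader can remove it and resume
      }
    }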

In the case that this extensibility doesn't exist yet (that's fine, it can be added later),
I would recommend that this behavior be carefully documented, since AFAIK this is the first
time the clustering system is enabled, and it requires specialized handling of IP and PORT
assignments (for the controller) that was not previously necessary.

This will of course be more problematic for anyone not using ansible to deploy controllers,
but it has nothing to do directly with mesos or kube; it is purely an aspect of akka clustering
that can go badly if the clustering feature is enabled while IP/PORT is not maintained across
restarts or cluster membership changes.
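
For context on why this matters: akka cluster identifies a member by its full
akka.tcp://system@host:port address, and the seed-node configuration pins exactly those
addresses. A sketch using the addresses from the log above (illustrative, not the actual
OpenWhisk defaults):

    import com.typesafe.config.ConfigFactory

    object SeedNodeConfigSketch {
      // If controller1 comes back on a different IP or port, the old
      // 192.168.99.100:8001 entry matches no live node and lingers as an
      // unreachable member that the leader keeps trying to gossip with.
      val config = ConfigFactory.parseString("""
        akka.actor.provider = cluster
        akka.remote.netty.tcp.hostname = "192.168.99.100"
        akka.remote.netty.tcp.port = 8000
        akka.cluster.seed-nodes = [
          "akka.tcp://controller-actor-system@192.168.99.100:8000",
          "akka.tcp://controller-actor-system@192.168.99.100:8001"
        ]
      """)
    }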

I know there are ways to solve this with various deployments, but as Ben mentioned, and I agree,
it would be good not to require this stability in hostname/port, at least for the controller
(and invoker), and to aim instead at getting them to behave correctly without it.

If there are questions on this, maybe we can discuss on the call tomorrow; otherwise I am happy
to join a call, but my mornings are already booked this week, so my earliest availability is Monday.

Thanks
Tyson

> On Sep 25, 2017, at 4:47 AM, Vadim Raskin <raskinvadim@gmail.com> wrote:
> 
> I think that the main concern here is "how do we down controller cluster
> nodes given different deployment models". I like Brendan's suggestion of
> setting up a list of static DNS entries. Or we could configure a single
> seed node that resolves to nginx (or its static IP), which would dispatch
> requests to the available controller nodes. This would mitigate the case
> where a failing node is reachable from neither nginx nor the other cluster
> nodes, e.g. when the JVM is killed: the load balancer (e.g. nginx) won't
> be able to reach it, so it will be marked as failing and no requests will
> be dispatched to it for the next 30s (the current config setup). In this
> case it would also make sense to configure akka's auto-downing feature,
> which would make sure a restarted node joins the right controller node,
> e.g. it won't join itself.
> I'm considering this from the ansible perspective; you could probably say
> more about the mesos/kube approach. I think we need to discuss the current
> approach and define how we are going to tackle the edge cases.
> 
> If you think that there are some other edge cases, or that we need more
> clarity on the current clustering configuration, I'd be happy to set up a
> call to discuss it.
> I think that Tyson, Brendan, Dragos, Markus, and Rodric would be good
> candidates for this call; however, others are also welcome.
> 
> What do you think?
> 
> regards, Vadim.
> 
> 
> On Fri, Sep 22, 2017 at 8:11 PM Ben Browning <bbrownin@redhat.com> wrote:
> 
>> In Kubernetes and OpenShift, we'd use StatefulSets to give stable hostnames
>> for the controllers (or at least controller seed nodes). The IPs may change
>> when a node dies and gets replaced, but the hostnames would be stable as
>> controller-0.xyz, controller-1.xyz, controller-2.xyz, etc.
>> 
>> It would be ideal if we didn't need stable hostnames or IPs, but I believe
>> CouchDB, Zookeeper and Kafka will have to be treated similarly for their
>> underlying clustering mechanisms to work as expected.
>> 
>> Ben
>> 
>> 
>> On Fri, Sep 22, 2017 at 12:53 PM, Tyson Norris <tnorris@adobe.com.invalid> wrote:
>> 
>>> Thanks Vadim!
>>> 
>>> A couple comments:
>>> - just to be clear: this is leveraging Akka Clustering (not just Akka
>>> Remoting)
>>> - I’m interested to hear if "deployment models where controller
>>> container’s IP changes upon the restart” is actually an edge case (it is
>>> not for us)
>>> - I'm not an Akka or Akka Cluster expert, but we've been testing Akka
>>> clustering (separate from OW) and had problems in these cases due to
>>> dynamic IPs, where it required logic to explicitly down the nodes to
>>> return to normal operation after a failure (would like to hear from any
>>> Akka/Cluster experts on this topic!)
>>> 
>>> IMHO, this is often NOT an edge case, and as such, until the impl is more
>>> flexible (allowing control over how seed nodes are defined and how downing
>>> is handled), the default should be to NOT enable this.
>>> 
>>> For example, in mesos, we cannot predict the IP address of the
>>> controller at restart, so this will lead to an unreachable-nodes list
>>> that is never cleared without manual intervention.
>>> 
>>> I mentioned this would be OK as a first step (requiring manual
>>> intervention), but I think the default should be to disable this
>>> clustering until it can be handled for various deployment scenarios. In
>>> the meantime, if people do want to enable it for the "dynamic IP"
>>> scenario, there needs to be documentation indicating exactly what steps
>>> need to be taken to handle downing, and what the risks are of NOT doing
>>> so.
>>> 
>>> Of course this could be seen as "just a matter of defaults", so it's not
>>> technically a big difference to enable it by default (vs. disabled), but
>>> I would err on the side that will produce the best results for more
>>> operators.
>>> 
>>> WDYT?
>>> 
>>> Thanks
>>> Tyson
>>> 
>>> On Sep 22, 2017, at 9:00 AM, Vadim Raskin <raskinvadim@gmail.com> wrote:
>>>> 
>>>> Hi everyone,
>>>> (sorry if dup, had some issues with mail delivery)
>>>> 
>>>> just wanted to give a small introduction to a piece of work which is
>>>> currently ongoing in the field of controller scale-out. In order to
>>>> enable several active controller instances running simultaneously we
>>>> introduce controller clustering, whose main purpose is to share the
>>>> controllers' bookkeeping information, e.g. activations per invoker and
>>>> activations per namespace. Under the hood we use Akka Remoting, which
>>>> showed good behaviour with no regression in our test environments. The
>>>> introduction of this feature alone should not change the external
>>>> behaviour of controllers unless routing to more than one controller is
>>>> explicitly enabled.
>>>> 
>>>> The next recommended steps after the clustering goes into master:
>>>> - keep two controllers deployed as before in an active-passive mode
>>>> with clustering enabled; let the controllers replicate their data while
>>>> collecting operational experience.
>>>> - scale out the number of controller nodes and enable active-active
>>>> mode in the upfront loadbalancer.
>>>> 
>>>> A couple of things to keep in mind:
>>>> * this change comes with a feature toggle, which means you can easily
>>>> turn off clustering by setting controllerLocalBookkeeping in your
>>>> deployment. This is more appropriate for the first phase, when only one
>>>> controller is active.
>>>> * there could be certain edge cases where clustering requires special
>>>> treatment, namely deployment models where the controller container's IP
>>>> changes upon restart. Say one controller has failed and rejoined the
>>>> cluster as a new member: there will be some garbage accumulated in the
>>>> list of cluster members. It is not harmful per se, e.g. the cluster is
>>>> still running; however, healthy cluster nodes will still be gossiping
>>>> with a non-existing container. If assigning static IP addresses is not
>>>> an option, this case can be avoided with the auto-downing feature of
>>>> akka cluster, which allows the cluster leader to mark the failing node
>>>> as down and remove it from the cluster. To prevent cluster partitioning
>>>> due to several leaders, this property must be set to a relatively high
>>>> value. The exact value is not deterministic and could be defined based
>>>> on further ops experience.
>>>> 
>>>> If you have any feedback regarding this change, you could respond in
>>>> this thread, ping me on slack or comment in this PR:
>>>> https://github.com/apache/incubator-openwhisk/pull/2531
>>>> 
>>>> regards, Vadim Raskin.
>>> 
>>> 
>> 
