openwhisk-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Markus Thoemmes" <markus.thoem...@de.ibm.com>
Subject Re: Proposal on a future architecture of OpenWhisk
Date Wed, 18 Jul 2018 17:16:37 GMT
Hi Dominic,

thanks for your feedback, let's see...

>1. Buffering of activation and failure handling.
>
>As of now, Kafka acts as a kind of buffer in case activation
>processing is
>a bit delayed due to some reasons such as invoker failure.
>If Kafka is only used for the overflowing case, how can it guarantee
>"at
>least once" activation processing?
>For example, if a controller receives the requests and it crashed
>before
>the activation is complete. How can other alive controllers or the
>restored
>controller handle it?

In the proposal, the semantics of talking to the container do not change from what we have
today. If the http request fails for whatever reason while "in-flight", processing cannot
be completed, just like if an invoker crashes. Note that we commit the message immediatly
after reading it today, which leads to "at-most-once" semantics. OpenWhisk does not support
"at-least-once" today. To do so, you'd retry from the outside. 

>2. A bottleneck in ContainerManager.
>
>Now ContainerManage has many critical logics.
>It takes care of the container lifecycle, logging, scheduling and so
>on.
>Also, it should be aware of whole container state as well as
>controller
>status and distribute containers among alive controllers.
>It might need to do some health checking to/from all controllers and
>containers(may be orchestrator such as k8s).
>
>I think in this case ContainerManager can be a bottleneck as the size
>of
>the cluster grows.
>And if we add more nodes to scale out ContainerManager or to prepare
>for
>the SPOF, then all states of ContainerManager should be shared among
>all
>nodes.
>If we take master/slave approach, the master would become a
>bottleneck at
>some point, and if we take a clustering approach, we need some
>mechanism to
>synchronize the cluster status among ContainerManagers.
>And this procedure should be done in 1 or 2 order of magnitude in
>milliseconds.
>
>Do you have anything in your mind to handle this?

I don't believe the ContainerManager needs to do that much honestly. In the Kubernetes case
for instance it only asks the Kube API for pods and then keeps a list of these pods per action.
Further it divides this list whenever a new container is added/removed. I think we can push
this quite far in a master/slave fashion as Brendan mentioned. This is guessing though, it'll
be crucial to measure the throughput that one instance actually can provide and then decide
on whether that's feasible or not.

As its state isn't moving at a super fast pace, we can probably afford to persist it into
something like redis or etcd for the failover to take over if one dies.

Of course I'm very open for scaling it out horizontally if that's achievable.

>3. Imbalance among controllers.
>
>I think there could be some imbalance among controllers.
>For example, there are 3 controllers with 3, 1, and 1 containers
>respectively for the given action.
>In some case, 1 containers in the controller1 might be overloaded but
>2
>containers in controller2 can be available.
>If the number of containers for the given action belongs to each
>controller
>varies, it could happen more easily.
>This is because controllers are not aware of the status of other
>controllers.
>So in some case, some action containers are overloaded but the others
>may
>handle just moderate requests.
>Then each controller may request more containers instead of utilizing
>existing(but in other controllers) containers, and this can lead to
>the
>waste of resources.

Assuming general round-robin (or even random scheduling) in front of the controllers would
even out things to a certain extent would they?

Another probably feasible solution is to implement session stickiness or hashing as you mentioned
in todays call. Another comment that was raised during today's would come into play there
as well: We could change the container division algorithm to not divide evenly but to only
give containers to those controllers that requested them. In conjunction with session stickiness,
that could yield better load distribution results (given the session stickiness is smart enough
to divide appropriately.

>4. How do controllers determine whether to create more containers?
>
>Let's say, a controller has only one container for the given action.
>How controllers recognize this container is overloaded and need more
>containers to create?
>If the action execution time is short, it can calculate the number of
>buffered activation for the given action.
>But the action execution time is long, let's say 1 min or 2 mins,
>then even
>though there is only 1 activation request in the buffer, the
>controller
>needs to create more containers.
>(Because subsequent activation request will be delayed for 1 or
>2mins.)
>Since we cannot know the execution time of action in advance, we may
>need a
>sort of timeout(of activation response) approach for all actions.
>But still, we cannot know how much time of execution are remaining
>for the
>given action after the timeout occurred.
>Further, if a user requests 100 or 200 concurrent invocations with a
>2
>mins-long action, all subsequent requests will suffer from the
>latency
>overhead of timeout.

Overload of a given container is determined by its concurrency limit. Today that is "1". If
1 request is active in a container, and that is the only container, we need more containers.
As soon as all containers reached their maximum concurrency, we need to scale up.

We do the same thing today I believe.

Does that answer your questions? (Sorry for the broken quote layout, my mail client screws
these up)

Cheers,
Markus


Mime
View raw message