openwhisk-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dominic Kim <style9...@gmail.com>
Subject Re: Proposal on a future architecture of OpenWhisk
Date Thu, 19 Jul 2018 02:31:41 GMT
Dear Markus.
Thank you for the quick response.

> In the proposal, the semantics of talking to the container do not change
from what we have today. If the http request fails for whatever reason
while "in-flight", processing cannot be completed, just like if an invoker
crashes. Note that we commit the message immediatly after reading it today,
which leads to "at-most-once" semantics. OpenWhisk does not support
"at-least-once" today. To do so, you'd retry from the outside.

Ah yes. Now I remember I wondered why OS doesn't support "at-least-once"
semantic.
This is the question apart from the new architecture, but is this because
of the case that user can execute the non-idempotent action?
So though an invoker is failed, still action could be executed and it could
cause some side effects such as repeating the action which requires
"at-most-once" semantic more than once?


> I don't believe the ContainerManager needs to do that much honestly. In
the Kubernetes case for instance it only asks the Kube API for pods and
then keeps a list of these pods per action. Further it divides this list
whenever a new container is added/removed. I think we can push this quite
far in a master/slave fashion as Brendan mentioned. This is guessing
though, it'll be crucial to measure the throughput that one instance
actually can provide and then decide on whether that's feasible or not.
>
> As its state isn't moving at a super fast pace, we can probably afford to
persist it into something like redis or etcd for the failover to take over
if one dies.
>
> Of course I'm very open for scaling it out horizontally if that's
achievable.

Yes, reducing the requests against Docker daemon seems the right way to go.
BTW, how would long warmed containers be kept in the new architecture? Is
it a 1 or 2 order of magnitude in seconds?


> Assuming general round-robin (or even random scheduling) in front of the
controllers would even out things to a certain extent would they?
>
> Another probably feasible solution is to implement session stickiness or
hashing as you mentioned in todays call. Another comment that was raised
during today's would come into play there as well: We could change the
container division algorithm to not divide evenly but to only give
containers to those controllers that requested them. In conjunction with
session stickiness, that could yield better load distribution results
(given the session stickiness is smart enough to divide appropriately.

Yes, that could an option.
I concern it might cause some load imbalance among controller as well.
And another question comes up, how can we keep stick session for multiple
controllers and for multiple actions respectively?



> Overload of a given container is determined by its concurrency limit.
Today that is "1". If 1 request is active in a container, and that is the
only container, we need more containers. As soon as all containers reached
their maximum concurrency, we need to scale up.

In the new architecture, concurrency limit is controlled by users in a
per-action based way?
So in case a user wants to execute the long-running action, does he
configure the concurreny limit for the action?

And if concurrency limit is 1, in case action container is possessed,
wouldn't controllers request a container again and again?
And if it only allows container creation in a synchronous way(creating one
by one), couldn't it be a burden in case a user wants a huge number
of(100~200) simultaneous invocations?


Please bear with many questions.
I am also one of the advocates who want to improve and enhance OW in a such
that way.
I hope my question helps to build more delicate architecture.

Thanks
Best regards
Dominic

2018-07-19 2:16 GMT+09:00 Markus Thoemmes <markus.thoemmes@de.ibm.com>:

> Hi Dominic,
>
> thanks for your feedback, let's see...
>
> >1. Buffering of activation and failure handling.
> >
> >As of now, Kafka acts as a kind of buffer in case activation
> >processing is
> >a bit delayed due to some reasons such as invoker failure.
> >If Kafka is only used for the overflowing case, how can it guarantee
> >"at
> >least once" activation processing?
> >For example, if a controller receives the requests and it crashed
> >before
> >the activation is complete. How can other alive controllers or the
> >restored
> >controller handle it?
>
> In the proposal, the semantics of talking to the container do not change
> from what we have today. If the http request fails for whatever reason
> while "in-flight", processing cannot be completed, just like if an invoker
> crashes. Note that we commit the message immediatly after reading it today,
> which leads to "at-most-once" semantics. OpenWhisk does not support
> "at-least-once" today. To do so, you'd retry from the outside.
>
> >2. A bottleneck in ContainerManager.
> >
> >Now ContainerManage has many critical logics.
> >It takes care of the container lifecycle, logging, scheduling and so
> >on.
> >Also, it should be aware of whole container state as well as
> >controller
> >status and distribute containers among alive controllers.
> >It might need to do some health checking to/from all controllers and
> >containers(may be orchestrator such as k8s).
> >
> >I think in this case ContainerManager can be a bottleneck as the size
> >of
> >the cluster grows.
> >And if we add more nodes to scale out ContainerManager or to prepare
> >for
> >the SPOF, then all states of ContainerManager should be shared among
> >all
> >nodes.
> >If we take master/slave approach, the master would become a
> >bottleneck at
> >some point, and if we take a clustering approach, we need some
> >mechanism to
> >synchronize the cluster status among ContainerManagers.
> >And this procedure should be done in 1 or 2 order of magnitude in
> >milliseconds.
> >
> >Do you have anything in your mind to handle this?
>
> I don't believe the ContainerManager needs to do that much honestly. In
> the Kubernetes case for instance it only asks the Kube API for pods and
> then keeps a list of these pods per action. Further it divides this list
> whenever a new container is added/removed. I think we can push this quite
> far in a master/slave fashion as Brendan mentioned. This is guessing
> though, it'll be crucial to measure the throughput that one instance
> actually can provide and then decide on whether that's feasible or not.
>
> As its state isn't moving at a super fast pace, we can probably afford to
> persist it into something like redis or etcd for the failover to take over
> if one dies.
>
> Of course I'm very open for scaling it out horizontally if that's
> achievable.
>
> >3. Imbalance among controllers.
> >
> >I think there could be some imbalance among controllers.
> >For example, there are 3 controllers with 3, 1, and 1 containers
> >respectively for the given action.
> >In some case, 1 containers in the controller1 might be overloaded but
> >2
> >containers in controller2 can be available.
> >If the number of containers for the given action belongs to each
> >controller
> >varies, it could happen more easily.
> >This is because controllers are not aware of the status of other
> >controllers.
> >So in some case, some action containers are overloaded but the others
> >may
> >handle just moderate requests.
> >Then each controller may request more containers instead of utilizing
> >existing(but in other controllers) containers, and this can lead to
> >the
> >waste of resources.
>
> Assuming general round-robin (or even random scheduling) in front of the
> controllers would even out things to a certain extent would they?
>
> Another probably feasible solution is to implement session stickiness or
> hashing as you mentioned in todays call. Another comment that was raised
> during today's would come into play there as well: We could change the
> container division algorithm to not divide evenly but to only give
> containers to those controllers that requested them. In conjunction with
> session stickiness, that could yield better load distribution results
> (given the session stickiness is smart enough to divide appropriately.
>
> >4. How do controllers determine whether to create more containers?
> >
> >Let's say, a controller has only one container for the given action.
> >How controllers recognize this container is overloaded and need more
> >containers to create?
> >If the action execution time is short, it can calculate the number of
> >buffered activation for the given action.
> >But the action execution time is long, let's say 1 min or 2 mins,
> >then even
> >though there is only 1 activation request in the buffer, the
> >controller
> >needs to create more containers.
> >(Because subsequent activation request will be delayed for 1 or
> >2mins.)
> >Since we cannot know the execution time of action in advance, we may
> >need a
> >sort of timeout(of activation response) approach for all actions.
> >But still, we cannot know how much time of execution are remaining
> >for the
> >given action after the timeout occurred.
> >Further, if a user requests 100 or 200 concurrent invocations with a
> >2
> >mins-long action, all subsequent requests will suffer from the
> >latency
> >overhead of timeout.
>
> Overload of a given container is determined by its concurrency limit.
> Today that is "1". If 1 request is active in a container, and that is the
> only container, we need more containers. As soon as all containers reached
> their maximum concurrency, we need to scale up.
>
> We do the same thing today I believe.
>
> Does that answer your questions? (Sorry for the broken quote layout, my
> mail client screws these up)
>
> Cheers,
> Markus
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message