openwhisk-dev mailing list archives

From "Markus Thoemmes" <markus.thoem...@de.ibm.com>
Subject Re: Proposal on a future architecture of OpenWhisk
Date Thu, 19 Jul 2018 12:36:20 GMT
Hi Chetan,

>Currently one aspect which is not clear is does Controller has access to
>
>1. Pool of prewarm containers - Container of base image where /init is
>yet not done. So these containers can then be initialized within Controller
>2. OR Pool of warm container bound to specific user+action. These
>containers would possibly have been initialized by ContainerManager
>and then it allocates them to controller.

The latter case is what I had in mind. The Controller only knows about containers that are
already initialized, i.e. ready to have /run called on them.

Pre-warm containers are an implementation detail hidden from the Controller. The ContainerManager
can keep them around to be able to answer demand for specific resources more quickly, but the
Controller doesn't care. It only knows warm containers.

>Can you elaborate this bit more i.e. how scale up logic would work and
>is asynchronous?
>
>I think above aspect (type of pool) would have bearing on scale up
>logic. If an action was not in use so far then when first request
>comes (i.e. 0-1 scale up case) would Controller ask ContainerManager
>for specific action container and then wait for its setup and then
>execute it. OR if it has a generic pool then it takes one and
>initializes it and use it. And if its not done synchronously then
>would such an action be put to overflow queue.

In this specific example, the Controller will request a container from the ContainerManager
and buffer the request until it finally has capacity to execute it. All subsequent requests
for that action will be put on the same buffer, and a container will be requested for each of them.
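
To make the flow above concrete, here is a minimal in-memory sketch. It is an illustration only,
not OpenWhisk code: the names Controller.invoke, Controller.container_ready and
ContainerManagerStub are hypothetical, and handing a container back after /run is left out.

```python
from collections import defaultdict, deque


class ContainerManagerStub:
    """Hypothetical stand-in: only records which actions a container was requested for."""

    def __init__(self):
        self.requested = []

    def request_container(self, action):
        self.requested.append(action)


class Controller:
    """Hypothetical Controller that only ever sees warm (ready-to-/run) containers."""

    def __init__(self, manager):
        self.manager = manager
        self.buffers = defaultdict(deque)  # action -> requests waiting for capacity
        self.warm = defaultdict(list)      # action -> idle warm containers
        self.executed = []                 # (container, request) pairs, for illustration

    def invoke(self, action, request):
        if self.warm[action]:
            # Capacity exists: run immediately on a warm container.
            self._run(self.warm[action].pop(), request)
        else:
            # No capacity: buffer the request and ask for one container per request.
            self.buffers[action].append(request)
            self.manager.request_container(action)

    def container_ready(self, action, container):
        # Called once the ContainerManager hands over an initialized container.
        if self.buffers[action]:
            self._run(container, self.buffers[action].popleft())
        else:
            self.warm[action].append(container)

    def _run(self, container, request):
        # Placeholder for calling /run on the container.
        self.executed.append((container, request))
```

In this sketch two invocations of a cold action lead to two container requests, and each
buffered request is executed as soon as one of those containers arrives.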

Whether we put this buffer in an overflow queue (i.e. persist it) remains to be decided. If
we keep it in memory, we have roughly the same guarantees as today. As Rodric mentioned though,
we can improve certain failure scenarios (like waiting for a container in this case) by making
this buffer more persistent. I'm deliberately not mentioning Kafka here, because in this case
any persistent buffer is just fine.

Also note that this buffer is not necessarily the same thing as the overflow queue. The overflow
queue is used for arbitrary requests once the ContainerManager cannot create more resources and
thus requests need to wait.
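
The distinction between the two waiting places can be summed up as a small decision function.
This is only a sketch of my reading of it: Route, route and both boolean parameters are made-up
names, and whether the two are kept separate at all is still open.

```python
from enum import Enum


class Route(Enum):
    RUN_NOW = "run on a warm container"
    ACTION_BUFFER = "wait in the per-action buffer"
    OVERFLOW_QUEUE = "wait in the overflow queue"


def route(has_warm_container: bool, manager_can_create: bool) -> Route:
    """Decide where a request goes (hypothetical helper, not OpenWhisk code)."""
    if has_warm_container:
        # Critical path: nothing is queued at all.
        return Route.RUN_NOW
    if manager_can_create:
        # A container is on its way, so the request waits per action.
        return Route.ACTION_BUFFER
    # The ContainerManager cannot create more resources: arbitrary requests wait here.
    return Route.OVERFLOW_QUEUE
```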

The buffer I described above is a per-action "invoke me once resources are available" buffer
that could potentially be designed to be per Controller, which avoids the challenge of scaling
it out. That of course has its own downsides. For instance, a buffer that spans all
Controllers would enable work-stealing between Controllers that are missing capacity and could
mitigate some of the load imbalances that Dominic mentioned. We would then be entering the same
area that his proposal enters: the need for a queue per action.
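
As a rough sketch of what work-stealing over a shared per-action buffer could look like (purely
hypothetical names, in-memory only, and ignoring how such a buffer would actually be shared
between processes):

```python
from collections import defaultdict, deque


class SharedActionBuffer:
    """Hypothetical buffer spanning all Controllers, with one queue per action."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def put(self, action, request, origin):
        # Remember which Controller enqueued the request, for illustration only.
        self._queues[action].append((origin, request))

    def take(self, action):
        queue = self._queues[action]
        return queue.popleft() if queue else None


def on_container_ready(controller_name, action, buffer):
    """Whichever Controller gets capacity first takes the oldest waiting request."""
    item = buffer.take(action)
    if item is None:
        return None
    origin, request = item
    return {"request": request, "enqueued_by": origin, "executed_by": controller_name}
```

With a per-Controller buffer, a request enqueued by one Controller could only ever be executed
by that same Controller. Here any Controller with a fresh container for the action can pick it up.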

In conclusion, there are two questions to look at:

1. Do we need to persist an in-memory queue that waits for resources to be created by the
ContainerManager?
2. Do we need a shared queue between the Controllers to enable work-stealing in cases where
multiple Controllers wait for resources?
 
An important thing to note here: Since all of this is no longer happening on the critical
path (stuff gets put on the queue only if it needs to wait for resources anyway), we can afford
a solution that isn't as performant as Kafka might be. That could potentially open up the possibility
to use a technology more geared towards Pub/Sub, where subscribers per action are cheaper
to implement than on Kafka?

Does that make sense? Hope that helps :). Thanks for the questions!

Cheers,
Markus

