openwhisk-dev mailing list archives

From "Markus Thoemmes" <markus.thoem...@de.ibm.com>
Subject Re: Proposal on a future architecture of OpenWhisk
Date Fri, 20 Jul 2018 14:36:41 GMT
Hi Chetan,

>As activations may be bulky (1 MB max), it may not be possible to keep
>them in memory, even at a small incoming rate, depending on how fast
>they get consumed. Currently the use of Kafka takes the pressure off
>the Controller and helps keep it stable. So I suspect we may need to
>make use of the buffer more often to keep pressure off the Controllers,
>especially for heterogeneous loads and when the system is making full
>use of cluster capacity.

Note that we can also push buffering to the client side in this case. If we enable end-to-end streaming
of parameters/results (which my proposal facilitates greatly, though it is not impossible
even in the current architecture), you can refuse to consume the HTTP request's entity until
you are able to pass it downstream to a container. That way, nothing (or close to nothing)
is buffered in the controller's memory. In an overload scenario, where the wait time for
resources is expected to be high, this can be changed to buffer into something like an overflow
queue. In steady state, the controller should not buffer more than necessary.
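To make that concrete, here is a minimal sketch of such pass-through streaming, assuming an Akka HTTP based controller. The "/run" route and waitForWarmContainer are made-up placeholders, not existing OpenWhisk APIs; the only point is that the request entity is streamed to the container, never materialized in the controller.

import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model._
import akka.http.scaladsl.server.Directives._
import scala.concurrent.Future

object StreamingSketch {
  implicit val system: ActorSystem = ActorSystem("controller")
  import system.dispatcher

  // Placeholder: completes once the ContainerManager has a warm container for the action.
  def waitForWarmContainer(action: String): Future[Uri] = ???

  val route =
    path("run" / Segment) { action =>
      extractRequestEntity { entity =>
        complete {
          waitForWarmContainer(action).flatMap { containerUri =>
            // Forward the entity's byte stream as-is; the (up to 1 MB) payload
            // is never buffered in the controller's own memory.
            Http().singleRequest(
              HttpRequest(
                method = HttpMethods.POST,
                uri = containerUri,
                entity = HttpEntity(entity.contentType, entity.dataBytes)))
          }
        }
      }
    }
}

If resources are expected to be scarce, the same route could instead drain the entity into an overflow queue, as described above.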

To your proposal: If I understand correctly, what you are laying out relies heavily on
detaching the Invoker from the hosts where containers run (since you assume a ContainerPool
to handle a lot more containers than today). This assumption moves the general architecture
quite close to what I am proposing. Essentially, you propose a load-balancing algorithm in
front of the controllers (in my picture), since these own their respective ContainerPool (so
to speak).
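A toy illustration of that placement idea, assuming actions are pinned to controllers by hashing the fully qualified action name (the real placement logic, and any rebalancing on controller failure, would of course be more involved):

object PlacementSketch {
  final case class Controller(id: Int, host: String)

  def controllerFor(fqActionName: String, controllers: IndexedSeq[Controller]): Controller = {
    // Same action -> same controller -> that controller's ContainerPool
    // accumulates the warm containers for this action.
    val idx = math.abs(fqActionName.hashCode % controllers.size)
    controllers(idx)
  }

  val controllers = IndexedSeq(Controller(0, "controller0"), Controller(1, "controller1"))
  val target = controllerFor("whisk.system/utils/echo", controllers)
}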

In essence, this would then be a solution to the concerns Dominic mentioned about load imbalance
between controllers. This is exactly what we've been discussing earlier (where I proposed
pub/sub vs. Kafka), and I believe both solutions go in a similar direction.

Sorry if I have simplified this too much; let me know if I'm overlooking an aspect here.

Cheers,
Markus
 
   
 
Mit freundlichen Grüßen / Kind regards

Markus Thoemmes
Software Engineer - IBM Bluemix
Phone: +49-172-2684500
Email: markus.thoemmes@de.ibm.com

IBM Deutschland GmbH
Am Fichtenberg 1
71083 Herrenberg

Chairman of the Supervisory Board: Martin Jetter
Management: Martina Koederitz (Chair), Susanne Peter, Norbert Janzen, Dr. Christian Keller, Ivo Koerner, Markus Koerner
Registered office: Ehningen / Registry court: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940

-----Chetan Mehrotra <chetan.mehrotra@gmail.com> wrote: -----

>To: dev@openwhisk.apache.org
>From: Chetan Mehrotra <chetan.mehrotra@gmail.com>
>Date: 07/20/2018 02:13PM
>Subject: Re: Proposal on a future architecture of OpenWhisk
>
>Hi Markus,
>
>Some of the points below are more of a hunch and thinking out loud,
>and hence may not be fully objective or valid!
>
>> The buffer I described above is a per action "invoke me once
>resources are available" buffer
>
>> 1. Do we need to persist an in-memory queue that waits for
>resources to be created by the ContainerManager?
>> 2. Do we need a shared queue between the Controllers to enable
>work-stealing in cases where multiple Controllers wait for resources?
>
>As activations may be bulky (1 MB max), it may not be possible to keep
>them in memory, even at a small incoming rate, depending on how fast
>they get consumed. Currently the use of Kafka takes the pressure off
>the Controller and helps keep it stable. So I suspect we may need to
>make use of the buffer more often to keep pressure off the Controllers,
>especially for heterogeneous loads and when the system is making full
>use of cluster capacity.
>
>> That could potentially open up the possibility to use a technology
>more geared towards Pub/Sub, where subscribers per action are cheaper
>to implement than on Kafka?
>
>That would certainly be one option to consider. Systems like RabbitMQ
>can support lots of queues. They may, however, pose a problem of how to
>efficiently consume from all such queues.
>
>Another option I was thinking of is a design similar to the current
>one, involving controller and invoker, but with a single queue shared
>between controller and invoker, using the action as the partition key,
>and with multiple invokers forming a consumer group. This enables using
>Kafka's inbuilt support for horizontal scaling: add a new invoker and
>have Kafka assign partitions to it.
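As a rough illustration of that keying scheme with the plain kafka-clients API (the topic name "activations", the broker address and the serializer choices are placeholders):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ActivationProducerSketch {
  private val props = new Properties()
  props.put("bootstrap.servers", "kafka:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")

  private val producer = new KafkaProducer[String, Array[Byte]](props)

  def publishActivation(fqActionName: String, activation: Array[Byte]): Unit = {
    // Keyed by the fully qualified action name: Kafka's default partitioner
    // hashes the key, so all activations of one action land on the same
    // partition and are therefore consumed by the same invoker in the group.
    producer.send(new ProducerRecord[String, Array[Byte]]("activations", fqActionName, activation))
    ()
  }
}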
>
>Such a design would have the following key aspects:
>
>A - Supporting Queue Per Action (sort of!)
>--------------------------------------------
>One way to do that is to have a topic per action, which does not work
>well with Kafka. Instead, we can have each Invoker maintain a dedicated
>ContainerPool per action which independently reads messages from the
>partition hosting its action, filtering out activations for other
>actions that happen to land on the same partition. An Invoker would
>spin off this independent container pool after observing that the rate
>of activations for an action crosses a certain threshold. We can
>leverage this per-action pool for better autoscaling, where it requests
>more containers from the container orchestrator (k8s, mesos) based on
>the rate of activations and the lag. Here the system can learn from
>usage patterns how to allocate resources.
>
>This may lead to multiple streamings/replays of the same message from
>Kafka, but given the high rate at which messages can be fetched from a
>single partition, it may work out fine. This would also require some
>custom offset management, where the primary offset belongs to the
>consumer bound to the Invoker, but the offset per action is recorded as
>well.
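A very rough sketch of such a per-action read loop: the pool is manually assigned the partition hosting its action, filters out records keyed for other actions, and keeps its own per-action offset (here reduced to an in-memory variable). The action name, partition number and topic are illustrative only.

import java.time.Duration
import java.util.Properties
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition
import scala.jdk.CollectionConverters._

object PerActionPoolSketch {
  private val props = new Properties()
  props.put("bootstrap.servers", "kafka:9092")
  props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
  props.put("enable.auto.commit", "false") // offsets are tracked by the pool itself

  private val consumer = new KafkaConsumer[String, Array[Byte]](props)
  // Manually assigned to the partition that happens to host "myNamespace/myAction".
  private val partition = new TopicPartition("activations", 3)
  consumer.assign(java.util.Collections.singletonList(partition))

  private var perActionOffset = 0L // custom, per-action offset bookkeeping

  def runLoop(): Unit = {
    while (true) {
      val records = consumer.poll(Duration.ofMillis(100)).asScala
      // Skip activations of other actions that landed on the same partition.
      records.filter(_.key == "myNamespace/myAction").foreach { record =>
        // handOffToContainerPool(record.value())  // hypothetical hand-off
        perActionOffset = record.offset()
      }
    }
  }
}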
>
>B - Hot Partition Handling
>----------------------------------
>
>A known drawback of using non-random partitioning is hot partitions,
>whereby some Invokers may not get much work while others get lots of
>work. This would be managed by two things:
>
>1. High Container Density per Invoker - If we can reduce the Invoker
>state, it should be possible to have a single Invoker handle a lot more
>containers concurrently. So even a hot Invoker would still be able to
>make use of containers distributed across multiple hosts and thus make
>optimal use of cluster resources.
>
>2. Reassign Activations - If an action-specific container pool sees
>that it's not able to keep up with the rate of incoming activations for
>the action, it can pull them out and add them back to the same queue,
>but assigned directly to a less loaded invoker's partition.
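For point 2, re-publishing to an explicitly chosen partition could look roughly like this; pickLessLoadedPartition is a placeholder for whatever lag/load signal would actually drive the decision.

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ReassignSketch {
  // Placeholder: would consult lag/load metrics to pick a colder invoker's partition.
  def pickLessLoadedPartition(): Int = ???

  def reassign(producer: KafkaProducer[String, Array[Byte]],
               fqActionName: String,
               activation: Array[Byte]): Unit = {
    val target = pickLessLoadedPartition()
    // The four-argument ProducerRecord(topic, partition, key, value) overrides
    // the default key-based partitioner and targets the chosen partition directly.
    producer.send(new ProducerRecord[String, Array[Byte]]("activations", Int.box(target), fqActionName, activation))
    ()
  }
}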
>
>The thoughts above digress from the current proposal being discussed
>and may not be presented in a very coherent way. However, I wanted to
>write them down to see whether they make any sense. If there is some
>potential, then I can try to draft a more detailed proposal on the
>wiki. The key motive here is to get to a design where we have
>independent container pools pulling activations independently and
>autoscaling themselves based on allowed limits, getting us closer to a
>consumer-per-action-topic kind of design in a dynamic way!
>
>Chetan Mehrotra
>
>On Thu, Jul 19, 2018 at 6:06 PM Markus Thoemmes
><markus.thoemmes@de.ibm.com> wrote:
>>
>> Hi Chetan,
>>
>> >Currently one aspect which is not clear is whether the Controller
>> >has access to
>> >
>> >1. A pool of prewarm containers - containers of a base image where
>> >/init has not yet been done, so these containers can then be
>> >initialized within the Controller
>> >2. OR a pool of warm containers bound to a specific user+action.
>> >These containers would possibly have been initialized by the
>> >ContainerManager, which then allocates them to the Controller.
>>
>> The latter case is what I had in mind. The controller only knows
>containers that are already ready to call /run on.
>>
>> Pre-Warm containers are an implementation detail to the Controller.
>The ContainerManager can keep them around to be able to answer demand
>for specific resources more quickly, but the Controller doesn't care.
>It only knows warm containers.
>>
>> >Can you elaborate on this a bit more, i.e. how would the scale-up
>> >logic work and is it asynchronous?
>> >
>> >I think the above aspect (type of pool) would have a bearing on the
>> >scale-up logic. If an action was not in use so far, then when the
>> >first request comes (i.e. the 0-1 scale-up case), would the
>> >Controller ask the ContainerManager for a specific action container
>> >and then wait for its setup and then execute it? OR, if it has a
>> >generic pool, does it take one, initialize it and use it? And if this
>> >is not done synchronously, would such an action be put on the
>> >overflow queue?
>>
>> In this specific example, the Controller will request a container
>from the ContainerManager and buffer the request until it finally has
>capacity to execute it. All subsequent requests will be put on the
>same buffer and a Container will be requested for each of them.
>>
>> Whether we put this buffer in an overflow queue (aka persist it)
>remains to be decided. If we keep it in memory, we have roughly the
>same guarantees as today. As Rodric mentioned though, we can improve
>certain failure scenarios (like waiting for a container in this case)
>by making this buffer more persistent. I'm not mentioning Kafka here
>for a reason, because in this case any persistent buffer is just
>fine.
>>
>> Also note that this is not necessarily the case of the overflow
>queue. The overflow queue is used for arbitrary requests once the
>ContainerManager cannot create more resources and thus requests need
>to wait.
>>
>> The buffer I described above is a per-action "invoke me once
>resources are available" buffer, which could potentially be designed
>to be per Controller to avoid the challenge of scaling it out.
>That of course has its downsides in itself; for instance, a buffer
>that spans all controllers would enable work-stealing between
>controllers with missing capacity and could mitigate some of the
>load imbalances that Dominic mentioned. We are then entering the same
>area that his proposal enters: the need for a queue per action.
>>
>> Conclusion is, we have 2 perspectives to look at this:
>>
>> 1. Do we need to persist an in-memory queue that waits for
>resources to be created by the ContainerManager?
>> 2. Do we need a shared queue between the Controllers to enable
>work-stealing in cases where multiple Controllers wait for resources?
>>
>> An important thing to note here: Since all of this is no longer
>happening on the critical path (stuff gets put on the queue only if
>it needs to wait for resources anyway), we can afford a solution that
>isn't as performant as Kafka might be. That could potentially open up
>the possibility to use a technology more geared towards Pub/Sub,
>where subscribers per action are cheaper to implement than on Kafka?
>>
>> Does that make sense? Hope that helps :). Thanks for the questions!
>>
>> Cheers,
>> Markus
>>
>
>

