openwhisk-dev mailing list archives

From Tyson Norris <>
Subject Re: Invoker HA on Mesos
Date Fri, 30 Mar 2018 17:37:43 GMT
Hooking into pause/unpause/destroy of containers, instead of hooking into the Maps in
ContainerPool, seems plausible.

In the existing PR, the ContainerPool stores freePool and prewarmPool in an alternate Map
implementation, and that implementation initiates the attach to existing containers when
it becomes active.

The ContainerPool could instead delegate to the ContainerFactory, e.g.
ContainerFactory.reviveContainers(childFactory) => (freePool, prewarmPool). We will still
need a way to trigger this on demand (e.g. when the standby pool becomes active, in our
case), but I think that is a minor detail.
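
A rough sketch of that delegation, in Python for brevity rather than the Scala the invoker
is written in. Every name here (ContainerRef, revive_containers, on_become_active, the
dict-based pools) is hypothetical and only illustrates the shape of the idea; it is not the
code in the PR.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ContainerRef:
    """Hypothetical handle for a container that outlived its previous invoker."""
    container_id: str
    warm: bool  # True: already initialized with an action; False: only prewarmed


# (freePool, prewarmPool), each keyed by container id (an assumption of this sketch)
Pools = Tuple[Dict[str, Any], Dict[str, Any]]


class ContainerFactory:
    """Hypothetical factory that knows which containers still exist."""

    def __init__(self, existing: List[ContainerRef]):
        self._existing = list(existing)

    def revive_containers(self, child_factory: Callable[[ContainerRef], Any]) -> Pools:
        # Wrap each surviving container with whatever the pool uses to manage it,
        # then sort the results into the two pools.
        free_pool: Dict[str, Any] = {}
        prewarm_pool: Dict[str, Any] = {}
        for ref in self._existing:
            target = free_pool if ref.warm else prewarm_pool
            target[ref.container_id] = child_factory(ref)
        return free_pool, prewarm_pool


class ContainerPool:
    """Starts empty; only attaches to existing containers when told to."""

    def __init__(self, factory: ContainerFactory,
                 child_factory: Callable[[ContainerRef], Any]):
        self._factory = factory
        self._child_factory = child_factory
        self.free_pool: Dict[str, Any] = {}
        self.prewarm_pool: Dict[str, Any] = {}

    def on_become_active(self) -> None:
        # The on-demand trigger, e.g. when a standby pool is promoted to active.
        free, prewarm = self._factory.revive_containers(self._child_factory)
        self.free_pool.update(free)
        self.prewarm_pool.update(prewarm)
```

The point of the split is that the pool no longer needs a special Map implementation: it
keeps plain maps and asks the factory for their contents at the moment it becomes active.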

I can try it out; I will be out next week, but if you test any of this in the meantime, let
me know.


> On Mar 30, 2018, at 9:58 AM, David P Grove <> wrote:
> Tyson Norris <> wrote on 03/27/2018 06:25:59 PM:
>> Do you have an example of the labels working? I guess the labels are
>> changed over time through the lifecycle of the container?
> Apologies for brutally chopping the email chain; my mail client made a
> horrible hash of it.
> Right now, all we are doing with Kube labels is to label each action
> container with its owning invoker on startup.  This lets us delete orphaned
> containers if the invoker crashes and needs to be restarted.  The labeling
> happens at [1] and the removal of orphans using the labels at [2].
> I think the Kube-native version of part of what you are doing with the
> DistributedData for Mesos would be to add and remove additional labels to
> give us the option of attaching a new invoker instance to orphaned
> containers instead of just destroying them.   Interacting with the
> Kubernetes API server to do a labeling operation takes around 10ms, so we
> couldn't do this on a truly hot path.  But we could probably afford to
> update container labels in parallel with pause/unpause operations, which
> could enable re-attachment to any paused containers.
> --dave
> [1]
> [2]
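
To make the label idea in the quoted message concrete, here is a small in-memory sketch in
Python. It does not call the Kubernetes API; the label key "invoker", the ActionContainer
type, and the helper names are all assumptions made for illustration. It shows labeling on
startup, finding orphans by label, and the proposed extension of re-attaching paused
orphans to a new invoker instead of destroying them.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple

OWNER_LABEL = "invoker"  # hypothetical label key naming the owning invoker


@dataclass
class ActionContainer:
    """In-memory stand-in for an action container and its labels."""
    name: str
    labels: Dict[str, str] = field(default_factory=dict)
    paused: bool = False


def label_on_startup(container: ActionContainer, invoker_id: str) -> None:
    # Record the owning invoker once, when the container is created.
    container.labels[OWNER_LABEL] = invoker_id


def find_orphans(containers: Iterable[ActionContainer],
                 live_invokers: Set[str]) -> List[ActionContainer]:
    # A container is orphaned if its owner label names no live invoker.
    return [c for c in containers if c.labels.get(OWNER_LABEL) not in live_invokers]


def reattach_or_destroy(orphans: Iterable[ActionContainer],
                        new_invoker: str) -> Tuple[List[str], List[str]]:
    # Paused orphans are relabeled to the new invoker; the rest are marked for removal.
    adopted: List[str] = []
    destroyed: List[str] = []
    for c in orphans:
        if c.paused:
            c.labels[OWNER_LABEL] = new_invoker
            adopted.append(c.name)
        else:
            destroyed.append(c.name)
    return adopted, destroyed
```

Only paused containers are adopted here, on the assumption that label updates happen
alongside pause/unpause and so are only known to be current for paused containers.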
