openwhisk-dev mailing list archives

From "Markus Thoemmes" <markus.thoem...@de.ibm.com>
Subject Re: Proposal on a future architecture of OpenWhisk
Date Mon, 23 Jul 2018 11:21:06 GMT
Hi Dominic,

let's see if I can clarify the specific points one by one.

>1. Docker daemon performance issue.
>
>...
>
>That's the reason why I initially thought that a Warmed state would be
>kept for longer than today's behavior.
>Today, containers stay in the Warmed state for only 50ms, so PAUSE/RESUME
>is introduced whenever actions arrive at intervals longer than 50ms,
>such as 1 second.
>This will lead to more load on the Docker daemon.

You're right that the docker daemon's throughput is indeed an issue.

Please note that in a performance-tuned environment, PAUSE/RESUME are not executed
via the Docker daemon but via runc, which does not have such a throughput issue
because it's not a daemon at all. PAUSE/RESUME latencies are ~10ms per operation.
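To make that concrete: the point is that pause/resume can shell out to the runc binary directly, skipping the daemon's API entirely. A minimal illustrative sketch (Python here just for readability; the actual invoker is Scala, and the runc root path is an assumption that depends on the Docker setup):

```python
import subprocess
from typing import List

# ASSUMPTION: the runc state root varies by Docker installation.
RUNC_ROOT = "/run/docker/runtime-runc/moby"

def runc_cmd(op: str, container_id: str) -> List[str]:
    """Build a runc command line; only pause/resume are used on this path."""
    if op not in ("pause", "resume"):
        raise ValueError("only 'pause' and 'resume' are handled here")
    return ["runc", "--root", RUNC_ROOT, op, container_id]

def pause(container_id: str) -> None:
    # Goes straight to the runc binary -- no round-trip through dockerd.
    subprocess.run(runc_cmd("pause", container_id), check=True)

def resume(container_id: str) -> None:
    subprocess.run(runc_cmd("resume", container_id), check=True)
```

Since each call is an independent short-lived process, there is no single daemon to saturate.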

Further, the duration of the pauseGrace is not related to the overall architecture at
all. Rather, it's kept so narrow to safeguard against users stealing cycles from the
vendor's infrastructure. It's also a configurable value, so you can tweak it as you
want.

The proposed architecture itself has no impact on the pauseGrace.
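For clarity on what the pauseGrace mechanism does: after an activation finishes, a timer starts; if a new activation arrives before it fires, the container stays warm, otherwise it gets paused. A rough sketch under those assumptions (illustrative Python, not the real Scala implementation):

```python
import threading

class PauseGuard:
    """Pause an idle container only after `grace` seconds without activations."""

    def __init__(self, grace: float, pause_fn, resume_fn):
        self.grace = grace
        self.pause_fn = pause_fn
        self.resume_fn = resume_fn
        self.paused = False
        self._timer = None
        self._lock = threading.Lock()

    def activation_started(self):
        with self._lock:
            if self._timer:
                self._timer.cancel()  # a request arrived in time: stay warm
                self._timer = None
            if self.paused:
                self.resume_fn()
                self.paused = False

    def activation_finished(self):
        with self._lock:
            self._timer = threading.Timer(self.grace, self._pause)
            self._timer.start()

    def _pause(self):
        with self._lock:
            self.pause_fn()
            self.paused = True
```

A wider `grace` trades fewer pause/resume operations for more idle cycles the vendor gives away, which is exactly why it's an operator knob.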

>
>And if the state of containers changes like today, the state in the
>ContainerManager would be changing frequently as well.
>This may induce a synchronization issue among controllers and among
>ContainerManagers (in case there is more than one ContainerManager).

The ContainerManager will NOT be informed about pause/unpause state changes, and it
doesn't need to be. I agree that such behavior would generate serious load on the
ContainerManager, but I think it's unnecessary.

>2. Proxy case.
>
>...
>
>If it goes this way, the ContainerManager should know the status of all
>containers in all controllers to make the right decision, and it's not
>easy to synchronize all of that status across controllers.
>If it does not work like this, how can controller2 proxy requests to
>controller1 without any information about controller1's status?


The ContainerManager distributes a list of containers across all controllers.
If it does not have enough containers at hand to give one to each controller,
it instead tells controller2 to proxy to controller1, because the ContainerManager
knows at distribution time that controller1 has such a container.

No synchronization needed between controllers at all.

If controller1 gets more requests than the single container can handle, it will
request more containers, so eventually controller2 will get its own.

Please refer to https://lists.apache.org/thread.html/84a7b8171b90719c2f7aab86bea48a7e7865874c4e54f082b0861380@%3Cdev.openwhisk.apache.org%3E
for more information on that protocol.
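The distribution step above can be sketched roughly like this (a simplified illustration of the idea, not the actual protocol implementation; names like "use"/"proxy"/"wait" are invented for this sketch):

```python
def distribute(containers, controllers):
    """Assign the warm containers of one action across controllers.

    A controller gets ("use", [containers]) if it holds at least one,
    ("proxy", holder) when there are fewer containers than controllers,
    and ("wait", None) while no container exists yet.
    """
    if not containers:
        # Nothing to hand out yet; controllers must first request capacity.
        return {c: ("wait", None) for c in controllers}

    # Hand containers out round-robin.
    holders = {}
    for i, cont in enumerate(containers):
        ctrl = controllers[i % len(controllers)]
        holders.setdefault(ctrl, []).append(cont)

    assignments = {}
    holder_list = list(holders)
    j = 0
    for ctrl in controllers:
        if ctrl in holders:
            assignments[ctrl] = ("use", holders[ctrl])
        else:
            # The ContainerManager knows at distribution time who holds a
            # container, so it can point this controller at a holder.
            assignments[ctrl] = ("proxy", holder_list[j % len(holder_list)])
            j += 1
    return assignments
```

Note that all the knowledge needed for the proxy decision exists at distribution time inside the ContainerManager, which is why no controller-to-controller synchronization is needed.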


>3. Intervention among multiple actions
>
>If the concurrency limit is 1 and the container lifecycle is managed
>like today, intervention among multiple actions can happen again.
>For example, if the maximum number of containers which can be created
>by a user is 2, and ActionA and ActionB invocation requests arrive
>alternately, controllers will try to remove and recreate containers
>again and again.
>I used an example with a small max-container limit for simplicity, but
>it can happen with a higher limit as well.
>
>And even if the concurrency limit is more than 1, such as 3, it can
>still happen if actions arrive more quickly than they execute.

The controller will never try to delete a container at all, nor does its pool of
managed containers have a limit.
If it doesn't have a container for ActionA it will request one from the ContainerManager.
If it doesn't have one for ActionB it will request one from the ContainerManager.

There will be 2 containers in the system and assuming that the ContainerManager has enough
resources to keep those 2 containers alive, it will not delete them.

The controllers by design cannot cause the behavior you're describing. The
architecture is actually built around fixing this exact issue (eviction due to
multiple heavy users in the system).
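In other words, a cache miss for an action is answered by requesting capacity, never by evicting another action's container. A toy sketch of that controller-side behavior (illustrative only; class and method names are invented for this example):

```python
class Controller:
    """Keeps a per-action pool; a miss triggers a request to the
    ContainerManager, but the controller never deletes containers itself."""

    def __init__(self, container_manager):
        self.cm = container_manager
        self.pool = {}  # action name -> containers owned by this controller

    def handle(self, action):
        if not self.pool.get(action):
            # Miss: request capacity instead of evicting another action's
            # container -- this is what avoids the remove/recreate churn.
            self.pool.setdefault(action, []).extend(
                self.cm.request(action, count=1))
        return self.pool[action][0]


class FakeContainerManager:
    """Stand-in for the real ContainerManager, just for this sketch."""

    def __init__(self):
        self.requests = []

    def request(self, action, count):
        self.requests.append((action, count))
        return [f"{action}-container-{len(self.requests)}"]
```

Alternating ActionA/ActionB traffic therefore results in exactly two container requests, not endless deletion and recreation.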

>4. Is concurrency per container controlled by users in a per-action way?
>Let me clarify my question about the concurrency limit.
>
>If the concurrency-per-container limit is more than 1, there could be
>multiple actions being invoked at some point.
>If an action requires a high memory footprint, such as 200MB or 150MB,
>it can crash if the sum of the memory usage of concurrent invocations
>exceeds the container memory.
>(In our case, some users are executing headless-chrome and puppeteer
>within actions, so it could happen in a similar situation.)
>
>So I initially thought concurrency per container is controlled by users
>in a per-action way.
>If concurrency per container is only configured statically by OW
>operators, some users may not be able to invoke their actions correctly
>in the worst case, even though operators increased the memory of the
>biggest container type.
>
>And not only for this case: there could be more reasons why some users
>just want to invoke their actions without per-container concurrency,
>while others want it for better throughput.
>
>So we may need some logic for users to control per-container
>concurrency for each action.

Yes, the intention is to provide exactly what you're describing, maybe I worded it weirdly
in my last response.

This is not relevant for the architecture though.


>5. Better to wait for completion rather than creating a new container.
>Depending on the workload, it can be better to wait for the previous
>execution rather than create a new container, because container
>creation takes up to 500ms ~ 1s.
>Even if the concurrency limit is more than 1, this can still happen if
>there is no logic to accumulate invocations and decide whether to
>create a new container or wait for an existing one.

The proposed asynchronous protocol between controller and ContainerManager
accomplishes this by design:

If a controller does not have the resources to execute the current request, it
requests those resources.
The ContainerManager updates resources asynchronously.
The controller will schedule the outstanding request as soon as it gets resources
for it. It does not care whether those resources become free because another
request finished or because a fresh container arrived from the ContainerManager.
Requests will always be dispatched as soon as resources are free.
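The dispatch logic above can be sketched in a few lines (an illustrative model, not the real implementation): outstanding requests sit in a queue, and a single "resource freed" path serves both completion events and fresh-container deliveries.

```python
from collections import deque

class Scheduler:
    """Outstanding requests are queued; any freed resource dispatches the
    next one, whether it comes from a finished execution or a fresh
    container delivered by the ContainerManager."""

    def __init__(self):
        self.free = deque()      # idle containers
        self.queue = deque()     # outstanding requests
        self.dispatched = []     # (request, container) pairs, for illustration

    def submit(self, request):
        self.queue.append(request)
        self._drain()

    def resource_freed(self, container):
        # Called both when an execution finishes and when a new container
        # arrives -- the scheduler deliberately does not distinguish.
        self.free.append(container)
        self._drain()

    def _drain(self):
        while self.queue and self.free:
            self.dispatched.append((self.queue.popleft(), self.free.popleft()))
```

Because both event sources feed the same queue, "wait for the existing container vs. create a new one" is resolved automatically by whichever resource shows up first.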

>6. HA of ContainerManager.
>Since it is mandatory to deploy the system without any downtime to use
>it in production, we need to support HA of the ContainerManager.
>That means the state of the ContainerManager should be replicated among
>replicas (no matter whether we use master/slave or clustering).
>
>If the ContainerManager knows the status of each container, it would
>not be easy to support HA given its eventually consistent nature.
>If it only knows which containers are assigned to which controllers,
>it cannot handle the edge case I mentioned above.

I agree, HA is mandatory. Since the ContainerManager operates only on the
container creation/deletion path, we can probably afford to persist its state
into something like Redis. If it crashes, the slave instance can take over
immediately without any eventual-consistency concerns or downtime.

Also note that downtime in the ContainerManager will ONLY impact the ability to
create containers.
Workloads that already have containers will continue to work just fine.
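A minimal sketch of that persistence idea (assuming a dict-like key/value store; a real deployment might plug in a Redis client instead, and all names here are invented for the example):

```python
import json

class ContainerManagerState:
    """Write-through persistence of the controller -> containers map.

    `store` is any dict-like key/value store.  A standby instance recovers
    simply by reading the same store on startup, so there is no eventual
    consistency to reconcile.
    """

    KEY = "cm:assignments"

    def __init__(self, store):
        self.store = store
        raw = store.get(self.KEY)
        self.assignments = json.loads(raw) if raw else {}

    def assign(self, controller, container):
        self.assignments.setdefault(controller, []).append(container)
        # Persist on the (slow) container-creation path only, so nothing
        # is added to the hot invocation path.
        self.store[self.KEY] = json.dumps(self.assignments)
```

Since writes happen only on container creation/deletion, the extra round-trip to the store is amortized against a 100ms+ container operation rather than a sub-10ms invocation.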


Does that answer/mitigate your concerns?

Cheers,
Markus

>To: dev@openwhisk.apache.org
>From: Dominic Kim <style9595@gmail.com>
>Date: 07/23/2018 12:48PM
>Subject: Re: Proposal on a future architecture of OpenWhisk
>
>Dear Markus.
>
>I may not have correctly understood the direction of the new architecture,
>so let me describe my concerns in more detail.
>
>Since this is a future architecture of OpenWhisk and requires many breaking
>changes, I think it should at least address all known issues.
>So I focused on figuring out whether it handles all of the issues reported
>in my proposal.
>(
>INVALID URI REMOVED
>_confluence_display_OPENWHISK_Autonomous-2BContainer-2BScheduling&d=D
>wIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=hrbwAtsFbpjFv44gWxWuA_MH56HIaR3jKAHn
>WL2Si9M&m=yYWiw1fuZyVCjEpmC49VlFoo29lr1Cq39Bcayz65phg&s=818SNwNuYXfpL
>llgKfuK2DGMVrXBXKfE9Vbmf35IYI8&e=
>)
>
>1. Docker daemon performance issue.
>
>The most critical issue is the poor performance of the Docker daemon.
>Since it is not inherently designed for high throughput or concurrent
>processing, the Docker daemon shows poor performance in comparison
>with OW.
>In the OW (serverless) world, action execution can finish within
>5ms ~ 10ms, but the Docker daemon shows 100 ~ 500ms latency.
>Still, we can take advantage of Prewarm and Warmed containers, but when
>container creation/deletion/pausing/resuming happens frequently and the
>situation lasts long-term, requests are delayed and the Docker daemon
>has even crashed.
>So I think it is important to reduce the load (requests) on the Docker
>daemon.
>
>That's the reason why I initially thought that a Warmed state would be
>kept for longer than today's behavior.
>Today, containers stay in the Warmed state for only 50ms, so PAUSE/RESUME
>is introduced whenever actions arrive at intervals longer than 50ms,
>such as 1 second.
>This will lead to more load on the Docker daemon.
>
>And if the state of containers changes like today, the state in the
>ContainerManager would be changing frequently as well.
>This may induce a synchronization issue among controllers and among
>ContainerManagers (in case there is more than one ContainerManager).
>
>So I think containers should keep running for longer than today's
>pauseGrace time.
>With a concurrency limit per container greater than 1, it would also be
>better to keep containers running (not paused) for more than 50ms.
>
>2. Proxy case.
>
>In the edge case where a container only exists in controller1, how can
>controller2 decide to proxy the request to controller1 rather than just
>creating its own container?
>If it asks the ContainerManager, the ContainerManager should know the
>state of the container in controller1.
>If the container in controller1 is already busy, it would be better to
>create a new container in controller2 rather than proxy requests to
>controller1.
>
>If it goes this way, the ContainerManager should know the status of all
>containers in all controllers to make the right decision, and it's not
>easy to synchronize all of that status across controllers.
>If it does not work like this, how can controller2 proxy requests to
>controller1 without any information about controller1's status?
>
>3. Intervention among multiple actions
>
>If the concurrency limit is 1 and the container lifecycle is managed
>like today, intervention among multiple actions can happen again.
>For example, if the maximum number of containers which can be created
>by a user is 2, and ActionA and ActionB invocation requests arrive
>alternately, controllers will try to remove and recreate containers
>again and again.
>I used an example with a small max-container limit for simplicity, but
>it can happen with a higher limit as well.
>
>And even if the concurrency limit is more than 1, such as 3, it can
>still happen if actions arrive more quickly than they execute.
>
>4. Is concurrency per container controlled by users in a per-action way?
>Let me clarify my question about the concurrency limit.
>
>If the concurrency-per-container limit is more than 1, there could be
>multiple actions being invoked at some point.
>If an action requires a high memory footprint, such as 200MB or 150MB,
>it can crash if the sum of the memory usage of concurrent invocations
>exceeds the container memory.
>(In our case, some users are executing headless-chrome and puppeteer
>within actions, so it could happen in a similar situation.)
>
>So I initially thought concurrency per container is controlled by users
>in a per-action way.
>If concurrency per container is only configured statically by OW
>operators, some users may not be able to invoke their actions correctly
>in the worst case, even though operators increased the memory of the
>biggest container type.
>
>And not only for this case: there could be more reasons why some users
>just want to invoke their actions without per-container concurrency,
>while others want it for better throughput.
>
>So we may need some logic for users to control per-container
>concurrency for each action.
>
>5. Better to wait for completion rather than creating a new container.
>Depending on the workload, it can be better to wait for the previous
>execution rather than create a new container, because container
>creation takes up to 500ms ~ 1s.
>Even if the concurrency limit is more than 1, this can still happen if
>there is no logic to accumulate invocations and decide whether to
>create a new container or wait for an existing one.
>
>
>6. HA of ContainerManager.
>Since it is mandatory to deploy the system without any downtime to use
>it in production, we need to support HA of the ContainerManager.
>That means the state of the ContainerManager should be replicated among
>replicas (no matter whether we use master/slave or clustering).
>
>If the ContainerManager knows the status of each container, it would
>not be easy to support HA given its eventually consistent nature.
>If it only knows which containers are assigned to which controllers,
>it cannot handle the edge case I mentioned above.
>
>
>
>Since many parts of the architecture are not addressed yet, I think it
>would be better to separate each part and discuss them in depth.
>But for the big picture, I think we first need to figure out whether it
>can handle, or at least alleviate, all known issues.
>
>
>Best regards,
>Dominic
>
>
>2018-07-21 1:36 GMT+09:00 David P Grove <groved@us.ibm.com>:
>
>>
>>
>> Tyson Norris <tnorris@adobe.com.INVALID> wrote on 07/20/2018 12:24:07 PM:
>> >
>> > On Logging, I think if you are considering enabling concurrent
>> > activation processing, you will encounter that the only approach to
>> > parsing logs to be associated with a specific activationId is to
>> > force the log output to be structured, and always include the
>> > activationId with every log message. This requires a change at the
>> > action container layer, but the simpler thing to do is to encourage
>> > action containers to provide a structured logging context that
>> > action developers can (and must) use to generate logs.
>>
>> Good point.  I agree that if there is concurrent activation
>> processing in the container, structured logging is the only sensible
>> thing to do.
>>
>>
>> --dave
>>
>

