openwhisk-dev mailing list archives

From Markus Thömmes <markusthoem...@me.com>
Subject Re: concurrent requests on actions
Date Mon, 01 May 2017 21:22:14 GMT
Hi Tyson,

Sounds like you did a lot of investigation here, thanks a lot for that :)

Looking at the numbers, 4 RPS in the "off" case seems very odd. The Travis build, which runs the
current system as is, also reaches 40+ RPS, so we'd need to look into that mismatch.

Other than that, I'd indeed expect a great improvement in throughput from your work!

Implementation-wise I don't have a strong opinion, but it might be worth discussing the details
first and landing your implementation once all my staging is done (the open PRs). That would ease
the git operations. If you want to discuss your implementation now, I suggest you send a PR to my
new-containerpool branch and share the diff here for discussion.

Cheers,
Markus

Sent from my iPhone

> On 01.05.2017 at 23:16, Tyson Norris <tnorris@adobe.com> wrote:
> 
> Hi Michael -
> Concurrent requests would only reuse a running/warm container for same-action requests.
So if the action has bad/rogue behavior, it will limit its own throughput only, not the throughput
of other actions.
> 
> This is ignoring the current implementation of the activation feed, which I guess is
susceptible to a flood of slow running activations. If those activations are for the same
action, running concurrently should be enough to not starve the system for other activations
(with faster actions) to be processed. In case they are all different actions, OR not allowed
to execute concurrently, then in the name of quality-of-service, it may also be desirable
to reserve some resources (i.e. separate activation feeds) for known-to-be-faster actions,
so that fast-running actions are not penalized for existing alongside the slow-running actions.
This would require a more complicated throughput test to demonstrate.
> 
> Thanks
> Tyson
> 
> On May 1, 2017, at 1:13 PM, Michael Marth <mmarth@adobe.com> wrote:
> 
> Hi Tyson,
> 
> 10x more throughput, i.e. being able to run OW at 1/10 of the cost - definitely worth
looking into :)
> 
> Like Rodric mentioned before, I figure some features might become more complex to implement,
like billing, log collection, etc. But given such a huge advancement in throughput, that would
be worth it IMHO.
> One thing I wonder about, though, is resilience against rogue actions. If an action is
blocking (in the Node-sense, not the OW-sense), would that not block Node’s event loop and
thus block other actions in that container? One could argue, though, that this rogue action
would only block other executions of itself, not affect other actions or customers. WDYT?
> 
> Michael
> 
> On 01/05/17 17:54, "Tyson Norris" <tnorris@adobe.com> wrote:
> 
> Hi All -
> I created this issue some time ago to discuss concurrent requests on actions: [1]. Some
people suggested discussing it on the mailing list, so I wanted to start that discussion.
> 
> I’ve been doing some testing against this branch with Markus’s work on the new container
pool: [2]
> I believe there are a few open PRs in upstream related to this work, but this seemed
like a reasonable place to test against a variety of the reactive invoker and pool changes
- I’d be interested to hear if anyone disagrees.
> 
> Recently I ran some tests
> - with “throughput.sh” in [3] using concurrency of 10 (it will also be interesting
to test with the --rps option in loadtest...)
> - using a change that checks actions for an annotation “max-concurrent” (in case
there is some reason actions want to enforce current behavior of strict serial invocation
per container?)
> - when scheduling an action against the pool, if there is a currently “busy” container
with this action, AND the annotation is present for this action, AND concurrent requests <
max-concurrent, then this container is used to invoke the action
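
The scheduling rule in the last bullet can be sketched roughly as follows. This is only an illustration in Python, not the actual change: the real invoker and container pool are written in Scala, and the names `Action`, `Container`, `active_requests` and `pick_busy_container` are hypothetical. Only the “max-concurrent” annotation name comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Action:
    name: str
    # Hypothetical representation of action annotations as a plain dict.
    annotations: Dict[str, int] = field(default_factory=dict)


@dataclass
class Container:
    action_name: str
    # Hypothetical counter: 0 means warm and idle, > 0 means "busy".
    active_requests: int = 0


def pick_busy_container(action: Action, pool: List[Container]) -> Optional[Container]:
    """Return a busy container that may take one more request, or None."""
    limit = action.annotations.get("max-concurrent")
    if limit is None:
        # No annotation: keep strict serial invocation per container.
        return None
    for container in pool:
        same_action = container.action_name == action.name
        busy = container.active_requests > 0
        if same_action and busy and container.active_requests < limit:
            return container
    return None
```

When the function returns None, the scheduler would fall back to whatever it does today (use an idle warm container or start a new one); that part is left out of the sketch.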
> 
> Below is a summary (approx 10x throughput with concurrent requests) and I would like
to get some feedback on:
> - what are the cases for having actions that require container isolation per request?
node is a good example that should NOT need this, but maybe there are cases where it is more
important, e.g. if there are cases where stateful actions are used?
> - log collection approach: I have not attempted to resolve log collection issues; I would
expect that revising the log sentinel marker to include the activation ID would help, and
logs stored with the activation would include interleaved activations in some cases (which
should be expected with concurrent request processing?), and require some different logic
to process logs after an activation completes (e.g. logs emitted at the start of an activation
may have already been collected as part of another activation’s log collection, etc).
> - advice on creating a PR to discuss this in more detail - should I wait for more of
the container pooling changes to get to master? Or submit a PR to Markus’s new-containerpool
branch?
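
To make the log collection question more concrete, here is a minimal sketch of separating interleaved logs when every line, including the sentinel, carries the activation ID. The line format ("<activation id> <message>"), the `SENTINEL` value and the function name `split_logs` are assumptions made up for this example, not the existing OpenWhisk log format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical end-of-activation marker, emitted once per activation ID.
SENTINEL = "ACTIVATION_END"


def split_logs(lines: Iterable[str]) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Group interleaved log lines by activation ID.

    Returns (completed, pending): logs of activations whose sentinel was
    seen, and logs of activations that are still running.
    """
    pending: Dict[str, List[str]] = defaultdict(list)
    completed: Dict[str, List[str]] = {}
    for line in lines:
        # Assumed format: the activation ID is the first space-separated token.
        activation_id, _, message = line.partition(" ")
        if message == SENTINEL:
            completed[activation_id] = pending.pop(activation_id, [])
        else:
            pending[activation_id].append(message)
    return completed, dict(pending)
```

With this shape, lines emitted by a still-running activation stay in `pending` rather than being attached to whichever activation happened to finish first.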
> 
> Thanks
> Tyson
> 
> Summary of loadtest report with max-concurrent ENABLED (I used 10000, but this limit
wasn’t reached):
> [Sat Apr 29 2017 16:32:37 GMT+0000 (UTC)] INFO Target URL:          https://192.168.99.100/api/v1/namespaces/_/actions/noopThroughputConcurrent?blocking=true
> [Sat Apr 29 2017 16:32:37 GMT+0000 (UTC)] INFO Max requests:        10000
> [Sat Apr 29 2017 16:32:37 GMT+0000 (UTC)] INFO Concurrency level:   10
> [Sat Apr 29 2017 16:32:37 GMT+0000 (UTC)] INFO Agent:               keepalive
> [Sat Apr 29 2017 16:32:37 GMT+0000 (UTC)] INFO
> [Sat Apr 29 2017 16:32:37 GMT+0000 (UTC)] INFO Completed requests:  10000
> [Sat Apr 29 2017 16:32:37 GMT+0000 (UTC)] INFO Total errors:        0
> [Sat Apr 29 2017 16:32:37 GMT+0000 (UTC)] INFO Total time:          241.900480915 s
> [Sat Apr 29 2017 16:32:37 GMT+0000 (UTC)] INFO Requests per second: 41
> [Sat Apr 29 2017 16:32:37 GMT+0000 (UTC)] INFO Mean latency:        241.7 ms
> 
> Summary of loadtest report with max-concurrent DISABLED:
> [Sat Apr 29 2017 19:21:51 GMT+0000 (UTC)] INFO Target URL:          https://192.168.99.100/api/v1/namespaces/_/actions/noopThroughput?blocking=true
> [Sat Apr 29 2017 19:21:51 GMT+0000 (UTC)] INFO Max requests:        10000
> [Sat Apr 29 2017 19:21:51 GMT+0000 (UTC)] INFO Concurrency level:   10
> [Sat Apr 29 2017 19:21:51 GMT+0000 (UTC)] INFO Agent:               keepalive
> [Sat Apr 29 2017 19:21:51 GMT+0000 (UTC)] INFO
> [Sat Apr 29 2017 19:21:51 GMT+0000 (UTC)] INFO Completed requests:  10000
> [Sat Apr 29 2017 19:21:51 GMT+0000 (UTC)] INFO Total errors:        19
> [Sat Apr 29 2017 19:21:51 GMT+0000 (UTC)] INFO Total time:          2770.658048791 s
> [Sat Apr 29 2017 19:21:51 GMT+0000 (UTC)] INFO Requests per second: 4
> [Sat Apr 29 2017 19:21:51 GMT+0000 (UTC)] INFO Mean latency:        2767.3 ms
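
As a sanity check on the two reports, the throughput figures can be recomputed from the totals. The sketch below uses only numbers from the reports above; the helper names are made up. The second helper applies Little's law (throughput is roughly concurrency divided by mean latency), which is an added cross-check and not something loadtest prints. The unrounded results are about 41.3 and 3.6 requests per second, a ratio of roughly 11x, so the reported "4" is a rounded 3.6.

```python
def throughput(completed_requests: int, total_seconds: float) -> float:
    """Requests per second over the whole run."""
    return completed_requests / total_seconds


def littles_law_rps(concurrency: int, mean_latency_ms: float) -> float:
    """Expected requests per second for a closed loop of `concurrency` clients."""
    return concurrency / (mean_latency_ms / 1000.0)


# Figures copied from the two loadtest summaries.
enabled = throughput(10000, 241.900480915)
disabled = throughput(10000, 2770.658048791)
speedup = enabled / disabled

print(f"enabled:  {enabled:.2f} RPS (Little's law: {littles_law_rps(10, 241.7):.2f})")
print(f"disabled: {disabled:.2f} RPS (Little's law: {littles_law_rps(10, 2767.3):.2f})")
print(f"speedup:  {speedup:.2f}x")
```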
> 
> [1] https://github.com/openwhisk/openwhisk/issues/2026
> [2] https://github.com/markusthoemmes/openwhisk/tree/new-containerpool
> [3] https://github.com/markusthoemmes/openwhisk-performance
> 
