openwhisk-dev mailing list archives

From Tyson Norris <tnor...@adobe.com.INVALID>
Subject Re: Improving support for UI driven use cases
Date Thu, 06 Jul 2017 05:44:43 GMT
I meant to add: I will work out a proposed time with Dragos ASAP and get back to the group, so that we can negotiate a meeting time that works for everyone who wants to attend in real time.

Thanks
Tyson

On Jul 5, 2017, at 10:42 PM, Tyson Norris <tnorris@adobe.com> wrote:

Thanks everyone for the feedback.

I’d be happy to join a call -


A couple of details on the proposal that may or may not be clear:
- no changes to existing behavior without explicit adoption by the action developer or function
client (e.g. the developer would have to explicitly “allow” the function to receive concurrent
activations)
- integrate this support at the load balancer level: instead of publishing to a Kafka topic
for an invoker, publish directly to a container that was launched by an invoker. There is also
no reason that multiple load balancers cannot be active, which supports the “no changes to
existing behavior” goal; see the sketch below.
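
To make these two points concrete, here is a minimal sketch of the routing decision. The names
(Activation, Router, allowConcurrent, postToContainer) are hypothetical, not actual OpenWhisk
APIs – just an illustration of the idea under the assumptions above:

// Hypothetical sketch of the load-balancer change described above.
// All names here are illustrative, not actual OpenWhisk APIs.
case class Activation(actionName: String, payload: String)

trait Router {
  def publishToInvokerTopic(a: Activation): Unit            // existing Kafka path
  def postToContainer(endpoint: String, a: Activation): Unit // proposed direct path
}

class ConcurrencyAwareBalancer(
    warmPool: Map[String, String], // action name -> endpoint of a warm container
    allowConcurrent: Set[String],  // actions whose developers opted in
    router: Router) {

  def route(a: Activation): Unit =
    warmPool.get(a.actionName) match {
      // Opted-in action with a warm container: send the activation straight
      // to the container an invoker already launched.
      case Some(endpoint) if allowConcurrent(a.actionName) =>
        router.postToContainer(endpoint, a)
      // Everything else keeps today's behavior: publish to the invoker's
      // Kafka topic, one activation per container at a time.
      case _ =>
        router.publishToInvokerTopic(a)
    }
}

An opted-in action with a warm container gets its activation posted directly to that container;
everything else follows the existing Kafka path, so current behavior is untouched.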




On Jul 4, 2017, at 6:55 AM, Michael Marth <mmarth@adobe.com.INVALID> wrote:

Hi Jeremias, all,

Tyson and Dragos are travelling this week, so I don’t know when they will get to respond.
I have worked with them on this topic, so let me jump in and comment until they are able to
reply.

From my POV, having a call like you suggest is a really good idea. Let’s wait for Tyson &
Dragos to chime in to find a date.

As you mention, the discussion so far has been jumping across different topics, especially the
use case, the problem to be solved, and the proposed solution. In preparation for the call, I
think we can clarify the use case and the problem on the list. Here’s my view:

Use Case

For us the use case can be summarised as “dynamic, high-performance websites/mobile apps”.
This implies:
1. High concurrency, i.e. many requests coming in at the same time.
2. The code to be executed is the same across these different requests (as opposed to a
long-tail distribution of many different actions being executed concurrently). In our case
“many” means “hundreds” or a few thousand.
3. Latency (time to start execution) matters, because human users are waiting for the response.
Ideally, at these orders of magnitude of concurrent requests, the latency should not change
much.

All 3 requirements need to be satisfied for this use case.
In the discussion so far it was mentioned that there are other use cases which might have
similar requirements. That’s great, and obviously I do not want to rule them out. The above
is just to make clear where we are coming from.

At this point I would like to mention that it is my understanding that this use case is within
OpenWhisk’s strike zone, i.e. something that we all think is reasonable to support. Please
speak up if you disagree.

The Problem

One can look at the problem in two ways:
Either you keep the resources of the OW system constant (i.e. no scaling). In that case latency
increases very quickly, as demonstrated by Tyson’s tests.
Or you increase the system’s capacity. In that case the number of machines needed to satisfy
this use case quickly becomes prohibitively expensive for the OW operator to run – where
“expensive” is defined as “compared to traditional web servers” (in our case a standard
Node.js server). Meaning, you need 100-1000 concurrent action containers to serve what can be
served by 1 or 2 Node.js containers. A back-of-the-envelope illustration follows below.
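
To illustrate the arithmetic, here is a small sketch; the figures (1000 concurrent requests,
intra-container concurrency of 500) are assumed examples, not measurements:

// Back-of-the-envelope illustration of the capacity argument above.
// The figures are assumed examples, not measurements.
object CapacityEstimate extends App {
  val concurrentRequests = 1000   // peak concurrent requests to one action
  val perContainerToday = 1       // current model: one activation per container
  val perContainerProposed = 500  // assumed intra-container concurrency

  def containersNeeded(perContainer: Int): Int =
    math.ceil(concurrentRequests.toDouble / perContainer).toInt

  println(s"today:    ${containersNeeded(perContainerToday)} containers")    // 1000
  println(s"proposed: ${containersNeeded(perContainerProposed)} containers") // 2
}

With per-container concurrency in the hundreds, the same load drops from ~1000 containers to
a handful, which is the ~2 orders of magnitude mentioned below.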

Of course, the proposed solution is not a fundamental “fix” for the above. It would only
move the needle by roughly 2 orders of magnitude – but that would mean the current problem
would no longer be a problem in practice (and would simply remain a theoretical one). For me
that would be good enough.

The solution approach

I would not like to comment on the proposed solution’s details (and will leave that to Dragos
and Tyson). However, it was mentioned that the approach would change the programming model for
users:
Our mindset and approach was that we explicitly do not want to change how OpenWhisk exposes
itself to users. Meaning, users should still be able to use NPM modules, etc. – i.e. this
would be an internal implementation detail that is not visible to users. (We could make things
more explicit to users, e.g. have them request a special concurrent runtime if we wish to do
so – but so far we have tried to make it transparent to users.) A sketch of why this can stay
invisible follows below.
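
As an illustration of how this could remain an internal detail, here is a minimal sketch of a
container-side proxy; ActionProxy and its dispatch are hypothetical names, not actual OpenWhisk
code:

// Sketch of why the change can stay invisible to action developers: the
// container proxy stops serializing /run requests and dispatches overlapping
// activations to the same user process. ActionProxy is an illustrative name.
import scala.concurrent.{ExecutionContext, Future}

class ActionProxy(userFunction: Map[String, String] => Map[String, String])
                 (implicit ec: ExecutionContext) {

  // The user function's signature, its NPM dependencies, etc. are untouched;
  // only the dispatch changes. The one caveat: mutable global state inside
  // the action is now shared across in-flight activations.
  def run(params: Map[String, String]): Future[Map[String, String]] =
    Future(userFunction(params))
}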

Many thanks
Michael



On 03/07/17 14:48, "Jeremias Werner" <jeremias.werner@gmail.com> wrote:

Hi

Thanks for the write-up and the proposal. I think this is a nice idea and sounds like a nice
way of increasing throughput. Reading through the thread, it feels like different
topics/problems are mixed up and the discussion is already becoming very complex.

Therefore I would like to suggest that we streamline the discussion a bit, maybe in a zoom.us
session where we first give Tyson and Dragos the chance to walk through the proposal and
clarify questions from the audience. Once we are all on the same page, we could discuss the
benefits (improved throughput, latency) vs. the challenges (resource sharing, crash model,
container lifetime, programming model) at the core of the proposal: running multiple
activations in a single user container. Once we have a common understanding of that part, we
could step up in the architecture and discuss what's needed in higher-level components like
the invoker/load balancer to get this integrated.

(I said zoom.us session since I liked the one we had a few weeks ago. It was efficient and
interactive. If you like, I could volunteer to set up the session and/or write the
script/summary.)

What do you think?

Many thanks in advance!

Jeremias


On Sun, Jul 2, 2017 at 5:43 PM, Rodric Rabbah <rodric@gmail.com> wrote:

With "event driven" you're discounting all the use cases that are still latency sensitive
because they complete a response via callback or actuation at completion. IoT, chatbots, and
notifications are all examples, in addition to UI, that are latency sensitive, and having
uniform expectations on queuing time is of value.

-r

