Subject: Re: Proposal on a future architecture of OpenWhisk
From: Martin Gencur <mgencur@redhat.com>
To: dev@openwhisk.apache.org, Markus Thoemmes
Date: Wed, 18 Jul 2018 15:45:57 +0200

On 18.7.2018 14:41, Markus Thoemmes wrote:
> Hi Martin,
>
> thanks for the great questions :)
>
>> thinking about scalability and the edge case. When there are not
>> enough containers and new controllers are being created, and all of
>> them redirect traffic to the controllers with containers, doesn't
>> it mean overloading the available containers a lot? I'm curious how
>> we throttle the traffic in this case.
>
> True, the first few requests will overload the controller that owns
> the very first container. That one will request new containers
> immediately, which will then be distributed to all existing
> Controllers by the ContainerManager. An interesting wrinkle here is
> that you'd want the overloading requests to be completed by the
> Controllers that sent them to the "single-owning-Controller".

Ah, got it. So it is a pretty common scenario: scaling out controllers
and containers. I thought this was a case where we reach the limit of
created containers and no more containers can be created.

> What we could do here is:
>
> Controller0 owns ContainerA1.
> Controller1 relays requests for A to Controller0.
> Controller0 has more requests than it can handle, so it requests
> additional containers. All requests coming from Controller1 will be
> completed with a predefined message (for example "HTTP 503
> overloaded" with a specific header, say "X-Return-To-Sender-By:
> Controller0").
> Controller1 recognizes this as "okay, I'll wait for containers to
> appear", which will eventually happen (because Controller0 has
> already requested them), so it can route and complete those requests
> on its own.
> Controller1 will now no longer relay requests to Controller0 but
> will request containers itself (acknowledging that Controller0 is
> already overloaded).

Yeah, I think it makes sense.
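To make the relay / "return to sender" flow above concrete, here is a
rough, self-contained Scala sketch of how a relaying controller might
react to the bounced 503. The names (RelaySketch, Controller, route,
onRelayedResponse, and so on) are invented for illustration and do not
exist in the OpenWhisk code base; only the 503-plus-X-Return-To-Sender-By
convention and the overall flow come from the proposal discussed above.

// Rough sketch only: these types do not exist in OpenWhisk, they just
// model the relay / "return to sender" flow described in this thread.
object RelaySketch {

  final case class Activation(action: String, body: String)

  sealed trait RoutingDecision
  case object RunLocally extends RoutingDecision                // we own a warm container for the action
  final case class Relay(owner: String) extends RoutingDecision // forward to the controller that owns one
  case object WaitForContainer extends RoutingDecision          // park until the ContainerManager assigns us one
  case object CompletedRemotely extends RoutingDecision         // the relayed request was answered by the owner

  // Marker header the overloaded owner attaches to the 503 it sends back
  // for relayed requests.
  val ReturnToSenderHeader = "X-Return-To-Sender-By"

  final class Controller(val name: String) {
    private var ownedActions = Set.empty[String] // actions with at least one warm container here
    private var waitingFor   = Set.empty[String] // actions we stopped relaying and now wait on

    def addContainer(action: String): Unit = ownedActions += action

    // Decide what to do with an incoming activation, given a (hypothetical)
    // view of which controller currently owns containers for which action.
    def route(a: Activation, owners: Map[String, String]): RoutingDecision =
      if (ownedActions.contains(a.action)) RunLocally
      else if (waitingFor.contains(a.action)) WaitForContainer
      else owners.get(a.action).map(Relay.apply).getOrElse(WaitForContainer)

    // Handle the response to a request we relayed to the owning controller.
    def onRelayedResponse(a: Activation, status: Int, headers: Map[String, String]): RoutingDecision =
      if (status == 503 && headers.contains(ReturnToSenderHeader)) {
        // The owner is overloaded and has already asked the ContainerManager
        // for more containers; stop relaying and complete this request once
        // we get a container of our own.
        waitingFor += a.action
        WaitForContainer
      } else CompletedRemotely
  }

  def main(args: Array[String]): Unit = {
    val controller1 = new Controller("controller1")
    val owners      = Map("A" -> "controller0") // Controller0 owns ContainerA1

    val req = Activation("A", "{}")
    println(controller1.route(req, owners))                // Relay(controller0)
    println(controller1.onRelayedResponse(req, 503,
      Map(ReturnToSenderHeader -> "controller0")))         // WaitForContainer
    controller1.addContainer("A") // the ContainerManager eventually assigns a container
    println(controller1.route(req, owners))                // RunLocally
  }
}

The design point the sketch tries to capture is that the bounced 503 is
a signal between controllers, not an error surfaced to the client:
Controller1 keeps the client connection open and completes the request
itself once its own container arrives.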
>> I guess the other approach would be to block creating new
>> controllers when there are no containers available, as long as we
>> don't want to overload the existing containers. And keep the
>> overflowing workload in Kafka as well.

> Right, the second possibility is to use a pub/sub (not necessarily
> Kafka) queue between Controllers. Controller0 subscribes to a topic
> for action A because it owns a container for it. Controller1 doesn't
> own a container (yet) and publishes a message as overflow to topic
> A. The wrinkle in this case is that Controller0 can't complete the
> request but needs to send it back to Controller1 (where the HTTP
> connection from the client is open).
>
> Does that make sense?

I was rather thinking about blocking the creation of Controller1 in
this case and responding to the client that the system is overloaded.
But the first approach seems better because it's a pretty common use
case (not reaching the limit of created containers).

Thanks!
Martin

> Cheers,
> Markus
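For completeness, a minimal in-memory sketch of the second, pub/sub
approach discussed above. The toy bus stands in for Kafka or any other
queue, and OverflowBus, OverflowMessage and the controller methods are
invented names; the one thing the sketch takes from the thread is the
wrinkle that the result must travel back to the controller that still
holds the client's open HTTP connection.

// Hypothetical in-memory sketch of the pub/sub overflow approach.
// A real implementation would use a distributed queue (Kafka or similar).
import scala.collection.mutable

object OverflowSketch {

  // Overflow work item: which controller holds the client's open HTTP
  // connection, plus the activation payload to run.
  final case class OverflowMessage(replyTo: String, action: String, payload: String)

  // A toy topic-based bus standing in for Kafka et al.
  final class OverflowBus {
    private val subscribers = mutable.Map.empty[String, List[OverflowMessage => Unit]]

    def subscribe(topic: String)(handler: OverflowMessage => Unit): Unit =
      subscribers.update(topic, handler :: subscribers.getOrElse(topic, Nil))

    def publish(topic: String, msg: OverflowMessage): Unit =
      subscribers.getOrElse(topic, Nil).foreach(_(msg))
  }

  final class Controller(val name: String, bus: OverflowBus) {
    // Controller0 owns a container for action A, so it subscribes to
    // topic "A" and completes overflow work on behalf of other controllers.
    def ownContainerFor(action: String): Unit =
      bus.subscribe(action) { msg =>
        val result = s"activation of ${msg.action} ran on $name"
        // The wrinkle: the HTTP connection is open at msg.replyTo, so the
        // result has to travel back there instead of being answered here.
        sendResultBack(msg.replyTo, result)
      }

    // Controller1 has no container yet, so it publishes the request as overflow.
    def overflow(action: String, payload: String): Unit =
      bus.publish(action, OverflowMessage(replyTo = name, action = action, payload = payload))

    private def sendResultBack(controller: String, result: String): Unit =
      println(s"[$name] returning '$result' to $controller, which responds to the client")
  }

  def main(args: Array[String]): Unit = {
    val bus = new OverflowBus
    val controller0 = new Controller("controller0", bus)
    val controller1 = new Controller("controller1", bus)

    controller0.ownContainerFor("A") // owns ContainerA1, subscribes to topic A
    controller1.overflow("A", "{}")  // no container yet, publishes to topic A
  }
}

Compared with the relay approach, this removes the direct
controller-to-controller hop on the request path but adds the reply-to
hop for the result, which is the wrinkle noted in the thread.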