openwhisk-dev mailing list archives

From Chetan Mehrotra <chetan.mehro...@gmail.com>
Subject Re: Backpressure for slow activation storage in Invoker
Date Mon, 16 Sep 2019 15:32:18 GMT
Hi Tyson,

> in case of logs NOT in db: when queue full, publish non-blocking to "completed-non-blocking"

The approach I had in mind was to completely disable (via configuration)
support for persisting activations from the Invoker and instead handle
all such work via the activation persister service.

Supporting a queue-full based approach is tricky: since we store the
activation after the active ack, it would be hard to tell which
activations in the Kafka completed queue are there because the queue
was full. Otherwise ContainerProxy would first have to place the item
in the queue, check whether it is full, and then add some marker to the
activation sent on the "completed" queue to indicate that it is an
overflow case.
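
To make the marker idea concrete, here is a minimal sketch in Python (not OpenWhisk code). The names `CompletionMessage`, `overflow`, and `complete_activation` are hypothetical and only illustrate the "try the queue first, flag the message on overflow" flow described above.

```python
import queue
from dataclasses import dataclass


@dataclass
class CompletionMessage:
    activation_id: str
    # Hypothetical marker: True means the local store queue was full, so a
    # separate persister service would have to store this activation.
    overflow: bool


def complete_activation(activation_id, store_queue):
    """Try to enqueue the activation for local storage, flag it on overflow."""
    try:
        store_queue.put_nowait(activation_id)
        overflow = False
    except queue.Full:
        overflow = True
    return CompletionMessage(activation_id, overflow)


# A store queue with room for two activations; the third and fourth overflow.
store_queue = queue.Queue(maxsize=2)
messages = [complete_activation(f"act-{i}", store_queue) for i in range(4)]
overflowed = [m.activation_id for m in messages if m.overflow]
```

The cost visible even in this sketch is that every completion message has to carry the extra flag and every consumer has to check it, which is the complexity that disabling Invoker-side persistence would avoid.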

Chetan Mehrotra

On Fri, Sep 13, 2019 at 3:14 AM Tyson Norris <tnorris@adobe.com.invalid> wrote:
>
> I think this sounds good, but want to be clear I understand the consumers and producers involved - is this summary correct?
>
> Controller:
> * consumes "completed-<controllerid>" topic (as usual)
> Invoker:
> * in case of logs NOT in db: when queue full, publish non-blocking to "completed-non-blocking"
> * in case of logs in db: when queue full, publish all to "Activations" topic
> OverflownActivationRecorderService (new service):
> * in case of logs NOT in db: consumes "completed-*" topic(s) AND "completed-non-blocking" topic
> * in case of logs in db: consumes "Activations" topic
>
> Thanks!
> Tyson
>
> On 9/11/19, 4:51 AM, "Chetan Mehrotra" <chetan.mehrotra@gmail.com> wrote:
>
>     As part of implementing this feature I came across support for topic
>     patterns in Kafka [1] [2]. It seems to allow listening to multiple
>     topics by a single consumer or a group of consumers. So after
>     discussing with Sven (thanks Sven!) I came up with the following
>     proposal.
>
>     With this I think we can go back to "Option B1 - Activations via
>     controller topic" and thus subscribe to "completed-.*" pattern.
>
>     This would help by avoiding any extra load on Kafka, as we consume the
>     same activation result messages that are sent to the Controller.
>     However, there are a few caveats:
>
>     1. Currently we send the activation result via Kafka only for blocking calls
>     2. The result that is sent does not contain logs
>
>     So we can possibly have support for 2 modes
>
>     Option CB1 - Existing topic + new topic for non blocking result
>     -------------------
>
>     This mode would be used if the setup does not record the logs in db.
>     In this mode we would add support in Invoker to also send result for
>     non blocking calls to a new "completed-non-blocking" topic and then
>     listen for "completed-.*"
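
As a rough illustration of what a "completed-.*" subscription would pick up, the sketch below applies the same regular expression to a list of topic names using Python's `re` module. It does not use a Kafka client; the topic names are made up for the example, and it assumes the pattern has to match the whole topic name.

```python
import re

# The pattern proposed in the thread for the persister's subscription.
COMPLETED_PATTERN = re.compile(r"completed-.*")

# Hypothetical topic names; only the "completed-" prefix comes from the thread.
topics = [
    "completed-0",
    "completed-1",
    "completed-non-blocking",
    "activations",
    "invoker0",
]


def matching_topics(pattern, topic_names):
    """Return the topics whose full name matches the pattern."""
    return [name for name in topic_names if pattern.fullmatch(name)]


subscribed = matching_topics(COMPLETED_PATTERN, topics)
```

Because "completed-non-blocking" shares the prefix, the one pattern covers both the existing controller topics and the new topic, so the consumer needs no separate subscription for non-blocking results.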
>
>     Option CB2 - New topic + KafkaActivationStore
>     ------------------
>     This mode can be used if the setup stores logs in the db. Here we would
>     have a new KafkaActivationStore which would send the activations to a
>     new "activations" topic
>
>     The ActivationPersister service can support both modes, and the cluster
>     operator can configure it in the required mode.
>
>     Chetan Mehrotra
>     [1] https://doc.akka.io/docs/alpakka-kafka/current/subscription.html#topic-pattern
>     [2] https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#subscribe-java.util.regex.Pattern-org.apache.kafka.clients.consumer.ConsumerRebalanceListener-
>
>     On Mon, Jun 24, 2019 at 11:57 PM Chetan Mehrotra
>     <chetan.mehrotra@gmail.com> wrote:
>     >
>     > > For B1, we can scale out the service as controllers are scaled out, but
>     > > it would be much more complex to manually assign topics.
>     >
>     > Yes, that's what my concern was with B1. So for now I would target the
>     > B2 approach, where we have a dedicated new topic that is consumed by
>     > a new service. If it poses a problem down the line then we can go
>     > for B1.
>     >
>     > Chetan Mehrotra
>     >
>     > On Tue, Jun 25, 2019 at 10:08 AM Dominic Kim <style9595@gmail.com> wrote:
>     > >
>     > > Let me share a few ideas on them.
>     > >
>     > > Regarding option B1, I think it can scale out better than option B2.
>     > > If I understood correctly, scaling out of the service will be highly
>     > > dependent on Kafka.
>     > > Since the number of consumers is limited to the number of partitions, the
>     > > number of service nodes will be also limited to the number of partitions.
>     > >
>     > > So in the case of B2, if we create a new topic with a fixed number of
>     > > partitions, we cannot scale out the service nodes beyond that number.
>     > > At some point, we may need to alter the number of partitions and it's not
>     > > easy in Kafka.
>     > > (Since the activation processing here is asynchronous, we may bear some
>     > > downtime (1~2s) to alter the partitions. Then it would be fine.)
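
The partition limit can be illustrated with a small simulation. This is a simplified round-robin assignment written for this example, not Kafka's actual assignor; it relies only on the rule that, within one consumer group, each partition is handled by a single consumer. The node names are hypothetical.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions over consumers round-robin, one owner per partition."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# A topic with 3 partitions and 5 service nodes in the same consumer group.
assignment = assign_partitions(3, ["node1", "node2", "node3", "node4", "node5"])

# Nodes that received no partition do no work until partitions are added.
idle = [node for node, partitions in assignment.items() if not partitions]
```

With three partitions only three of the five nodes get work, which is why the partition count chosen at topic creation time caps how far the B2 service can scale out.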
>     > >
>     > > In the case of B1, there will be many controller topics with their own
>     > > partitions.
>     > > Since controllers can be scaled out, there will be more topics, and the
>     > > activation service can scale out accordingly.
>     > > But in this case, we need to manually control the topic assignment.
>     > > (Not partition assignment, it will be done by Kafka.)
>     > >
>     > > Let's say we have 3 controller topics with 2 partitions each.
>     > > For HA, it would be great to have at least two nodes.
>     > > At first, both nodes will take care of all three topics.
>     > > Based on the partition assignment plan in Kafka, both nodes will fetch
>     > > activation messages without any duplication.
>     > > As controllers are scaled out, two nodes may not be enough to take care
>     > > of all topics.
>     > > At this point, we need to scale out the service nodes more.
>     > > Then we need to do logical partitioning for topics.
>     > >
>     > > For example, node1 and node2 will take care of topic0 ~ 1, and node3
>     > > and node4 will take care of topic2 ~ 3.
>     > > In this way, we can guarantee the minimum HA and scale out the nodes as
>     > > well.
>     > > Among them, topic partitions will be also assigned by Kafka.
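
The logical partitioning in this example can be sketched as follows. This is only an illustration of the manual topic assignment that B1 would need; the function name and the contiguous chunking strategy are assumptions, not part of any existing OpenWhisk or Kafka API.

```python
def assign_topics_to_groups(topics, nodes, nodes_per_group=2):
    """Split nodes into HA groups and give each group a contiguous topic range."""
    groups = [
        tuple(nodes[i:i + nodes_per_group])
        for i in range(0, len(nodes), nodes_per_group)
    ]
    # Ceiling division so that every topic lands in some group.
    chunk = -(-len(topics) // len(groups))
    return {
        group: topics[index * chunk:(index + 1) * chunk]
        for index, group in enumerate(groups)
    }


plan = assign_topics_to_groups(
    ["topic0", "topic1", "topic2", "topic3"],
    ["node1", "node2", "node3", "node4"],
)
```

Each group of two nodes would then subscribe only to its own topics, and Kafka would balance the partitions of those topics between the two nodes of the group.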
>     > >
>     > > So in short,
>     > > For B1, we can scale out the service as controllers are scaled out, but
>     > > it would be much more complex to manually assign topics.
>     > > And one node may have more than one Kafka consumer.
>     > >
>     > > For B2, scaling might be limited unless we have a big enough number of
>     > > partitions at topic creation time.
>     > > But if we can bear some downtime, this might not be a problem and this
>     > > option will be a lot simpler.
>     > >
>     > > Best regards
>     > > Dominic.
>     > >
>     > >
>     > >
>     > >
>     > > On Mon, Jun 24, 2019 at 6:50 PM, Chetan Mehrotra <chetan.mehrotra@gmail.com> wrote:
>     > >
>     > > > Okie, so we can then aim to add optional support for storing
>     > > > activations via a separate service.
>     > > >
>     > > > Currently we also send the activation result on the respective
>     > > > controller topic. With this change we would also be sending the same
>     > > > activation record on another topic. So we have another choice to make
>     > > >
>     > > > Option B1 - Activations via controller topic
>     > > > --------------------------------------------------------
>     > > >
>     > > > Here we avoid a new topic and instead have a service which listens
>     > > > to all controller topics for activation records. However, that would
>     > > > be tricky to implement and also tricky to scale out, as scaling out
>     > > > such a service by running multiple copies would not be easy in terms
>     > > > of sharding/partitioning.
>     > > >
>     > > > Here the benefit is that we reduce the duplicate writes on Kafka.
>     > > >
>     > > > Option B2 - Introduce a new topic altogether
>     > > > -----------------------------------------------------------
>     > > >
>     > > > We introduce a new topic to which all invokers write the activation
>     > > > records (as is done for user-events). Implementing a new service
>     > > > that reads from a single (possibly partitioned) topic would then be
>     > > > easier.
>     > > >
>     > > > My suggestion is to go for B2 for now.
>     > > >
>     > > > Any feedback on that?
>     > > >
>     > > > Chetan Mehrotra
>     > > >
>     > > > On Fri, Jun 21, 2019 at 11:46 PM Rodric Rabbah <rodric@gmail.com> wrote:
>     > > > >
>     > > > > > Can we handle these in same way as user events? Maybe exactly like
>     > > > > > user events, as in use a single service to process both topics.
>     > > > >
>     > > > > good call - the user events already contains much of the activation
>     > > > > record (if not all modulo the logs)?
>     > > > >
>     > > > > -r
>     > > >
>
>
