pulsar-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] rdhabalia opened a new pull request #2438: Fix: Function assignment can support large number of topics
Date Thu, 23 Aug 2018 23:57:30 GMT
rdhabalia opened a new pull request #2438: Fix: Function assignment can support large number
of topics
URL: https://github.com/apache/incubator-pulsar/pull/2438
   ### Motivation
   Pulsar function assignment has a scalability and performance issue.
   Right now, [SchedulerManager](https://github.com/apache/incubator-pulsar/blob/master/pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/SchedulerManager.java#L154)
publishes all assignments in a single Pulsar message. Each assignment generates about 700 bytes
of payload, and Pulsar limits the message size to around 5MB. If each function runs with
3 instances, it requires about 2KB of payload, so the cluster can only support around 2500
functions.
   Also, assignment events happen frequently in the system: they can be triggered by any
assignment change or worker restart. So, over a period of time, we can expect a large number of
assignment messages stored across many ledgers, and every time a worker restarts, it has to
read all of those very old ledgers from BK, which is something we would definitely like to
avoid.
   We can easily reproduce it by registering a function with parallelism=12000, which will fail
to publish the assignment message.
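
   The figures above can be checked with a few lines of arithmetic. This is only a sketch:
the class and method names are hypothetical and not part of Pulsar, the 700-byte figure is the
approximate per-assignment size quoted above, and "around 5MB" is assumed here to mean
5 * 1024 * 1024 bytes.

```java
// Hypothetical helper for checking the numbers in this description; not Pulsar code.
final class AssignmentCapacity {

    // How many functions fit in one message if every instance adds one assignment.
    static long maxFunctions(long maxMessageBytes, long bytesPerAssignment, int instancesPerFunction) {
        long bytesPerFunction = bytesPerAssignment * instancesPerFunction;
        return maxMessageBytes / bytesPerFunction;
    }

    // Whether the assignments for the given number of instances fit in a single message.
    static boolean fitsInOneMessage(long maxMessageBytes, long bytesPerAssignment, long instanceCount) {
        return bytesPerAssignment * instanceCount <= maxMessageBytes;
    }

    public static void main(String[] args) {
        long limit = 5L * 1024 * 1024; // assumed value for "around 5MB"
        long perAssignment = 700;      // approximate size quoted in the description

        // 3 instances per function: 2100 bytes each, so 2496 functions fit.
        System.out.println(maxFunctions(limit, perAssignment, 3));

        // parallelism=12000: 8,400,000 bytes, which exceeds the limit, so this prints false.
        System.out.println(fitsInOneMessage(limit, perAssignment, 12000));
    }
}
```

   With these assumed inputs a single function with parallelism=12000 needs roughly 8.4MB of
assignments, which is consistent with the publish failure described above.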
   ### Modification
   1. Publish multiple messages (each with a limited number of assignments) so that together
they include all assignments, supporting any number of function assignments.
   2. Ack the messages for old versions of assignments (this requires a separate namespace for
assignments, one that doesn't have infinite retention configured).
   3. The broker deletes ledgers for old assignments, so the assignment-reader doesn't have to
read such ledgers.
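
   The batching idea in step 1 could look roughly like the following. This is a minimal sketch
and not the code in this pull request: `AssignmentBatcher`, `partition`, and `maxPerMessage` are
hypothetical names, and the assignments are treated as opaque list elements rather than the
actual assignment type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split a full assignment list into bounded batches,
// so that each batch can be published as its own message.
final class AssignmentBatcher {

    static <T> List<List<T>> partition(List<T> assignments, int maxPerMessage) {
        if (maxPerMessage <= 0) {
            throw new IllegalArgumentException("maxPerMessage must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < assignments.size(); start += maxPerMessage) {
            int end = Math.min(start + maxPerMessage, assignments.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(assignments.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> assignments = new ArrayList<>();
        for (int i = 0; i < 12000; i++) {
            assignments.add(i);
        }
        // 12000 assignments with at most 1000 per message gives 12 batches.
        System.out.println(partition(assignments, 1000).size());
    }
}
```

   Bounding the number of assignments per message keeps each payload under the message-size
limit no matter how many assignments exist in total; the value of the bound is a tuning choice
that this sketch leaves open.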
   ### Result
   1. Pulsar functions can support any number of functions in the system.
   2. Assignment-manager doesn't read old assignments, so the broker and bookies can avoid
unnecessary reads and dispatching.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

With regards,
Apache Git Services
