kafka-dev mailing list archives

From "Guozhang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (KAFKA-6039) Improve TaskAssignor to be more load balanced
Date Tue, 10 Oct 2017 00:57:00 GMT
Guozhang Wang created KAFKA-6039:

             Summary: Improve TaskAssignor to be more load balanced
                 Key: KAFKA-6039
                 URL: https://issues.apache.org/jira/browse/KAFKA-6039
             Project: Kafka
          Issue Type: Improvement
          Components: streams
            Reporter: Guozhang Wang

Today our task placement may still generate sub-optimal assignments with regard to load balance.
One reason is that it does not account for sub-topologies. For example, say you have an aggregation
that follows a repartition topic: you will end up with two sub-topologies, where the
first one is very light and the second one is computationally heavy with state stores. However,
when we consider their tasks we treat them equally, so in the worst case one client can get
X tasks from the first sub-topology and sit mostly idle, while another client gets X
tasks from the second sub-topology and is busy to death.

One strawman approach to improve this is to try to achieve balance across sub-topologies,
i.e. each client tries to get a similar number of tasks within each sub-topology. However, there
are some more considerations to include (as mentioned in the sub-tasks).
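
To make the strawman concrete, here is a minimal sketch of per-sub-topology balancing. It is
not the existing TaskAssignor code: the function name `assign_balanced`, the representation of
a task as a `(sub_topology_id, partition)` tuple, and the client ids are all assumptions made
for illustration, and the sketch ignores the additional considerations raised in the sub-tasks.

```python
from collections import defaultdict


def assign_balanced(tasks, clients):
    """Deal tasks round-robin, one sub-topology at a time (hypothetical helper).

    tasks:   iterable of (sub_topology_id, partition) tuples (assumed task model)
    clients: non-empty list of client ids
    """
    # Group tasks by the sub-topology they belong to.
    by_sub_topology = defaultdict(list)
    for task in tasks:
        by_sub_topology[task[0]].append(task)

    assignment = {client: [] for client in clients}
    cursor = 0  # carried across sub-topologies so total counts stay even too
    for sub_id in sorted(by_sub_topology):
        for task in sorted(by_sub_topology[sub_id]):
            assignment[clients[cursor % len(clients)]].append(task)
            cursor += 1
    return assignment


# Two sub-topologies (0 = light, 1 = heavy), four partitions each, two clients.
example_tasks = [(sub, part) for sub in (0, 1) for part in range(4)]
example = assign_balanced(example_tasks, ["client-A", "client-B"])
```

Because the tasks of one sub-topology are dealt out consecutively, the per-client task counts
within any sub-topology differ by at most one, so no client ends up holding only the heavy tasks.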
