apex-dev mailing list archives

From "Pramod Immaneni (JIRA)" <j...@apache.org>
Subject [jira] [Created] (APEXCORE-348) Load based stream partitioning
Date Wed, 17 Feb 2016 17:54:18 GMT
Pramod Immaneni created APEXCORE-348:
----------------------------------------

             Summary: Load based stream partitioning
                 Key: APEXCORE-348
                 URL: https://issues.apache.org/jira/browse/APEXCORE-348
             Project: Apache Apex Core
          Issue Type: Improvement
            Reporter: Pramod Immaneni
            Assignee: Pramod Immaneni


There are scenarios where the downstream partitions of an upstream operator do not perform
uniformly, so overall performance is sub-optimal and dictated by the slowest partitions.
The cause can be data related, such as some partitions receiving more data to process than
others, or environment related, such as some partitions running slower than others because
they are on heavily loaded nodes.

A solution based on functionality currently available in the engine would be to write a StreamCodec
implementation that distributes data among the partitions so that each partition receives
a similar amount of data to process. We should consider adding StreamCodecs like these to the
library; however, they do not solve the problem when it is environment related.
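
As a rough illustration of the even-distribution idea, here is a minimal sketch in plain Java. The class RoundRobinPartitioner and its nextPartition method are hypothetical names invented for this example and are not part of the Apex API; the sketch only shows the selection logic that such a StreamCodec could use when deciding a tuple's partition, and it assumes the downstream operator does not need tuples with the same key to land on the same partition.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not an Apex class): hands out partition indexes in
 * round-robin order so every partition receives a similar number of tuples.
 */
public class RoundRobinPartitioner {
    private final int partitionCount;
    private final AtomicLong counter = new AtomicLong();

    public RoundRobinPartitioner(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
    }

    /** Returns the partition index for the next tuple, cycling 0..partitionCount-1. */
    public int nextPartition() {
        // floorMod keeps the result non-negative even if the counter ever wraps around.
        return (int) Math.floorMod(counter.getAndIncrement(), (long) partitionCount);
    }
}
```

Because the choice ignores tuple content, this evens out data skew but, as noted above, does nothing for partitions that are slow for environmental reasons.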

For that, a better and more comprehensive approach would be to look at how data is being consumed
by the downstream partitions from the BufferServer and use that information to decide how
to send future data. If some partitions are behind others in consuming data, then data
can be directed to the other partitions. One way to do this would be to relay this
statistical and positional information from the BufferServer to the upstream publishers. The
publishers can then use it, for example by making it available to StreamCodecs to influence
the destination of future data.
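
One possible shape for the consumption-based approach is sketched below in plain Java. Everything in it is an assumption made for illustration: the class BacklogAwareChooser, the reportConsumed callback standing in for feedback relayed from the BufferServer, and the use of simple tuple counts as positions. None of these exist in the engine today. The sketch sends each new tuple to the partition with the smallest backlog (published minus consumed).

```java
/**
 * Hypothetical sketch (not an Apex class): picks the partition that is
 * furthest ahead in consuming its data, based on relayed positions.
 */
public class BacklogAwareChooser {
    private final long[] published; // tuples sent to each partition so far
    private final long[] consumed;  // last consumed position reported per partition

    public BacklogAwareChooser(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        published = new long[partitionCount];
        consumed = new long[partitionCount];
    }

    /** Assumed feedback hook: records how far a partition has consumed. */
    public synchronized void reportConsumed(int partition, long position) {
        // Ignore stale reports so the consumed position never moves backwards.
        consumed[partition] = Math.max(consumed[partition], position);
    }

    /** Outstanding (published but not yet consumed) tuples for a partition. */
    public synchronized long backlog(int partition) {
        return published[partition] - consumed[partition];
    }

    /** Chooses the partition with the smallest backlog; ties go to the lowest index. */
    public synchronized int choose() {
        int best = 0;
        long bestBacklog = published[0] - consumed[0];
        for (int i = 1; i < published.length; i++) {
            long b = published[i] - consumed[i];
            if (b < bestBacklog) {
                best = i;
                bestBacklog = b;
            }
        }
        published[best]++; // account for the tuple about to be sent
        return best;
    }
}
```

A partition on a heavily loaded node reports consumption more slowly, so its backlog grows and it is chosen less often. As with the round-robin sketch, this only suits downstream logic that does not depend on key affinity.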



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
