kafka-dev mailing list archives

From "Mateusz Owczarek (JIRA)" <j...@apache.org>
Subject [jira] [Created] (KAFKA-7669) Stream topology definition is not prune to the ordering changes
Date Thu, 22 Nov 2018 16:37:00 GMT
Mateusz Owczarek created KAFKA-7669:
---------------------------------------

             Summary: Stream topology definition is not prune to the ordering changes
                 Key: KAFKA-7669
                 URL: https://issues.apache.org/jira/browse/KAFKA-7669
             Project: Kafka
          Issue Type: Bug
          Components: streams
    Affects Versions: 2.0.0
            Reporter: Mateusz Owczarek


It seems that if the user does not guarantee the order in which the stream topology is defined,
they may end up with multiple stream branches sharing the same internal changelog (and
repartition, if created) topic.

Let's assume:
{code:scala}
val initialStream = new StreamsBuilder().stream(sth)
val someStrings = (1 to 10).map(_.toString)
// Iteration order of this Map is not guaranteed
val notGuaranteedOrderOfStreams: Map[String, KStream[...]] =
  someStrings.map(s => s -> initialStream.filter(...)).toMap
{code}

When the user now defines common aggregation logic for each stream in notGuaranteedOrderOfStreams
and runs multiple instances of the application, the KSTREAM-AGGREGATE-STATE-STORE topic names
will not be unique per branch: the same topic will contain results of different streams from
the notGuaranteedOrderOfStreams map.
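
To illustrate the suspected mechanism, here is a minimal, self-contained sketch. It does not use Kafka Streams itself. It assumes, as far as I understand the naming scheme, that generated names carry a zero-padded index taken from a single counter advanced in definition order, and that a changelog topic is named <application.id>-<store name>-changelog. NameGenerator, changelogTopics and the "my-app" id are hypothetical names for this model, and the real index values will differ because more nodes consume indices in a real topology.

```scala
import java.util.concurrent.atomic.AtomicInteger

object TopologyNamingSketch {

  // Simplified stand-in for a topology builder (hypothetical, not Kafka code):
  // one shared counter numbers every generated name in definition order.
  final class NameGenerator {
    private val index = new AtomicInteger(0)
    def next(prefix: String): String =
      prefix + "%010d".format(index.getAndIncrement())
  }

  // Defines one filter plus one aggregation per branch, in the given order,
  // and returns branch key -> changelog topic name.
  def changelogTopics(appId: String, branchOrder: Seq[String]): Map[String, String] = {
    val names = new NameGenerator
    branchOrder.map { key =>
      names.next("KSTREAM-FILTER-") // the filter node consumes an index too
      val store = names.next("KSTREAM-AGGREGATE-STATE-STORE-")
      key -> s"$appId-$store-changelog"
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    // Two instances define the same branches in a different order.
    val instanceA = changelogTopics("my-app", Seq("1", "2", "3"))
    val instanceB = changelogTopics("my-app", Seq("3", "1", "2"))

    // Both instances generate exactly the same set of topic names ...
    println(instanceA.values.toSet == instanceB.values.toSet)
    // ... but the same topic is bound to a different branch on each instance.
    println(instanceA("1"))
    println(instanceB("3"))
  }
}
```

In this model the index depends only on the position in the definition order, not on the branch. That would explain both the shared topics and why the ids line up across instances that define the branches in a different order.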

All of this happens without a single warning that the topology (or just the order of the
topology definition) differs between instances of the Kafka Streams application.

Also, I am concerned that the ids in "KSTREAM-AGGREGATE-STATE-STORE-id-changelog" match so well
across the different application instances (and different topologies).
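
Until this is addressed, a possible user-side workaround is to fix the definition order explicitly before building the topology. The sketch below is plain Scala and only an assumption about what "guaranteeing the order" could look like: it sorts the branches by key so every instance iterates them identically. DeterministicOrderSketch and inDefinitionOrder are hypothetical names, and the string values stand in for the KStream branches.

```scala
object DeterministicOrderSketch {

  // Returns the branches in a stable order (sorted by key), regardless of
  // how the underlying Map happens to iterate.
  def inDefinitionOrder[V](branches: Map[String, V]): Seq[(String, V)] =
    branches.toSeq.sortBy(_._1)

  def main(args: Array[String]): Unit = {
    val keys = (1 to 10).map(_.toString)

    // Two maps with the same content, built in a different insertion order.
    val built1 = keys.map(k => k -> s"branch-$k").toMap
    val built2 = keys.reverse.map(k => k -> s"branch-$k").toMap

    // After sorting, both yield the same sequence of branches.
    println(inDefinitionOrder(built1) == inDefinitionOrder(built2))
  }
}
```

The aggregation logic would then be applied while iterating over the sorted sequence rather than over the Map directly.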

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
