kafka-jira mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andy Chambers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5836) Kafka Streams - API for specifying internal stream name on join
Date Mon, 16 Oct 2017 18:27:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206353#comment-16206353

Andy Chambers commented on KAFKA-5836:

Ah sorry. Not KS of course but the confluent AvroSerializer which registers the schema as
"[topic-name]-value". So to be completely concrete, the example below represents some pseudo-code
for building a couple of small topologies. It demonstrates the scenario of wanting to evolve
from v1 to v2 of some app. It is assumed that all topics use the confluent schema registry's
builtin KafkaAvroSerializer for serialization. And that backwards-compatibility checking is
enabled on the schema registry. Topic "a" expects messages from schema "a", topic "b" expects
messages from schema "b" etc...

In app-v1:
  a = topic-in(a)
  b = topic-in(b)
  result = window-join(a, b) # implicitly creates topics for each of the input topics named
                                           # "${applicationId}-storeName-changelog" and because
                                           # using avro+schema-registry, we get corresponding
                                           # of the same name in the schema registry. Lets
say that in this
                                           # case, we get implicit topics where
                                           #    storeName=window-join-0001-changelog (holds
topic a)
                                           #    storeName=window-join-0002-changelog (holds
topic b)

In app-v2:
  a = topic-in(a)
  b = topic-in(b)

  # add some unrelated stuff to the topology but since it's a window join, in the same position
  # as the previous window the internal topics are the same but the topics involved in this
  # are different. So now we have
  #    storeName=window-join-0001-changelog (holds topic c)
  #    storeName=window-join-0002-changelog (holds topic d)
  # When the KafkaAvroSerializer tries to serialize messages destined for these internal topics
  # it will fail because the schema registry is expecting messages adhering to schema a/b
  # will actually get messages matching schema c/d. The serializer will attempt to register
  # the "new" schemas and fail because they are not "backward compatible" with a/b.
  window-join(topic-in(c), topic-in(d))

  result = window-join(a, b) 

> Kafka Streams - API for specifying internal stream name on join
> ---------------------------------------------------------------
>                 Key: KAFKA-5836
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5836
>             Project: Kafka
>          Issue Type: New Feature
>          Components: streams
>    Affects Versions:
>            Reporter: Lovro Pandžić
>              Labels: api, needs-kip
> Automatic topic name can be problematic in case of streams operation change/migration.
> I'd like to be able to specify name of an internal topic so I can avoid creation of new
stream and data "loss" when changing the Stream building.

This message was sent by Atlassian JIRA

View raw message