kafka-jira mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andy Chambers (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (KAFKA-5836) Kafka Streams - API for specifying internal stream name on join
Date Mon, 16 Oct 2017 18:28:01 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206353#comment-16206353
] 

Andy Chambers edited comment on KAFKA-5836 at 10/16/17 6:27 PM:
----------------------------------------------------------------

Ah sorry. Not KS of course but the confluent AvroSerializer which registers the schema as
"[topic-name]-value". So to be completely concrete, the example below represents some pseudo-code
for building a couple of small topologies. It demonstrates the scenario of wanting to evolve
from v1 to v2 of some app. It is assumed that all topics use the confluent schema registry's
builtin KafkaAvroSerializer for serialization. And that backwards-compatibility checking is
enabled on the schema registry. Topic "a" expects messages from schema "a", topic "b" expects
messages from schema "b" etc...

In app-v1:
  a = stream-of(a)
  b = stream-of(b)
  result = window-join(a, b) # implicitly creates topics for each of the input topics named
                                           # "${applicationId}-storeName-changelog" and because
we're
                                           # using avro+schema-registry, we get corresponding
subjects
                                           # of the same name in the schema registry. Lets
say that in this
                                           # case, we get implicit topics where
                                           #    storeName=window-join-0001-changelog (holds
topic a)
                                           #    storeName=window-join-0002-changelog (holds
topic b)

In app-v2:
  a = stream-of(a)
  b = stream-of(b)

  # add some unrelated stuff to the topology but since it's a window join, in the same position
  # as the previous window the internal topics are the same but the topics involved in this
join
  # are different. So now we have
  #    storeName=window-join-0001-changelog (holds topic c)
  #    storeName=window-join-0002-changelog (holds topic d)
  # When the KafkaAvroSerializer tries to serialize messages destined for these internal topics
  # it will fail because the schema registry is expecting messages adhering to schema a/b
but
  # will actually get messages matching schema c/d. The serializer will attempt to register
the
  # the "new" schemas and fail because they are not "backward compatible" with a/b.
  window-join(stream-of(c), stream-of(d))
     .foreach(spam-logs)

  result = window-join(a, b) 


was (Author: andy.chambers@fundingcircle.com):
Ah sorry. Not KS of course but the confluent AvroSerializer which registers the schema as
"[topic-name]-value". So to be completely concrete, the example below represents some pseudo-code
for building a couple of small topologies. It demonstrates the scenario of wanting to evolve
from v1 to v2 of some app. It is assumed that all topics use the confluent schema registry's
builtin KafkaAvroSerializer for serialization. And that backwards-compatibility checking is
enabled on the schema registry. Topic "a" expects messages from schema "a", topic "b" expects
messages from schema "b" etc...

In app-v1:
  a = topic-in(a)
  b = topic-in(b)
  result = window-join(a, b) # implicitly creates topics for each of the input topics named
                                           # "${applicationId}-storeName-changelog" and because
we're
                                           # using avro+schema-registry, we get corresponding
subjects
                                           # of the same name in the schema registry. Lets
say that in this
                                           # case, we get implicit topics where
                                           #    storeName=window-join-0001-changelog (holds
topic a)
                                           #    storeName=window-join-0002-changelog (holds
topic b)

In app-v2:
  a = topic-in(a)
  b = topic-in(b)

  # add some unrelated stuff to the topology but since it's a window join, in the same position
  # as the previous window the internal topics are the same but the topics involved in this
join
  # are different. So now we have
  #    storeName=window-join-0001-changelog (holds topic c)
  #    storeName=window-join-0002-changelog (holds topic d)
  # When the KafkaAvroSerializer tries to serialize messages destined for these internal topics
  # it will fail because the schema registry is expecting messages adhering to schema a/b
but
  # will actually get messages matching schema c/d. The serializer will attempt to register
the
  # the "new" schemas and fail because they are not "backward compatible" with a/b.
  window-join(topic-in(c), topic-in(d))
     .foreach(spam-logs)

  result = window-join(a, b) 

> Kafka Streams - API for specifying internal stream name on join
> ---------------------------------------------------------------
>
>                 Key: KAFKA-5836
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5836
>             Project: Kafka
>          Issue Type: New Feature
>          Components: streams
>    Affects Versions: 0.11.0.0
>            Reporter: Lovro Pandžić
>              Labels: api, needs-kip
>
> Automatic topic name can be problematic in case of streams operation change/migration.
> I'd like to be able to specify name of an internal topic so I can avoid creation of new
stream and data "loss" when changing the Stream building.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message