kafka-jira mailing list archives

From "Matthias J. Sax (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-6437) Streams does not warn about missing input topics, but hangs
Date Wed, 28 Mar 2018 22:33:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418207#comment-16418207 ]

Matthias J. Sax commented on KAFKA-6437:

I mean that some topics are available but others are not (i.e., if there are multiple input topics).
There are cases in which Kafka Streams would not fail but would just process the available topics.

I agree that KAFKA-6520 is different; however, it's somewhat related (-> state "RUNNING"
is confusing and not really appropriate). Just wanted to point out the relationship. Not sure
if we should introduce DISCONNECTED and IDLE or just one state for both. I mentioned it to
get a "global picture" only.

> Streams does not warn about missing input topics, but hangs
> -----------------------------------------------------------
>                 Key: KAFKA-6437
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6437
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 1.0.0
>         Environment: Single client on single node broker
>            Reporter: Chris Schwarzfischer
>            Assignee: Mariam John
>            Priority: Minor
>              Labels: newbie
> *Case*
> Streams application with two input topics being used for a left join.
> When the left side topic is missing upon starting the streams application, it hangs "in the middle" of the topology (at …00009, see below). Only parts of the intermediate topics are created (up to …00009).
> When the missing input topic is created, the streams application resumes processing.
> {noformat}
> Topology:
> StreamsTask taskId: 2_0
> 	ProcessorTopology:
> 		KSTREAM-SOURCE-0000000011:
> 			topics:		[mystreams_app-KTABLE-AGGREGATE-STATE-STORE-0000000009-repartition]
> 			children:	[KTABLE-AGGREGATE-0000000012]
> 		KTABLE-AGGREGATE-0000000012:
> 			states:		[KTABLE-AGGREGATE-STATE-STORE-0000000009]
> 			children:	[KTABLE-TOSTREAM-0000000020]
> 		KTABLE-TOSTREAM-0000000020:
> 			children:	[KSTREAM-SINK-0000000021]
> 		KSTREAM-SINK-0000000021:
> 			topic:		data_udr_month_customer_aggregration
> 		KSTREAM-SOURCE-0000000017:
> 			topics:		[mystreams_app-KSTREAM-MAP-0000000014-repartition]
> 			children:	[KSTREAM-LEFTJOIN-0000000018]
> 		KSTREAM-LEFTJOIN-0000000018:
> 			states:		[KTABLE-AGGREGATE-STATE-STORE-0000000009]
> 			children:	[KSTREAM-SINK-0000000019]
> 		KSTREAM-SINK-0000000019:
> 			topic:		data_UDR_joined
> Partitions [mystreams_app-KSTREAM-MAP-0000000014-repartition-0, mystreams_app-KTABLE-AGGREGATE-STATE-STORE-0000000009-repartition-0]
> {noformat}
> *Why this matters*
> The application does quite a lot of preprocessing before joining with the missing input topic. This preprocessing won't happen without the topic, creating a huge backlog of data.
> *Fix*
> Issue a `warn` or `error` level message at start to inform about the missing topic and its consequences.
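
As a rough illustration of the requested startup check (not the actual patch for this ticket), an application could compare the topics it requires against the topics the broker reports and log a warning for any that are absent. The class name `MissingTopicCheck`, the method `findMissing`, and the topic names below are hypothetical. In a real application the set of existing topics could be fetched with `AdminClient.listTopics().names().get()`; here it is passed in directly so the sketch runs without a broker.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Kafka Streams.
class MissingTopicCheck {

    /** Returns the required topics that are absent from existingTopics, sorted by name. */
    static Set<String> findMissing(Collection<String> requiredTopics, Set<String> existingTopics) {
        Set<String> missing = new TreeSet<>(requiredTopics);
        missing.removeAll(existingTopics);
        return missing;
    }

    public static void main(String[] args) {
        // Assumed topic names for a two-input left join; replace with the real ones.
        Set<String> required = Set.of("left-input", "right-input");
        // Assumption: in practice this set would come from the broker,
        // e.g. via AdminClient.listTopics().names().get().
        Set<String> existing = Set.of("right-input");

        Set<String> missing = findMissing(required, existing);
        if (!missing.isEmpty()) {
            // Warn instead of failing: Streams may still process the available topics.
            System.err.println("WARN: input topics missing at startup: " + missing
                    + "; processing that depends on them will not progress until they exist.");
        }
    }
}
```

Warning rather than throwing matches the partial-availability case from the comment above, where some input topics exist and the application can still make progress on those.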

This message was sent by Atlassian JIRA
