kafka-dev mailing list archives

From "Ewen Cheslack-Postava (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4556) unordered messages when multiple topics are combined in single topic through stream
Date Tue, 10 Jan 2017 06:41:58 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814086#comment-15814086
] 

Ewen Cheslack-Postava commented on KAFKA-4556:
----------------------------------------------

You are seeing inconsistent timestamp ordering because ordering is only guaranteed
*within a topic partition and per producer*. In your topology, even if you have only 1 partition
for Topic1 and Topic2, those two topics are independent. The producer at the top produces
data into each. That data may go to completely different brokers (and delivery of that data
could potentially be delayed, e.g. if the leaders of the topic partitions are failing over
or there is a network partition).

Downstream, the consumer that is reading from both topics (each still just 1 partition) may
see data arrive from the two topics in different orders. It could be due to when the data
finally arrives at the broker from the producers; it could be due to delays in the network
between the consumer and the different brokers handling the different topic partitions; it
could be due to a network partition that temporarily makes one of the topics unreachable by
that specific consumer. Regardless, while the data within Topic1 (assuming a single partition)
will be seen in order and the data within Topic2 (again, assuming a single partition) will
be seen in order, the data across the two topics could be arbitrarily interleaved.
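
To make the guarantee concrete, here is a minimal Python sketch of the situation described above. It is a simplified model, not Kafka client code: the `interleave` helper, the record tuples, and the timestamps are all made up for illustration. It merges two single-partition logs in an arbitrary order and shows that each topic's own order survives while the combined timestamp order need not.

```python
import random


def interleave(topic1, topic2, rng):
    """Merge two logs in an arbitrary order, preserving the order within each.

    Hypothetical helper: it models a consumer that receives records from two
    independent topic partitions with unpredictable relative timing.
    """
    merged = []
    i = j = 0
    while i < len(topic1) or j < len(topic2):
        # Take from topic1 if topic2 is exhausted, otherwise flip a coin
        # (as long as topic1 still has records left).
        take_first = j >= len(topic2) or (i < len(topic1) and rng.random() < 0.5)
        if take_first:
            merged.append(topic1[i])
            i += 1
        else:
            merged.append(topic2[j])
            j += 1
    return merged


# Records are (topic, timestamp_ms) pairs; the values are illustrative only.
topic1 = [("t1", 100), ("t1", 101), ("t1", 102)]
topic2 = [("t2", 100), ("t2", 101), ("t2", 102)]

merged = interleave(topic1, topic2, random.Random(7))

# Per-topic order is always preserved ...
only_t1 = [r for r in merged if r[0] == "t1"]
only_t2 = [r for r in merged if r[0] == "t2"]

# ... but the merged timestamps may or may not be sorted, depending on arrival.
timestamps = [ts for _topic, ts in merged]
print(merged)
print("sorted by timestamp:", timestamps == sorted(timestamps))
```

Running it with different seeds gives different interleavings, which is the point: nothing in the model (or in Kafka) ties the relative order of the two topics together.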

This is one of the things that makes the streams API and its support for proper windowing
using event time and handling late-arriving data powerful -- despite these arbitrary interleavings,
it will be able to compute a correct "final" value for aggregates/joins/etc over a window.
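
As a rough illustration of why event-time windowing is insensitive to arrival order, the following Python sketch counts records per tumbling window using each record's own timestamp. It is a simplified model and not the Kafka Streams API: `windowed_counts`, the 10 ms window size, and the sample records are assumptions made up for this example, and it ignores details such as how long a window stays open for late data.

```python
from collections import defaultdict


def windowed_counts(records, window_ms):
    """Count records per tumbling event-time window, keyed by window start.

    Hypothetical helper: the window is chosen from the record's timestamp,
    not from the position at which the record happened to arrive.
    """
    counts = defaultdict(int)
    for _topic, ts in records:
        window_start = (ts // window_ms) * window_ms
        counts[window_start] += 1
    return dict(counts)


# The same four records, seen in two different arrival orders.
arrival_a = [("t1", 100), ("t2", 100), ("t1", 105), ("t2", 112)]
arrival_b = [("t2", 100), ("t2", 112), ("t1", 100), ("t1", 105)]

result_a = windowed_counts(arrival_a, window_ms=10)
result_b = windowed_counts(arrival_b, window_ms=10)

print(result_a)              # {100: 3, 110: 1}
print(result_a == result_b)  # True
```

Because the aggregate depends only on which window each timestamp falls into, both arrival orders produce the same per-window result.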

> unordered messages when multiple topics are combined in single topic through stream
> -----------------------------------------------------------------------------------
>
>                 Key: KAFKA-4556
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4556
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer, producer , streams
>    Affects Versions: 0.10.0.1
>            Reporter: Savdeep Singh
>         Attachments: stream topology.png
>
>
> When binding a builder with multiple topics, the single resultant topic has an unordered set of messages.
> This issue is at the millisecond level: it occurs when messages with the same millisecond timestamp are added to the topics.
> Scenario: 1 producer (p1), 2 topics (t1 and t2); streams pick from these 2 topics and save into the resulting t3 topic; 4 partitions of t3 and 4 consumers of the 4 partitions.
> Case: When p1 adds messages with the same millisecond timestamp into t1 and t2, the stream combines them to form t3. When t3 is consumed by a consumer, messages with the same millisecond timestamp appear in a different order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
