kafka-dev mailing list archives

From "Apurva Mehta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5396) Transactions System Test: Consumer reading uninterrupted from beginning of log can read the same message multiple times.
Date Wed, 07 Jun 2017 19:52:18 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041501#comment-16041501 ]

Apurva Mehta commented on KAFKA-5396:
-------------------------------------

Yes. This is a test issue, so it isn't a blocker for the release. However, since it will affect
test stability, we should merge the fix into the 0.11.0.0 branch.

> Transactions System Test: Consumer reading uninterrupted from beginning of log can read
the same message multiple times.
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-5396
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5396
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Apurva Mehta
>             Fix For: 0.11.0.0
>
>         Attachments: KAFKA-5396.tar.gz
>
>
> I noticed this when running the transactions system test with hard broker bounces. We
have a consumer in READ_COMMITTED mode reading from the tail of the log as the writes are
appended.
> This test has failed once because the concurrent consumer returned duplicate data. The
actual log has no duplicates, so the problem is in the consumer. 
> One of the duplicate values is '0', and is at offset 250 in output-topic-1. The first
time it is read, we see the following.
> {noformat}
> [2017-06-07 05:50:34,601] TRACE Returning fetched records at offset 0 for assigned partition
output-topic-0 and update position to 250 (org.apache.kafka.clients.consumer.internals.Fetcher)
> [2017-06-07 05:50:34,602] TRACE Preparing to read 2967 bytes of data for partition output-topic-1
with offset 250 (org.apache.kafka.clients.consumer.internals.Fetcher)
> [2017-06-07 05:50:34,602] TRACE Updating high watermark for partition output-topic-1
to 502 (org.apache.kafka.clients.consumer.internals.Fetcher)
> [2017-06-07 05:50:34,613] TRACE Returning fetched records at offset 250 for assigned
partition output-topic-1 and update position to 500 (org.apache.kafka.clients.consumer.internals.Fetcher)
> {noformat}
> The next time it is read, we see this:
> {noformat}
> [2017-06-07 05:51:36,386] TRACE Preparing to read 169858 bytes of data for partition
output-topic-1 with offset 0 (org.apache.kafka.clients.consumer.internals.Fetcher)
> [2017-06-07 05:51:36,389] TRACE Updating high watermark for partition output-topic-1
to 13053 (org.apache.kafka.clients.consumer.internals.Fetcher)
> [2017-06-07 05:51:36,391] TRACE Returning fetched records at offset 0 for assigned partition
output-topic-1 and update position to 500 (org.apache.kafka.clients.consumer.internals.Fetcher)
> {noformat}
> For some reason, the fetcher re-sent the data from offset 0, and reset the position to 500.
> This is the plain consumer doing 'poll' in a loop until it is killed. So this position
reset is puzzling. 
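
For reference, the duplicate check the system test trips on is conceptually simple: the verifier only has to remember which (partition, offset) pairs the consumer has already returned. Here is a minimal sketch of that idea in Python (hypothetical names and simplified offsets; this is not the actual ducktape test code):

```python
# Hypothetical sketch of the duplicate detection described above: a
# verifier flags a failure when the consumer returns the same
# (partition, offset) pair more than once.

def find_duplicates(consumed):
    """consumed: iterable of (partition, offset) pairs in consumption order.

    Returns the pairs that were returned more than once, in the order
    their repeats were seen."""
    seen = set()
    duplicates = []
    for record in consumed:
        if record in seen:
            duplicates.append(record)
        else:
            seen.add(record)
    return duplicates


# Simplified model of the failure in this ticket: the consumer reads
# offsets 0..499 of output-topic-1 once, then the fetcher restarts from
# offset 0 and returns the same range again, so every offset in that
# range is flagged as a duplicate.
consumed = [("output-topic-1", o) for o in range(500)]   # first pass
consumed += [("output-topic-1", o) for o in range(500)]  # unexpected re-read
print(len(find_duplicates(consumed)))  # 500
```

Since the log itself has no duplicates, a check like this isolates the bug to the consumer side: the re-delivered range shows up only in what `poll` returned, not in the partition data.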



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
