kafka-dev mailing list archives

From "Liquan Pei (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (KAFKA-2894) WorkerSinkTask doesn't handle rewinding offsets on rebalance
Date Wed, 01 Jun 2016 04:19:12 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liquan Pei reassigned KAFKA-2894:

    Assignee: Liquan Pei  (was: Ewen Cheslack-Postava)

> WorkerSinkTask doesn't handle rewinding offsets on rebalance
> ------------------------------------------------------------
>                 Key: KAFKA-2894
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2894
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions:
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Liquan Pei
> rewind() is only invoked at the beginning of each poll(). This means that if a rebalance
> occurs during the poll, it's possible to receive data that doesn't match a request to change
> offsets made during the rebalance. I think the consumer will hold on to consumer data across
> the rebalance if it is reassigned the same offset, so there may already be data ready to be
> delivered. Additionally, we may already have data in an incomplete messageBatch that should
> be discarded when the rewind is requested.
> While connectors that care about this (i.e. ones that manage their own offsets) can handle
> this correctly by tracking the offsets they're expecting to see, it's a hassle, error-prone,
> and pretty unintuitive.
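
For illustration only, the connector-side workaround described above (tracking the offsets the connector expects to see) might look roughly like the sketch below. The class ExpectedOffsetTracker and its methods seek() and accept() are hypothetical names, not part of Kafka Connect, and the sketch assumes offsets within a partition are contiguous, which may not hold for every topic (for example, compacted ones).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Kafka Connect: remembers the next offset a
// sink connector expects for each partition and rejects records that do not match.
class ExpectedOffsetTracker {
    // Partition name (e.g. "topic-0") -> next offset the connector expects to see.
    private final Map<String, Long> expected = new HashMap<>();

    // Record a rewind request, for example one issued during a rebalance.
    void seek(String partition, long offset) {
        expected.put(partition, offset);
    }

    // Returns true if the record should be processed, false if it should be
    // dropped because it does not match the offset the connector asked for.
    boolean accept(String partition, long offset) {
        Long want = expected.get(partition);
        if (want != null && offset != want) {
            // Likely data delivered before the rewind took effect.
            return false;
        }
        // Assumes contiguous offsets: the next record should be offset + 1.
        expected.put(partition, offset + 1);
        return true;
    }
}
```

A connector using something like this would call seek() whenever it requests a rewind and accept() for each delivered record, skipping any record for which accept() returns false. Having to write and maintain this bookkeeping in every such connector is the hassle the description refers to.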

This message was sent by Atlassian JIRA
