kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ewen Cheslack-Postava (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KAFKA-2894) WorkerSinkTask doesn't handle rewinding offsets on rebalance
Date Wed, 15 Jun 2016 17:24:09 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Ewen Cheslack-Postava updated KAFKA-2894:
         Priority: Blocker  (was: Major)
    Fix Version/s:

More generally, I think we need to handle cases like rewind() after *any* connector methods
are invoked. The rewind() in poll() handles Connector.start() and Connector.put(), but we
also need to handle the rebalance callbacks (where we can ignore Connector.close() and only
do this after Connector.open()) and Connector.flush().

> WorkerSinkTask doesn't handle rewinding offsets on rebalance
> ------------------------------------------------------------
>                 Key: KAFKA-2894
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2894
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions:
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Liquan Pei
>            Priority: Blocker
>             Fix For:
> rewind() is only invoked at the beginning of each poll(). This means that if a rebalance
occurs in the poll, it's feasible to get data that doesn't match a request to change offsets
during the rebalance. I think the consumer will hold on to consumer data across the rebalance
if it is reassigned the same offset, so there may already be data ready to be delivered. Additionally
we may already have data in an incomplete messageBatch that should be discarded when the rewind
is requested.
> While connectors that care about this (i.e. ones that manage their own offsets) can handle
this correctly by tracking the offsets they're expecting to see, it's a hassle, error prone,
an pretty unintuitive.

This message was sent by Atlassian JIRA

View raw message