kafka-dev mailing list archives

From "Chris Cranford (JIRA)" <j...@apache.org>
Subject [jira] [Created] (KAFKA-8665) WorkerSourceTask race condition when rebalance occurs before task has started
Date Mon, 15 Jul 2019 16:42:00 GMT
Chris Cranford created KAFKA-8665:

             Summary: WorkerSourceTask race condition when rebalance occurs before task has started
                 Key: KAFKA-8665
                 URL: https://issues.apache.org/jira/browse/KAFKA-8665
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
    Affects Versions: 2.2.0
            Reporter: Chris Cranford

In our project we have several {{SourceTask}} implementations that perform a set of sequential
steps in the {{SourceTask#start}} call.  It's possible that during these sequential operations,
a rebalance occurs which leads to the situation where the {{WorkerSourceTask}} transitions
to the state where {{startedShutdownBeforeStartCompleted=true}} and then the runtime starts
a brand new task.

This is mostly a problem around specific named resources that are registered when a call to
{{SourceTask#start}} happens and those same resources are unregistered later when the call
to {{SourceTask#stop}} occurs. 

For us specifically, this is a problem with JMX resource registration/unregistration.  We
register those beans at the end of the call to {{SourceTask#start}} and unregister in the
call to {{stop}}.  Due to the order of start/stop pairs combined with where a rebalance is
triggered, this leads to the following sequence:
 # Register JMX beans when SourceTask A1 is started.
 # Register JMX beans when SourceTask A2 is started by the rebalance.
 ## The JMX beans fail to register because they are already registered.
 # SourceTask A1 finally stops, which triggers unregistering the JMX beans.
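
The sequence above can be reproduced outside of Kafka Connect with the plain JDK JMX API. The following is only a minimal sketch, not the connector's actual code: the class {{JmxRebalanceSketch}}, the {{TaskMetrics}} bean, the helper methods and the {{ObjectName}} are all hypothetical names chosen for illustration, and the two tasks are simulated by sequential calls rather than by a real rebalance.

```java
import javax.management.InstanceAlreadyExistsException;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;

public class JmxRebalanceSketch {

    // Standard MBean pair: the interface must be public and named <Class>MBean.
    public interface TaskMetricsMBean {
        String getTaskId();
    }

    public static class TaskMetrics implements TaskMetricsMBean {
        private final String taskId;
        public TaskMetrics(String taskId) { this.taskId = taskId; }
        @Override public String getTaskId() { return taskId; }
    }

    // Stand-in for the end of SourceTask#start: register under a fixed name.
    // Returns false if a bean with that name is already registered.
    static boolean start(MBeanServer server, ObjectName name, String taskId) throws Exception {
        try {
            server.registerMBean(new TaskMetrics(taskId), name);
            return true;
        } catch (InstanceAlreadyExistsException e) {
            return false;
        }
    }

    // Stand-in for SourceTask#stop: unregister by name, regardless of which
    // task instance originally registered the bean.
    static void stop(MBeanServer server, ObjectName name) throws Exception {
        if (server.isRegistered(name)) {
            server.unregisterMBean(name);
        }
    }

    // Returns { A1 registered, A2 registered, bean still registered after A1 stops }.
    public static boolean[] simulate() {
        try {
            // A private MBeanServer keeps the sketch away from the platform server.
            MBeanServer server = MBeanServerFactory.newMBeanServer();
            ObjectName name = new ObjectName("example.connector:type=task-metrics,task=0");

            boolean a1Registered = start(server, name, "A1"); // step 1
            boolean a2Registered = start(server, name, "A2"); // step 2: name is taken
            stop(server, name);                               // step 3: A1 stops late
            boolean stillRegistered = server.isRegistered(name);

            return new boolean[] { a1Registered, a2Registered, stillRegistered };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = simulate();
        System.out.println("A1 registered: " + r[0]);
        System.out.println("A2 registered: " + r[1]);
        System.out.println("bean registered after A1 stop: " + r[2]);
    }
}
```

In this sketch the second registration is rejected and the late stop of A1 removes the only registered bean, so the task that is still running (A2) ends up with no JMX beans at all.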


In our use case, the registration/unregistration of JMX resources breaks when a rebalance is
triggered while the task hasn't yet fully started and never gets stopped.

This message was sent by Atlassian JIRA
