kafka-dev mailing list archives

From "Dong Lin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (KAFKA-6618) Prevent two controllers from updating znodes concurrently
Date Wed, 07 Mar 2018 06:42:00 GMT
Dong Lin created KAFKA-6618:
-------------------------------

             Summary: Prevent two controllers from updating znodes concurrently
                 Key: KAFKA-6618
                 URL: https://issues.apache.org/jira/browse/KAFKA-6618
             Project: Kafka
          Issue Type: Improvement
            Reporter: Dong Lin
            Assignee: Dong Lin


The Kafka controller may fail to function properly (even after repeated controller movement)
due to the following sequence of events:

- User requests topic deletion
- Controller A deletes the partition znode
- Controller B becomes controller and reads the topic znode
- Controller A deletes the topic znode and removes the topic from the topic deletion znode
- Controller B reads the partition znode and topic deletion znode
- According to controller B's context, the topic znode exists, the topic is not listed for
deletion, and some partition is not found for the given topic. Controller B then creates the
topic znode with empty data (i.e. an empty partition assignment) and creates the partition znodes.
- Every controller after controller B fails because there is no data in the topic znode.
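
The interleaving above can be replayed with a toy model. This is only a sketch: the dict stands in
for ZooKeeper, the znode paths and the topic name "t" are illustrative assumptions, and view_b is a
hypothetical name for what controller B believes after its reads. It models the reads only, not
the znodes that controller B writes afterwards.

```python
# Toy in-memory stand-in for ZooKeeper (path -> data); not a real client.
# Paths and the topic name "t" are assumptions made for illustration.
znodes = {
    "/brokers/topics/t": "assignment",
    "/brokers/topics/t/partitions/0": "state",
    "/admin/delete_topics/t": "",  # the user has requested topic deletion
}

view_b = {}  # hypothetical: what controller B believes after its reads

# Controller A deletes the partition znode.
del znodes["/brokers/topics/t/partitions/0"]

# Controller B becomes controller and reads the topic znode.
view_b["topic_exists"] = "/brokers/topics/t" in znodes

# Controller A deletes the topic znode and removes the topic from the deletion znode.
del znodes["/brokers/topics/t"]
del znodes["/admin/delete_topics/t"]

# Controller B reads the partition znode and the topic deletion znode.
view_b["partition_found"] = "/brokers/topics/t/partitions/0" in znodes
view_b["listed_for_deletion"] = "/admin/delete_topics/t" in znodes

print(view_b)
```

Controller B's view mixes state read before and after controller A's deletes: the topic appears to
exist, is not marked for deletion, and has a missing partition.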

The long-term solution is to prevent an old controller from writing to ZooKeeper once it is
no longer the active controller. One idea is to use the ZooKeeper multi API (see
https://zookeeper.apache.org/doc/r3.4.3/api/org/apache/zookeeper/ZooKeeper.html#multi(java.lang.Iterable))
so that a controller writes to ZooKeeper only if the zk version of the controller znode has not
changed.
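
A minimal sketch of that idea, assuming a toy in-memory store rather than the real ZooKeeper
client. ToyZk, VersionMismatch, and the /controller path are hypothetical names for illustration.
With the real Java client, the version check would presumably be an Op.check(path, version) entry
placed in the same multi() call as the writes.

```python
class VersionMismatch(Exception):
    """Raised when the guarded znode's version differs from the expected one."""


class ToyZk:
    """Toy versioned key-value store; models only what this sketch needs."""

    def __init__(self):
        self.data = {}  # path -> (value, version)

    def set(self, path, value):
        _, version = self.data.get(path, (None, -1))
        self.data[path] = (value, version + 1)

    def version(self, path):
        return self.data[path][1]

    def multi(self, check_path, expected_version, writes):
        # All or nothing: if the check fails, none of the writes is applied.
        if self.version(check_path) != expected_version:
            raise VersionMismatch(check_path)
        for path, value in writes:
            if value is None:
                self.data.pop(path, None)  # None means "delete this znode"
            else:
                self.set(path, value)


zk = ToyZk()
zk.set("/brokers/topics/t", "assignment")

zk.set("/controller", "A")            # controller A is elected
version_a = zk.version("/controller")
zk.set("/controller", "B")            # controller B takes over
version_b = zk.version("/controller")

try:
    # Stale controller A tries to delete the topic znode.
    zk.multi("/controller", version_a, [("/brokers/topics/t", None)])
    stale_write_applied = True
except VersionMismatch:
    stale_write_applied = False

# Active controller B's guarded write goes through.
zk.multi("/controller", version_b, [("/brokers/topics/t", "new assignment")])
print(stale_write_applied, zk.data["/brokers/topics/t"])
```

Because the check and the writes are applied as a unit, controller A cannot modify any znode after
controller B has changed the controller znode.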

The short-term solution is to have the controller read the topic deletion znode first. If the
topic is still listed in the topic deletion znode, the new controller will properly handle the
partition states of this topic without creating partition znodes for it. If the
topic is not listed in the topic deletion znode, then both the topic znode and the partition
znodes of this topic will already have been deleted by the time the new controller tries to read
them.
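
The read-order argument can be checked by enumerating interleavings in a toy model. This is a
sketch under assumptions: the paths are illustrative, controller A is assumed to delete in the
order given in the sequence of events (partition znode, topic znode, then the deletion entry),
and state_after and b_view are hypothetical helper names.

```python
TOPIC = "/brokers/topics/t"
PARTITION = "/brokers/topics/t/partitions/0"
DELETION = "/admin/delete_topics/t"

# Assumed order of controller A's deletes, taken from the sequence of events.
A_DELETE_ORDER = [PARTITION, TOPIC, DELETION]


def state_after(k):
    """Znodes that still exist after controller A's first k deletes."""
    return set(A_DELETE_ORDER[k:])


def b_view(k_first, k_later):
    """Controller B reads the deletion znode first (after k_first deletes),
    then the topic and partition znodes (after k_later >= k_first deletes)."""
    listed = DELETION in state_after(k_first)
    later = state_after(k_later)
    return listed, TOPIC in later, PARTITION in later


# Every interleaving of B's two read phases with A's three deletes.
views = [b_view(i, j) for i in range(4) for j in range(i, 4)]

# Problematic view: topic not listed for deletion, yet the topic znode is seen.
bad = [v for v in views if not v[0] and v[1]]
print(bad)
```

In this model no interleaving produces the problematic view: whenever the topic is not listed for
deletion, the topic and partition znodes are already gone.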



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
