kafka-jira mailing list archives

From "Onur Karaman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-1120) Controller could miss a broker state change
Date Fri, 04 Aug 2017 20:16:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114911#comment-16114911 ]

Onur Karaman commented on KAFKA-1120:

[~wushujames] I think Jun's comments and the redesign doc in KAFKA-5027 are sort of saying
the same thing. The broker-generation concept has two use cases, which were sort of implied:
1. the controller using broker generations to distinguish events from a broker across generations.
2. controller-to-broker requests should include the broker generation so that brokers can ignore
requests that applied to their former generation.
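
To make the two use cases concrete, here is a rough Python sketch. It is only an illustration under assumptions: `Broker`, `handle`, and `changed_brokers` are hypothetical names, not Kafka code, and the generation is treated as an opaque value that changes on every re-registration.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Hypothetical broker that fences requests by generation (use case 2)."""
    broker_id: int
    generation: int
    applied: list = field(default_factory=list)

    def handle(self, request_generation: int, payload: str) -> bool:
        # Ignore requests that were addressed to a former generation.
        if request_generation != self.generation:
            return False
        self.applied.append(payload)
        return True


def changed_brokers(cached: dict, live: dict) -> set:
    """Hypothetical controller-side check (use case 1).

    Both arguments map broker id -> generation. A broker counts as changed
    if it appeared, disappeared, or came back under a different generation,
    even when the set of broker ids looks identical.
    """
    ids = cached.keys() | live.keys()
    return {b for b in ids if cached.get(b) != live.get(b)}
```

With this, a broker that bounced while the controller was busy shows up in `changed_brokers({1: 6}, {1: 7})` even though broker id 1 is present on both sides.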

While I think czxids will work for the 1st use case, I don't think we can naively reuse czxid
for the 2nd use case. The reason is a bit silly: zookeeper's CreateResponse only provides
the path. It doesn't provide the created znode's Stat, so you have to do a later lookup to
find out the znode's czxid.

If we want to solve both use cases with the same approach, I think we have a couple of options:
1. maybe we can get away with using czxids by doing a multi-op when registering brokers that
transactionally creates the broker znode and reads that same znode back to learn the czxid of
the znode it just created.
2. we can instead use the session id as the broker generation. The controller can infer the
broker's generation by observing the broker znode's ephemeralOwner property. Brokers can determine
their generation id by looking up the underlying zookeeper client's session id, which is just
ZooKeeper.getSessionId(). The ephemeralOwner of an ephemeral znode is the session id of the
client that created it, which is why this would work.
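
A toy model of option 2, in case it helps. This is an in-memory stand-in written in Python, not the real ZooKeeper client API: `ToyZk` and its methods are made up for illustration, and real session ids are not small sequential integers. The only behavior it borrows from ZooKeeper is that create returns just the path and that an ephemeral znode records its creator's session id as ephemeralOwner.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Stat:
    czxid: int            # id of the transaction that created the znode
    ephemeral_owner: int  # session id of the client that created the znode


class ToyZk:
    """Hypothetical in-memory stand-in for a ZooKeeper ensemble."""

    def __init__(self):
        self._zxids = itertools.count(1)
        self._session_ids = itertools.count(100)
        self._znodes = {}

    def new_session(self) -> int:
        return next(self._session_ids)

    def create_ephemeral(self, session_id: int, path: str) -> str:
        # Like CreateResponse, only the path comes back, not the Stat.
        self._znodes[path] = Stat(next(self._zxids), session_id)
        return path

    def stat(self, path: str) -> Stat:
        return self._znodes[path]

    def expire(self, session_id: int) -> None:
        # Session expiry deletes every ephemeral znode the session owned.
        self._znodes = {p: s for p, s in self._znodes.items()
                        if s.ephemeral_owner != session_id}
```

The broker already knows its session id when it registers, so it needs no second read to learn its generation, while the controller gets the same value from `stat(path).ephemeral_owner`. After `expire` and a re-registration under a new session, the two sides agree on the new value.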

> Controller could miss a broker state change 
> --------------------------------------------
>                 Key: KAFKA-1120
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1120
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.1
>            Reporter: Jun Rao
>              Labels: reliability
>             Fix For: 1.0.0
> When the controller is in the middle of processing a task (e.g., preferred leader election,
broker change), it holds a controller lock. During this time, a broker could have de-registered
and re-registered itself in ZK. After the controller finishes processing the current task,
it will start processing the logic in the broker change listener. However, it will see no
broker change and therefore won't do anything to the restarted broker. This broker will be
in a weird state since the controller doesn't inform it to become the leader of any partition.
Yet, the cached metadata in other brokers could still list that broker as the leader for some
partitions. Client requests routed to that broker will then get a TopicOrPartitionNotExistException.
This broker will continue to be in this bad state until it's restarted again.

