kafka-dev mailing list archives

From "Json Tu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4447) Controller resigned but it also acts as a controller for a long time
Date Fri, 25 Nov 2016 17:26:58 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696353#comment-15696353 ]

Json Tu commented on KAFKA-4447:

After reviewing the Kafka code, including some newer versions, could we add a check to
IsrChangeNotificationListener, PartitionsReassignedListener, PreferredReplicaElectionListener, and ReassignedPartitionsIsrChangeListener
to avoid this behavior?
Using IsrChangeNotificationListener as an example, in its handleChildChange function
we could add the check as below.

inLock(controller.controllerContext.controllerLock) {
    if(null == controllerContext.controllerChannelManager) {
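
A minimal, self-contained sketch of the kind of guard suggested above, assuming that controllerChannelManager is reset to null when the controller resigns. The classes below are simplified stand-ins rather than the real Kafka classes, and what the listener does in each branch (skip the callback, or record the children) is an assumption made for illustration.

```scala
import java.util.concurrent.locks.ReentrantLock

// Hypothetical stand-ins for Kafka's controller classes; not the real Kafka API.
object ResignedControllerGuardSketch {

  // Same shape as Kafka's inLock helper: run `body` while holding `lock`.
  def inLock[T](lock: ReentrantLock)(body: => T): T = {
    lock.lock()
    try body finally lock.unlock()
  }

  // Placeholder for ControllerChannelManager.
  final class ChannelManager

  final class ControllerContext {
    val controllerLock = new ReentrantLock()
    // Assumption: set to null when the broker resigns as controller.
    @volatile var controllerChannelManager: ChannelManager = new ChannelManager
  }

  final class IsrChangeNotificationListener(ctx: ControllerContext) {
    // Children that were actually processed (illustrative bookkeeping only).
    var handled: Vector[String] = Vector.empty

    // Returns true if the notification was processed, false if it was skipped.
    def handleChildChange(parentPath: String, children: Seq[String]): Boolean =
      inLock(ctx.controllerLock) {
        if (null == ctx.controllerChannelManager) {
          // This broker is no longer the controller: ignore the stale callback.
          false
        } else {
          handled ++= children
          true
        }
      }
  }

  def main(args: Array[String]): Unit = {
    val ctx = new ControllerContext
    val listener = new IsrChangeNotificationListener(ctx)

    // While still controller, the callback is processed.
    println(listener.handleChildChange("/isr_change_notification", Seq("child-1"))) // true

    // Simulate resignation, then deliver a late callback.
    inLock(ctx.controllerLock) { ctx.controllerChannelManager = null }
    println(listener.handleChildChange("/isr_change_notification", Seq("child-2"))) // false
  }
}
```

Because the check runs while controllerLock is held, a callback that arrives after resignation has finished is skipped instead of being handled as if the broker were still the controller.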

> Controller resigned but it also acts as a controller for a long time 
> ---------------------------------------------------------------------
>                 Key: KAFKA-4447
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4447
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions:,,,
>         Environment: Linux Os
>            Reporter: Json Tu
>         Attachments: log.tar.gz
> We have a cluster with 10 nodes, and we performed the following operations:
> 1. We reassigned some topic partitions from one node to the other 9 nodes in the cluster, which triggered the controller.
> 2. The controller invoked PartitionsReassignedListener's handleDataChange, read all partition reassignment rules from the zk path, and executed onPartitionReassignment for every partition that matched the conditions.
> 3. But the controller's zk session expired, after which the zk sessions of some of the other 9 nodes also expired.
> 5. Then the controller invoked onControllerResignation to resign as the controller.
> We found that after the controller resigned, it kept acting as the controller for about 3 minutes, as can be seen in the attachment.
