kafka-dev mailing list archives

From "Alexandre Vermeerbergen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4443) Controller should send UpdateMetadataRequest prior to LeaderAndIsrRequest during failover
Date Thu, 01 Dec 2016 15:23:59 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712243#comment-15712243 ]

Alexandre Vermeerbergen commented on KAFKA-4443:


Could this fix be back-ported to Kafka please, or better, as a patch for ?
We had repeated occurrences in the past weeks with Kafka

Best regards,

> Controller should send UpdateMetadataRequest prior to LeaderAndIsrRequest during failover
> -----------------------------------------------------------------------------------------
>                 Key: KAFKA-4443
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4443
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions:
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>              Labels: reliability
>             Fix For:
> Currently, in onControllerFailover(), the controller starts up replicaStateMachine and
partitionStateMachine before invoking sendUpdateMetadataRequest(controllerContext.liveOrShuttingDownBrokerIds.toSeq).
However, if a broker starts right after controller election, the LeaderAndIsrRequests sent
to follower partitions on this broker will all be ignored because the broker doesn't know that the
leaders are alive.
> To fix this problem, in onControllerFailover(), the controller should send an UpdateMetadataRequest
to brokers after initializeControllerContext() but before it starts replicaStateMachine and
partitionStateMachine. This first UpdateMetadataRequest will include the list of live brokers. Although
it will not include partition leader information, that is OK because we will always send an UpdateMetadataRequest
again when we send LeaderAndIsrRequest during replicaStateMachine.startup() and partitionStateMachine.startup().
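
The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Kafka's controller code: the `Broker` class, `controller_failover`, `run`, and the partition name `t-0` are all hypothetical, and the rule "ignore a become-follower request when the leader is not known to be alive" is a simplification of the behaviour the description attributes to the broker.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Toy broker (hypothetical): remembers which brokers it believes are alive."""
    broker_id: int
    known_live: set = field(default_factory=set)
    following: dict = field(default_factory=dict)  # partition -> leader id
    ignored: list = field(default_factory=list)    # partitions whose request was dropped

    def on_update_metadata(self, live_broker_ids):
        # Stand-in for handling an UpdateMetadataRequest that carries live brokers.
        self.known_live = set(live_broker_ids)

    def on_leader_and_isr(self, partition, leader_id):
        # Assumed rule: become a follower only if the leader is believed alive.
        if leader_id in self.known_live:
            self.following[partition] = leader_id
        else:
            self.ignored.append(partition)


def controller_failover(brokers, leaders, metadata_first):
    """Run the two failover steps in either order (simplified model)."""
    live_ids = [b.broker_id for b in brokers]

    def send_update_metadata():
        for b in brokers:
            b.on_update_metadata(live_ids)

    def start_state_machines():
        # Stand-in for the state machine startup sending LeaderAndIsrRequest.
        for b in brokers:
            for partition, leader_id in leaders.items():
                if leader_id != b.broker_id:
                    b.on_leader_and_isr(partition, leader_id)

    if metadata_first:
        send_update_metadata()
        start_state_machines()
    else:
        start_state_machines()
        send_update_metadata()


def run(metadata_first):
    # Broker 2 has just started, so its view of live brokers is empty.
    leader, new_follower = Broker(1), Broker(2)
    controller_failover([leader, new_follower], {"t-0": 1}, metadata_first)
    return new_follower


if __name__ == "__main__":
    for order in (False, True):
        b = run(metadata_first=order)
        print(f"metadata_first={order}: following={b.following}, ignored={b.ignored}")
```

In this model, the freshly started broker drops the become-follower request when the state machines run first, and accepts it when the live-broker list is delivered first, which is the ordering the proposed fix asks for.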

This message was sent by Atlassian JIRA
