kafka-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4442) Controller should grab lock when it is being initialized to avoid race condition
Date Fri, 25 Nov 2016 04:09:59 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15694773#comment-15694773 ]

ASF GitHub Bot commented on KAFKA-4442:
---------------------------------------

GitHub user lindong28 opened a pull request:

    https://github.com/apache/kafka/pull/2167

    KAFKA-4442; Controller should grab lock when it is being initialized to avoid race condition

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lindong28/kafka KAFKA-4442

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2167.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2167
    
----
commit 16825e60963844ab0729bf290cfc9e6cee79932f
Author: Dong Lin <lindong28@gmail.com>
Date:   2016-11-25T04:07:09Z

    KAFKA-4442; Controller should grab lock when it is being initialized to avoid race condition

----


> Controller should grab lock when it is being initialized to avoid race condition
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-4442
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4442
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>
> Currently the controller registers the broker change listener before sending LeaderAndIsrRequests
> to live replicas. The call path looks like this:
> - onControllerFailover()
>   - partitionStateMachine.startup()
>     - triggerOnlinePartitionStateChange()
>       - handleStateChange(partition, OnlinePartition)
>         - electLeaderForPartition(partition)
>           - determines live replicas for this partition (step a)
>           - adds the partition to controllerContext.partitionLeadershipInfo (step b)
>           - sends LeaderAndIsrRequest to those live replicas for this partition
> However, if a broker registers itself in ZooKeeper between step (a) and step (b),
> onBrokerStartup() will not send a LeaderAndIsrRequest to this broker for this partition,
> because the partition is not yet found in controllerContext.partitionLeadershipInfo. onControllerFailover()
> will not send a LeaderAndIsrRequest to this broker for this partition either, because the broker
> was not considered live in step (a).
> The root cause is that onBrokerStartup() should only be executed after the controller has
> finished onControllerFailover() and initialized its state. Therefore the controller should grab
> the lock controllerContext.controllerLock during onControllerFailover().
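
The interleaving described above can be sketched in a small, self-contained Java simulation. This is only an illustration under stated assumptions, not the actual controller code or the contents of the pull request: the class ControllerSketch, the simulate() helper, the betweenStepAAndB hook, and the broker ids and partition name are all hypothetical. Only the names onControllerFailover, onBrokerStartup, controllerLock and partitionLeadershipInfo are taken from the description. The sketch assumes, as the description implies, that the broker change listener callback already runs under the lock.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the controller; NOT the real KafkaController.
class ControllerSketch {
    private final ReentrantLock controllerLock = new ReentrantLock();
    private final Set<Integer> liveBrokers = new TreeSet<>();
    private final Map<String, Integer> partitionLeadershipInfo = new HashMap<>();
    // Records which partitions were "sent" to which broker (stands in for LeaderAndIsrRequest).
    private final Map<Integer, List<String>> sentLeaderAndIsr = new HashMap<>();
    private final boolean lockDuringFailover;

    ControllerSketch(boolean lockDuringFailover, int initialBroker) {
        this.lockDuringFailover = lockDuringFailover;
        liveBrokers.add(initialBroker);
    }

    // Listener callback (assumed to take the lock): notifies the new broker
    // about every partition already present in partitionLeadershipInfo.
    void onBrokerStartup(int brokerId) {
        controllerLock.lock();
        try {
            liveBrokers.add(brokerId);
            for (String partition : partitionLeadershipInfo.keySet()) {
                send(brokerId, partition);
            }
        } finally {
            controllerLock.unlock();
        }
    }

    // The hook runs between step (a) and step (b) so the race can be forced.
    void onControllerFailover(String partition, Runnable betweenStepAAndB) {
        if (lockDuringFailover) controllerLock.lock();
        try {
            List<Integer> live = new ArrayList<>(liveBrokers);   // step (a)
            betweenStepAAndB.run();
            partitionLeadershipInfo.put(partition, live.get(0)); // step (b)
            for (int brokerId : live) send(brokerId, partition);
        } finally {
            if (lockDuringFailover) controllerLock.unlock();
        }
    }

    private void send(int brokerId, String partition) {
        sentLeaderAndIsr.computeIfAbsent(brokerId, k -> new ArrayList<>()).add(partition);
    }

    // Returns the ids of brokers that received a request for the partition.
    static Set<Integer> simulate(boolean lockDuringFailover) {
        ControllerSketch c = new ControllerSketch(lockDuringFailover, 1);
        Thread listener = new Thread(() -> c.onBrokerStartup(2));
        try {
            c.onControllerFailover("topic-0", () -> {
                listener.start();
                if (!lockDuringFailover) {
                    // Without the lock, let the listener finish inside the window.
                    try {
                        listener.join();
                    } catch (InterruptedException e) {
                        throw new IllegalStateException(e);
                    }
                }
                // With the lock, the listener blocks until failover completes.
            });
            listener.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new TreeSet<>(c.sentLeaderAndIsr.keySet());
    }

    public static void main(String[] args) {
        System.out.println("without lock: " + simulate(false)); // [1]
        System.out.println("with lock:    " + simulate(true));  // [1, 2]
    }
}
```

Without the lock, broker 2 registers inside the window and never receives a request for topic-0. With the lock held for the whole of onControllerFailover(), the listener callback is deferred until the partition is in partitionLeadershipInfo, so broker 2 is notified.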



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
