kafka-jira mailing list archives

From "Manikumar (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (KAFKA-2329) Consumers balance fails when multiple consumers are started simultaneously.
Date Wed, 10 Jan 2018 14:34:01 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikumar resolved KAFKA-2329.
    Resolution: Auto Closed

Closing inactive issue. The old consumer is no longer supported; please upgrade to the Java
consumer whenever possible.

> Consumers balance fails when multiple consumers are started simultaneously.
> ---------------------------------------------------------------------------
>                 Key: KAFKA-2329
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2329
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions:,
>            Reporter: Ze'ev Eli Klapow
>            Assignee: Ze'ev Eli Klapow
>              Labels: consumer, patch
>         Attachments: zookeeper-consumer-connector-epoch-node.patch
> During consumer startup, a race condition can occur if multiple consumers are started
> (nearly) simultaneously.
> If a second consumer starts while the first consumer is in the middle of {{zkClient.subscribeChildChanges}},
> the first consumer never sees the second consumer's registration: the second consumer's
> registration node is left unwatched, and no new child is registered later to fire the
> watch. The first consumer therefore owns all partitions and never releases ownership,
> which causes the second consumer to fail rebalancing.
> The attached patch solves this by using an "epoch" node that all consumers watch and
> update to trigger a rebalance. When a rebalance is triggered, we check the consumer registrations
> against a cached state to avoid unnecessary rebalances. For safety, we also periodically
> check the consumer registrations and rebalance. We have been using this patch in production
> at HubSpot for a while, and it has eliminated all rebalance issues.
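
The epoch-node idea described in the report can be sketched roughly as follows. This is
an in-memory illustration only, not the attached patch and not Kafka or ZooKeeper code:
the names EpochCoordinator, Consumer, bump_epoch, on_epoch_change and periodic_check are
hypothetical, and the round-robin assignment is an assumption made to keep the example
short. It shows each consumer watching a shared epoch counter, bumping it on
registration, and rebalancing only when the registrations differ from its cached view.

```python
class EpochCoordinator:
    """Hypothetical in-memory stand-in for the ZooKeeper nodes."""

    def __init__(self):
        self.registrations = set()  # stands in for the consumer registration nodes
        self.epoch = 0              # stands in for the shared "epoch" node
        self.watchers = []          # callbacks fired when the epoch changes

    def watch_epoch(self, callback):
        self.watchers.append(callback)

    def register(self, consumer_id):
        # Registering always bumps the epoch, so every watcher is notified
        # even if it missed the registration node itself.
        self.registrations.add(consumer_id)
        self.bump_epoch()

    def bump_epoch(self):
        self.epoch += 1
        for callback in list(self.watchers):
            callback()


class Consumer:
    def __init__(self, consumer_id, coordinator, partitions):
        self.id = consumer_id
        self.coordinator = coordinator
        self.partitions = partitions
        self.cached = None          # last registration set we rebalanced against
        self.owned = []
        self.rebalance_count = 0

    def start(self):
        # Watch first, then register: the epoch bump covers late watchers.
        self.coordinator.watch_epoch(self.on_epoch_change)
        self.coordinator.register(self.id)

    def on_epoch_change(self):
        current = frozenset(self.coordinator.registrations)
        if self.id not in current or current == self.cached:
            return                  # not registered yet, or nothing changed
        self.cached = current
        self.rebalance(sorted(current))

    def periodic_check(self):
        # Safety net: the same comparison, driven by a timer instead of a watch.
        self.on_epoch_change()

    def rebalance(self, members):
        # Assumed round-robin assignment, for illustration only.
        index = members.index(self.id)
        self.owned = self.partitions[index::len(members)]
        self.rebalance_count += 1


coordinator = EpochCoordinator()
c1 = Consumer("c1", coordinator, [0, 1, 2, 3])
c2 = Consumer("c2", coordinator, [0, 1, 2, 3])
c1.start()
c2.start()
coordinator.bump_epoch()  # spurious trigger: the cached state suppresses a rebalance
c1.periodic_check()
print(c1.owned, c2.owned, c1.rebalance_count, c2.rebalance_count)
```

In this sketch the first consumer releases half of the partitions once the second one
registers, and the extra epoch bump and periodic check cause no further rebalance
because the registrations match the cached state.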

This message was sent by Atlassian JIRA