kafka-dev mailing list archives

From "Peter Davis (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (KAFKA-3893) Kafka Broker ID disappears from /brokers/ids
Date Sat, 02 Jul 2016 19:37:10 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360306#comment-15360306 ]

Peter Davis edited comment on KAFKA-3893 at 7/2/16 7:36 PM:
------------------------------------------------------------

Sriharsha, I have witnessed this too and it very much seems like a bug in Kafka -- when a
zookeeper connection is lost, it seems any other changes in the cluster during the loss (which
would be expected if an outage affects multiple brokers) are not recognized when it reconnects.
We see the same loop of "Shrinking ISR" and "Cached zkVersion [###] not equal to that in
zookeeper", and the broker never recovers until manually restarted. 

For us this happened almost daily when running on a cluster of virtual machines that would get
paused for a few seconds every night for a snapshot backup.  We disabled the backup but it's
very concerning that Kafka won't recover after a pause!

Seen with 0.9 and 0.10. 
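
For anyone trying to spot the same symptom, here is a minimal sketch in Python that counts the two messages quoted above in a broker log. It assumes only that the quoted substrings appear somewhere in each log line and that "[###]" stands for a number; the function names, the threshold, and the sample lines in the usage note are hypothetical and do not come from Kafka.

```python
import re

# Substrings quoted in the comment above. The rest of the log line layout
# is not assumed; we only search for these fragments.
SHRINKING_ISR = "Shrinking ISR"
# Assumption: the "[###]" placeholder in the comment is a decimal number.
STALE_ZK_VERSION = re.compile(
    r"Cached zkVersion \[\d+\] not equal to that in zookeeper"
)


def count_symptoms(lines):
    """Return (shrink_count, stale_count) for an iterable of log lines."""
    shrink = 0
    stale = 0
    for line in lines:
        if SHRINKING_ISR in line:
            shrink += 1
        if STALE_ZK_VERSION.search(line):
            stale += 1
    return shrink, stale


def looks_stuck(lines, threshold=3):
    """Heuristic: both messages repeat at least `threshold` times.

    The threshold is an arbitrary illustrative value, not a Kafka setting.
    """
    shrink, stale = count_symptoms(lines)
    return shrink >= threshold and stale >= threshold
```

A possible use is `looks_stuck(open("server.log"))`, where `server.log` is a placeholder path. A positive result only suggests the loop described here and is not a confirmed diagnosis.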


was (Author: davispw):
Sriharsha, I have witnessed this too and it very much seems like a bug in Kafka -- when a
zookeeper connection is lost, any other changes in the cluster during the loss are not recognized
when it reconnects.  We see the same loop of "Shrinking ISR" and "Cached zkVersion [###] not
equal to that in zookeeper", and the broker never recovers. 

For us this happened almost daily when running on a cluster of virtual machines that would get
paused for a few seconds every night for a snapshot backup.  We disabled the backup but it's
very concerning that Kafka won't recover after a pause!

> Kafka Broker ID disappears from /brokers/ids
> --------------------------------------------
>
>                 Key: KAFKA-3893
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3893
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: chaitra
>            Priority: Critical
>
> Kafka version used : 0.8.2.1 
> Zookeeper version: 3.4.6
> We have a scenario where the Kafka broker's ID in the ZooKeeper path /brokers/ids just disappears.
> We see the ZooKeeper connection active and no network issues.
> The ZooKeeper connection timeout is set to 6000 ms in server.properties.
> As a result, the broker is not participating in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
