kafka-dev mailing list archives

From "Edoardo Comar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5200) Deleting topic when one broker is down will prevent topic to be re-creatable
Date Mon, 15 May 2017 14:23:04 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010601#comment-16010601 ]

Edoardo Comar commented on KAFKA-5200:

Thanks [~huxi_2b]. Unfortunately, such steps would imply significant downtime, which is not
acceptable to us.

We actually tested a much less intrusive way to handle this occurrence:
delete the ZooKeeper info about the topic while the cluster is still running (minus the
dead broker, of course),
and then force *only the controller broker* to restart.
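
As a rough illustration of what "the ZooKeeper info about the topic" could cover, the sketch below lists the per-topic znodes in the ZooKeeper layout used by Kafka of this era. Treating exactly these three paths as the ones to remove is an assumption, not something stated above, and the helper name `topic_znodes` is hypothetical.

```python
def topic_znodes(topic):
    """List the znodes that hold state for one topic.

    Assumption: these are the paths to remove; the comment above does not
    name them. Paths follow the ZooKeeper layout of Kafka 0.10.x.
    """
    return [
        f"/admin/delete_topics/{topic}",  # the pending-delete marker
        f"/config/topics/{topic}",        # per-topic configuration overrides
        f"/brokers/topics/{topic}",       # replica assignment; has child znodes
    ]


for path in topic_znodes("my-topic"):
    print(path)
```

The last path has children (partition state), so removing it would need a recursive delete.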

Even though this is less intrusive, it still means that two brokers are down for a short-ish time.
With replication factor 3 and min.insync.replicas=2, this implies an outage for some clients,
which remains unacceptable.
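
To make the arithmetic concrete, here is a minimal sketch that counts surviving replicas per partition. The round-robin assignment, the broker ids, and the choice of broker 0 as the controller are all made up for illustration; a partition whose live replica count falls below min.insync.replicas rejects producers using acks=all.

```python
def under_min_isr(assignment, down, min_insync_replicas=2):
    """Return the partitions left with fewer live replicas than min.insync.replicas."""
    down = set(down)
    return sorted(
        partition
        for partition, replicas in assignment.items()
        if len(set(replicas) - down) < min_insync_replicas
    )


# Hypothetical layout: 5 brokers (ids 0..4), 5 partitions, replication factor 3.
assignment = {p: [(p + i) % 5 for i in range(3)] for p in range(5)}

# Broker 4 is already dead; the controller (assumed to be broker 0) is restarting.
print(under_min_isr(assignment, down={4, 0}))  # [3, 4]

# With only the dead broker down, every partition still has 2 live replicas.
print(under_min_isr(assignment, down={4}))     # []
```

In this layout, partitions 3 and 4 have replicas on both brokers 4 and 0, so they drop to a single live replica while the controller restarts.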

> Deleting topic when one broker is down will prevent topic to be re-creatable
> ----------------------------------------------------------------------------
>                 Key: KAFKA-5200
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5200
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>            Reporter: Edoardo Comar
> In a cluster with 5 brokers, replication factor=3 and min in sync=2,
> one broker went down.
> A user's app, of course unaware of that, deleted a topic that (unknowingly)
> had a replica on the dead broker.
> The topic went into 'pending delete' mode.
> The user then tried to recreate the topic, which failed, so his app was left stuck:
> no working topic and no ability to create one.
> The reassignment tool fails to move the replica off the dead broker, specifically
> because the broker holding the partition replica to move is dead :-)
> Incidentally, the confluent-rebalancer docs say
> http://docs.confluent.io/current/kafka/post-deployment.html#scaling-the-cluster
> > Supports moving partitions away from dead brokers
> It'd be nice to similarly improve the open source reassignment tool.
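
For reference, a reassignment that moves replicas off a dead broker would be described by a plan like the one built below, in the JSON format accepted by `kafka-reassign-partitions.sh --reassignment-json-file`. This is only a sketch: the topic name, broker ids, and the helper `plan_without_broker` are hypothetical, and picking the first free live broker ignores load balancing. Whether the tool can complete such a plan is exactly what this issue is about.

```python
import json


def plan_without_broker(assignment, dead, live_brokers):
    """Build a reassignment plan that replaces `dead` in every replica list."""
    partitions = []
    for (topic, partition), replicas in sorted(assignment.items()):
        if dead not in replicas:
            continue  # partition is unaffected by the dead broker
        # Naive choice: first live broker not already hosting this partition.
        candidates = [b for b in live_brokers if b != dead and b not in replicas]
        if not candidates:
            raise ValueError(f"no spare broker for {topic}-{partition}")
        new_replicas = [candidates[0] if b == dead else b for b in replicas]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": new_replicas}
        )
    return {"version": 1, "partitions": partitions}


# Hypothetical current assignment: (topic, partition) -> replica broker ids.
assignment = {("orders", 0): [1, 2, 5], ("orders", 1): [2, 3, 4]}
plan = plan_without_broker(assignment, dead=5, live_brokers=[1, 2, 3, 4])
print(json.dumps(plan))
```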

This message was sent by Atlassian JIRA