kafka-jira mailing list archives

From "Matthias Rampke (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5200) If a replicated topic is deleted with one broker down, it can't be recreated
Date Fri, 08 Dec 2017 15:22:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16283713#comment-16283713 ]

Matthias Rampke commented on KAFKA-5200:

To expand on the workaround [~huxi_2b] proposed:

If you cannot resurrect the dead broker itself, you can make Kafka act as if you did:

# Start a new broker, but then shut it down quickly (before any newly created partitions are assigned to it).
# In meta.properties, change the broker ID to that of the dead broker.
# Start it again.
# Watch its logs: it will pick up the pending deletions and complete them, or you can reassign partitions at this point.
# Stop it again.
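
As a rough illustration of step 2, here is a minimal Python sketch that rewrites the broker.id line in the contents of a meta.properties file. The function name, the sample file contents and the IDs 1007 and 3 are hypothetical and only for illustration. It assumes the file is a simple key=value properties file with exactly one broker.id entry, and that it is edited only while the broker is stopped. If several log directories are configured, each has its own meta.properties, and if broker.id is also set explicitly in server.properties it presumably has to match.

```python
def set_broker_id(meta_text: str, dead_broker_id: int) -> str:
    """Return meta.properties content with broker.id set to dead_broker_id.

    Hypothetical helper: assumes exactly one 'broker.id=<n>' line and
    leaves every other line (comments, version, ...) untouched.
    """
    out = []
    replaced = 0
    for line in meta_text.splitlines():
        key, sep, _ = line.partition("=")
        if sep and key.strip() == "broker.id":
            # Swap in the ID of the dead broker.
            out.append(f"broker.id={dead_broker_id}")
            replaced += 1
        else:
            out.append(line)
    if replaced != 1:
        raise ValueError(f"expected exactly one broker.id line, found {replaced}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical contents written by the freshly started broker.
    sample = "#\n#Fri Dec 08 15:00:00 UTC 2017\nversion=0\nbroker.id=1007\n"
    print(set_broker_id(sample, 3), end="")
```

In practice you would read the file from the stopped broker's log directory, pass its text through the function, and write the result back before starting the broker again.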

This may be problematic if you have a lot of partition creation going on, because you need
to avoid getting any partitions assigned to this broker while it's running, but otherwise
this works without downtime.

> If a replicated topic is deleted with one broker down, it can't be recreated
> ----------------------------------------------------------------------------
>                 Key: KAFKA-5200
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5200
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>            Reporter: Edoardo Comar
> In a cluster with 5 brokers, replication factor=3, min in sync=2,
> one broker went down.
> A user's app remained of course unaware of that and deleted a topic that (unknowingly) had a replica on the dead broker.
> The topic went into 'pending delete' mode.
> The user then tried to recreate the topic, which failed, so his app was left stuck: no working topic and no ability to create one.
> The reassignment tool fails to move the replica out of the dead broker, specifically because the broker with the partition replica to move is dead :-)
> Incidentally the confluent-rebalancer docs say
> http://docs.confluent.io/current/kafka/post-deployment.html#scaling-the-cluster
> > Supports moving partitions away from dead brokers
> It'd be nice to similarly improve the open-source reassignment tool.

This message was sent by Atlassian JIRA
