kafka-jira mailing list archives

From "Ismael Juma (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (KAFKA-4871) Kafka doesn't respect TTL on Zookeeper hostname - crash if zookeeper IP changes
Date Tue, 21 Nov 2017 15:04:00 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-4871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma resolved KAFKA-4871.
--------------------------------
    Resolution: Duplicate

Duplicate of KAFKA-5473.

> Kafka doesn't respect TTL on Zookeeper hostname - crash if zookeeper IP changes
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-4871
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4871
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.2.0
>            Reporter: Stephane Maarek
>
> I had a Zookeeper cluster whose machines obtain their hostnames automatically, so that the
> hostnames remain constant over time. I deleted my 3 Zookeeper machines and new machines came
> back online with the same hostnames and updated their CNAME records.
> Kafka then failed and couldn't reconnect to Zookeeper, as it didn't try to resolve the
> IP of Zookeeper again. See the log below:
> [2017-03-09 05:49:57,302] INFO Client will use GSSAPI as SASL mechanism. (org.apache.zookeeper.client.ZooKeeperSaslClient)
> [2017-03-09 05:49:57,302] INFO Opening socket connection to server zookeeper-3.example.com/10.12.79.43:2181. Will attempt to SASL-authenticate using Login Context section 'Client' (org.apache.zookeeper.ClientCnxn)
> [ec2-user]$ dig +short zookeeper-3.example.com
> 10.12.79.36
> As you can see, even though the machine is capable of resolving the hostname to the new IP,
> Kafka somehow didn't respect the TTL (which was set to 60 seconds) and didn't get the new IP.
> I feel that on a failed Zookeeper connection, Kafka should at least try to resolve the
> Zookeeper IP again. That would allow Kafka to keep up with Zookeeper changes over time.
> What do you think? Is that expected behaviour or a bug?
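
The behaviour the reporter asks for (look the hostname up again on every reconnect attempt instead of reusing the first resolved IP) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of Kafka, the ZooKeeper client, or the fix tracked in KAFKA-5473: the class ReResolvingConnector, its Resolver hook and nextAddress() are hypothetical names. The only real APIs used are java.net.InetAddress, java.net.InetSocketAddress and the JVM security property networkaddress.cache.ttl, which bounds how long successful lookups are cached inside the JVM.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.security.Security;

public class ReResolvingConnector {
    /** Hypothetical hook so the DNS lookup can be swapped out, e.g. in tests. */
    interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    private final String host;
    private final int port;
    private final Resolver resolver;

    ReResolvingConnector(String host, int port, Resolver resolver) {
        this.host = host;
        this.port = port;
        this.resolver = resolver;
    }

    /**
     * Returns the address to use for the next connection attempt.
     * The hostname is resolved again on every call, so an IP change
     * is picked up as soon as the resolver returns the new record.
     */
    InetSocketAddress nextAddress() throws UnknownHostException {
        InetAddress[] addresses = resolver.resolve(host);
        return new InetSocketAddress(addresses[0], port);
    }

    public static void main(String[] args) throws Exception {
        // Cap the JVM's cache of successful lookups at 60 seconds.
        // This has to be set before the first lookup to take effect.
        Security.setProperty("networkaddress.cache.ttl", "60");

        // "localhost" stands in for the Zookeeper hostname from the report.
        ReResolvingConnector connector =
                new ReResolvingConnector("localhost", 2181, InetAddress::getAllByName);
        System.out.println(connector.nextAddress());
    }
}
```

A caller would invoke nextAddress() before each reconnect attempt rather than keeping the InetSocketAddress from the first successful connection. Without the cache setting, a JVM may keep serving the old IP from its own cache even though dig already returns the new one.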



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
