kafka-dev mailing list archives

From "Jun Yao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4385) producer is sending too many unnecessary meta data request if the meta data for a topic is not available and "auto.create.topics.enable" =false
Date Sun, 06 Nov 2016 16:16:58 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15642021#comment-15642021 ]

Jun Yao commented on KAFKA-4385:


As the ticket title says, the unnecessary overhead arises when the broker side is configured
with "auto.create.topics.enable=false".
In that case, from what I have seen, when the producer gets unexpected msgs for a topic that
does not exist on the brokers, it sends thousands of metadata requests to the brokers
(with the default metadata.fetch.timeout.ms=60000, and a metadata request usually taking around
20-30ms). For a given number of unexpected msgs, the impact is amplified roughly 2000 to
3000 times, which slows down the cluster.
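
The 2000-3000 figure follows from dividing the timeout by the time per request. A minimal sketch of that arithmetic, assuming the requests are issued strictly one after another for the whole timeout (the class and method names are only illustrative, not part of Kafka):

```java
public class MetadataRequestEstimate {
    // Upper bound on back-to-back metadata requests that one blocked send can
    // issue before the fetch timeout expires, assuming each request takes
    // requestMs and the next one starts immediately after the previous response.
    static long maxRequests(long timeoutMs, long requestMs) {
        return timeoutMs / requestMs;
    }

    public static void main(String[] args) {
        long timeoutMs = 60_000; // default metadata.fetch.timeout.ms mentioned above
        System.out.println(maxRequests(timeoutMs, 30)); // prints 2000
        System.out.println(maxRequests(timeoutMs, 20)); // prints 3000
    }
}
```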

In my PR, I added one more config, 'metadata.fetch.max.count', which defaults to Integer.MAX_VALUE.
With the default the loop is still there, so none of the existing behavior changes, and the common
"auto.create.topics.enable=true" case works the same as before.

When the broker side has chosen "auto.create.topics.enable=false", we know that no topic will
be auto-created, so from my perspective it is not necessary to keep looping on metadata requests,
and we can configure 'metadata.fetch.max.count=1'.
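
To illustrate the effect of the proposed limit, here is a simplified model of the wait loop for a topic that never shows up in the metadata response. It is only a sketch with simulated time, not the actual KafkaProducer code; the class, method and parameter names are made up for illustration, and 'metadata.fetch.max.count' is the config proposed in the PR, not an existing Kafka setting.

```java
public class BoundedMetadataWait {
    // Returns how many metadata requests are issued before the producer gives up
    // on a topic that is never present in the response.
    // maxFetchCount stands in for the proposed 'metadata.fetch.max.count'.
    static int countMetadataRequests(long maxWaitMs, long requestMs, int maxFetchCount) {
        long elapsedMs = 0;
        int fetches = 0;
        while (elapsedMs < maxWaitMs && fetches < maxFetchCount) {
            fetches++;              // one metadata request/response round
            elapsedMs += requestMs; // simulated time, nothing actually sleeps
            // the topic is still missing from the response, so loop again
        }
        return fetches;
    }

    public static void main(String[] args) {
        // default limit: the loop runs until the timeout, as it does today
        System.out.println(countMetadataRequests(60_000, 30, Integer.MAX_VALUE)); // prints 2000
        // limit of 1: a single metadata request, then give up
        System.out.println(countMetadataRequests(60_000, 30, 1)); // prints 1
    }
}
```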

In summary, by default everything stays the same, and for some cases we give users an
option to reduce the excess overhead.

> producer is sending too many unnecessary meta data request if the meta data for a topic
is not available and "auto.create.topics.enable" =false
> -----------------------------------------------------------------------------------------------------------------------------------------------
>                 Key: KAFKA-4385
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4385
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jun Yao
> All current kafka-client producer implementation (<=,
> When sending a msg to a topic, it will first check if meta data for this topic is available
or not, 
> when not available, it will set "metadata.requestUpdate()" and wait for meta data from
> The thing is, inside "org.apache.kafka.clients.Metadata.awaitUpdate()", it's already doing
a "while (this.version <= lastVersion)" loop waiting for a new version response,
> so the loop inside "org.apache.kafka.clients.producer.KafkaProducer.waitOnMetadata()"
is not needed.
> When "auto.create.topics.enable" is false, sending msgs to a non-existent topic will trigger
too many metadata requests: every time a metadata response is returned, because it does not contain
the metadata for the topic, the producer tries again until a TimeoutException is thrown.
> This is a waste and sometimes causes too much overhead when unexpected msgs arrive.
