Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 65F73200D48 for ; Wed, 29 Nov 2017 15:04:05 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 64F1B160C04; Wed, 29 Nov 2017 14:04:05 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 85079160BF6 for ; Wed, 29 Nov 2017 15:04:04 +0100 (CET) Received: (qmail 44087 invoked by uid 500); 29 Nov 2017 14:04:03 -0000 Mailing-List: contact jira-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: jira@kafka.apache.org Delivered-To: mailing list jira@kafka.apache.org Received: (qmail 44071 invoked by uid 99); 29 Nov 2017 14:04:03 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 29 Nov 2017 14:04:03 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id DFFE61806C0 for ; Wed, 29 Nov 2017 14:04:02 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -99.202 X-Spam-Level: X-Spam-Status: No, score=-99.202 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, RP_MATCHES_RCVD=-0.001, SPF_PASS=-0.001, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id 4h3vLbDUJEut for ; Wed, 29 Nov 2017 14:04:01 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id F31BE5F4AC for ; Wed, 29 Nov 2017 14:04:00 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 5F75BE0295 for ; Wed, 29 Nov 2017 14:04:00 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 11FAB21056 for ; Wed, 29 Nov 2017 14:04:00 +0000 (UTC) Date: Wed, 29 Nov 2017 14:04:00 +0000 (UTC) From: "Manikumar (JIRA)" To: jira@kafka.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (KAFKA-6281) Kafka JavaAPI Producer failed with NotLeaderForPartitionException MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Wed, 29 Nov 2017 14:04:05 -0000 [ https://issues.apache.org/jira/browse/KAFKA-6281?page=3Dcom.atlassian= .jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=3D1627= 0767#comment-16270767 ]=20 Manikumar commented on KAFKA-6281: ---------------------------------- [~anilkumar763@gmail.com] 2mins is high value. increasing further may delay= the controller, leader switchover time on failure scenarios. If the ZK e= xpiration is frequent, then you may want to find out the root-cause. may b= e due to bad network connection, GC etc.. > Kafka JavaAPI Producer failed with NotLeaderForPartitionException > ----------------------------------------------------------------- > > Key: KAFKA-6281 > URL: https://issues.apache.org/jira/browse/KAFKA-6281 > Project: Kafka > Issue Type: Bug > Reporter: Anil > Attachments: server1-controller.log, server2-controller.log > > > We are running Kafka (vesion kafka_2.11-0.10.1.0) in a 2 node cluster. We= have 2 producers (Java API) acting on different topics. Each topic has sin= gle partition. The topic where we had this issue, has one consumer running.= This set up has been running fine for 3 months, and we saw this issue. All= the suggested cases/solutions for this issue in other forums don't seem to= apply for my scenario. > Exception at producer; > {code} > -2017-11-25T17:40:33,035 [kafka-producer-network-thread | producer-1] ERR= OR client.producer.BingLogProducerCallback - Encountered exception in sendi= ng message ; > org.apache.kafka.common.errors.NotLeaderForPartitionExceptio= n: This server is not the leader for that topic-partition. > {code} > We haven't enabled retries for the messages, because this is transactiona= l data and we want to maintain the order. > Producer config: > {code} > bootstrap.servers : server1ip:9092 > acks :all > retries : 0 > linger.ms :0 > buffer.memory :10240000 > max.request.size :1024000 > key.serializer : org.apache.kafka.common.serialization.StringSerializer > value.serializer : org.apache.kafka.common.serialization.StringSerializer > {code} > We are connecting to server1 at both producer and consumer. The controlle= r log at server2 indicates there is some shutdown happened at during sameti= me, but I dont understand why this happened. > {color:red}[2017-11-25 17:31:44,776] DEBUG [Controller 2]: topics not in = preferred replica Map() (kafka.controller.KafkaController) [2017-11-25 17:3= 1:44,776] TRACE [Controller 2]: leader imbalance ratio for broker 2 is 0.00= 0000 (kafka.controller.KafkaController) [2017-11-25 17:31:44,776] DEBUG [Co= ntroller 2]: topics not in preferred replica Map() (kafka.controller.KafkaC= ontroller) [2017-11-25 17:31:44,776] TRACE [Controller 2]: leader imbalance= ratio for broker 1 is 0.000000 (kafka.controller.KafkaController) [2017-11= -25 17:34:18,314] INFO [SessionExpirationListener on 2], ZK expired; shut d= own all controller components and try to re-elect (kafka.controller.KafkaCo= ntroller$SessionExpirationListener) [2017-11-25 17:34:18,317] DEBUG [Contro= ller 2]: Controller resigning, broker id 2 (kafka.controller.KafkaControlle= r) [2017-11-25 17:34:18,317] DEBUG [Controller 2]: De-registering IsrChange= NotificationListener (kafka.controller.KafkaController) [2017-11-25 17:34:1= 8,317] INFO [delete-topics-thread-2], Shutting down (kafka.controller.Topic= DeletionManager$DeleteTopicsThread) [2017-11-25 17:34:18,317] INFO [delete-= topics-thread-2], Stopped (kafka.controller.TopicDeletionManager$DeleteTopi= csThread) [2017-11-25 17:34:18,318] INFO [delete-topics-thread-2], Shutdown= completed (kafka.controller.TopicDeletionManager$DeleteTopicsThread) [2017= -11-25 17:34:18,318] INFO [Partition state machine on Controller 2]: Stoppe= d partition state machine (kafka.controller.PartitionStateMachine) [2017-11= -25 17:34:18,318] INFO [Replica state machine on controller 2]: Stopped rep= lica state machine (kafka.controller.ReplicaStateMachine) [2017-11-25 17:34= :18,318] INFO [Controller-2-to-broker-2-send-thread], Shutting down (kafka.= controller.RequestSendThread) [2017-11-25 17:34:18,318] INFO [Controller-2-= to-broker-2-send-thread], Stopped (kafka.controller.RequestSendThread) [201= 7-11-25 17:34:18,319] INFO [Controller-2-to-broker-2-send-thread], Shutdown= completed (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] I= NFO [Controller-2-to-broker-1-send-thread], Shutting down (kafka.controller= .RequestSendThread) [2017-11-25 17:34:18,319] INFO [Controller-2-to-broker-= 1-send-thread], Stopped (kafka.controller.RequestSendThread) [2017-11-25 17= :34:18,319] INFO [Controller-2-to-broker-1-send-thread], Shutdown completed= (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO [Contr= oller 2]: Broker 2 resigned as the controller (kafka.controller.KafkaContro= ller) [2017-11-25 17:34:18,353] DEBUG [IsrChangeNotificationListener] Fired= !!! (kafka.controller.IsrChangeNotificationListener) [2017-11-25 17:34:18,3= 53] DEBUG [IsrChangeNotificationListener] Fired!!! (kafka.controller.IsrCha= ngeNotificationListener) [2017-11-25 17:34:18,354] INFO [BrokerChangeListen= er on Controller 2]: Broker change listener fired for path /brokers/ids wit= h children 1,2 (kafka.controller.ReplicaStateMachine$BrokerChangeListener) = [2017-11-25 17:34:18,355] DEBUG [DeleteTopicsListener on 2]: Delete topics = listener fired for topics to be deleted (kafka.controller.PartitionStateMac= hine$DeleteTopicsListener) [2017-11-25 17:34:18,362] INFO [AddPartitionsLis= tener on 2]: Partition modification triggered {"version":1,"partitions":{"0= ":[1]}} for path /brokers/topics/ESQ (kafka.controller.PartitionStateMachin= e$PartitionModificationsListener) [2017-11-25 17:34:18,368] INFO [AddPartit= ionsListener on 2]: Partition modification triggered {"version":1,"partitio= ns":{"0":[1]}} for path /brokers/topics/Test1 (kafka.controller.PartitionSt= ateMachine$PartitionModificationsListener) [2017-11-25 17:34:18,369] INFO [= AddPartitionsListener on 2]: Partition modification triggered {"version":1,= "partitions":{"0":[2]}} for path /brokers/topics/ImageQ (kafka.controller.P= artitionStateMachine$PartitionModificationsListener) [2017-11-25 17:34:18,3= 74] INFO [AddPartitionsListener on 2]: Partition modification triggered {"v= ersion":1,"partitions":{"8":[1,2],"4":[1,2],"9":[2,1],"5":[2,1],"6":[1,2],"= 1":[2,1],"0":[1,2],"2":[1,2],"7":[2,1],"3":[2,1]}} for path /brokers/topics= /NMS_NotifyQ (kafka.controller.PartitionStateMachine$PartitionModifications= Listener) [2017-11-25 17:34:18,375] INFO [AddPartitionsListener on 2]: Part= ition modification triggered {"version":1,"partitions":{"0":[1]}} for path = /brokers/topics/TempBinLogReqQ @{color} -- This message was sent by Atlassian JIRA (v6.4.14#64029)