Date: Sat, 26 Aug 2017 20:32:00 +0000 (UTC)
From: "Aman Choudhary (JIRA)"
To: jira@kafka.apache.org
Subject: [jira] [Comment Edited] (KAFKA-3916) Connection from controller to broker disconnects

[ https://issues.apache.org/jira/browse/KAFKA-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142922#comment-16142922 ]

Aman Choudhary edited comment on KAFKA-3916 at 8/26/17 8:31 PM:
----------------------------------------------------------------

Hi,

I am using Kafka version 0.10.0.2 in the production environment. I am facing issues on my Kafka broker and consumer machines that are very similar to the issue described here.

Controller logs are very similar to the ones described above:

{panel:title=My title}
|WARN [2017-08-26 19:19:27,204] [Controller-6-to-broker-5-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-5-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-1,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,4,5],zk_version=0,replicas=[3,4,5]}],live_leaders=[{id=3,host=host-1,port=9092}]} to broker host-2:9092 (id: 5 rack: null). Reconnecting to broker.
WARN [2017-08-26 19:19:27,204] [Controller-6-to-broker-4-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-4-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-1,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,4,5],zk_version=0,replicas=[3,4,5]}],live_leaders=[{id=3,host=host-1,port=9092}]} to broker host-3:9092 (id: 4 rack: null). Reconnecting to broker.
WARN [2017-08-26 19:19:27,204] [Controller-6-to-broker-2-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-2-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-1,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,4,5],zk_version=0,replicas=[3,4,5]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null}]} to broker host-5:9092 (id: 2 rack: null). Reconnecting to broker.
WARN [2017-08-26 19:19:27,205] [Controller-6-to-broker-3-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-3-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-1,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,4,5],zk_version=0,replicas=[3,4,5]}],live_leaders=[{id=3,host=host-1,port=9092}]} to broker host-1:9092 (id: 3 rack: null). Reconnecting to broker.
WARN [2017-08-26 19:19:27,205] [Controller-6-to-broker-1-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-1-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-1,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,4,5],zk_version=0,replicas=[3,4,5]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null}]} to broker host-4:9092 (id: 1 rack: null). Reconnecting to broker.
WARN [2017-08-26 19:19:27,205] [Controller-6-to-broker-6-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-6-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-1,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,4,5],zk_version=0,replicas=[3,4,5]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null}]} to broker host-6:9092 (id: 6 rack: null). Reconnecting to broker.
WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-2-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-2-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2604144,partition=0,controller_epoch=34,leader=1,leader_epoch=0,isr=[1,2,3],zk_version=0,replicas=[1,2,3]}],live_leaders=[{id=1,host=host-4,port=9092}]} to broker host-5:9092 (id: 2 rack: null). Reconnecting to broker.
WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-3-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-3-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2604144,partition=0,controller_epoch=34,leader=1,leader_epoch=0,isr=[1,2,3],zk_version=0,replicas=[1,2,3]}],live_leaders=[{id=1,host=host-4,port=9092}]} to broker host-1:9092 (id: 3 rack: null). Reconnecting to broker.
WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-4-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-4-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2604144,partition=0,controller_epoch=34,leader=1,leader_epoch=0,isr=[1,2,3],zk_version=0,replicas=[1,2,3]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null}]} to broker host-3:9092 (id: 4 rack: null). Reconnecting to broker.
WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-5-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-5-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2604144,partition=0,controller_epoch=34,leader=1,leader_epoch=0,isr=[1,2,3],zk_version=0,replicas=[1,2,3]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null}]} to broker host-2:9092 (id: 5 rack: null). Reconnecting to broker.
WARN [2017-08-26 20:46:41,010] [Controller-6-to-broker-6-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-6-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2604144,partition=0,controller_epoch=34,leader=1,leader_epoch=0,isr=[1,2,3],zk_version=0,replicas=[1,2,3]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null}]} to broker host-6:9092 (id: 6 rack: null). Reconnecting to broker.
WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-1-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-1-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2604144,partition=0,controller_epoch=34,leader=1,leader_epoch=0,isr=[1,2,3],zk_version=0,replicas=[1,2,3]}],live_leaders=[{id=1,host=host-4,port=9092}]} to broker host-4:9092 (id: 1 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,288] [Controller-6-to-broker-2-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-2-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]}],live_leaders=[{id=3,host=host-1,port=9092}]} to broker host-5:9092 (id: 2 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,288] [Controller-6-to-broker-1-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-1-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]}],live_leaders=[{id=3,host=host-1,port=9092}]} to broker host-4:9092 (id: 1 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,288] [Controller-6-to-broker-3-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-3-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]}],live_leaders=[{id=3,host=host-1,port=9092}]} to broker host-1:9092 (id: 3 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,288] [Controller-6-to-broker-5-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-5-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null}]} to broker host-2:9092 (id: 5 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,289] [Controller-6-to-broker-4-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-4-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null}]} to broker host-3:9092 (id: 4 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,289] [Controller-6-to-broker-6-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-6-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null}]} to broker host-6:9092 (id: 6 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,050] [Controller-6-to-broker-6-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-6-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]}],live_leaders=[{id=2,host=host-5,port=9092}]} to broker host-6:9092 (id: 6 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,050] [Controller-6-to-broker-2-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-2-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]}],live_leaders=[{id=2,host=host-5,port=9092}]} to broker host-5:9092 (id: 2 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,059] [Controller-6-to-broker-4-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-4-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null}]} to broker host-3:9092 (id: 4 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,059] [Controller-6-to-broker-3-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-3-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null}]} to broker host-1:9092 (id: 3 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,060] [Controller-6-to-broker-5-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-5-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]}],live_leaders=[{id=2,host=host-5,port=9092}]} to broker host-2:9092 (id: 5 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,059] [Controller-6-to-broker-1-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-1-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null}]} to broker host-4:9092 (id: 1 rack: null). Reconnecting to broker.
{panel}

Consumer error log:

||Heading 1||Heading 2||
|2017-08-26 23:47:54,738 WARN [kafka-producer-network-thread | producer-8] o.a.k.c.p.i.Sender [kafka-producer-network-thread | (consumer group)] Got error produce response with correlation id 10322 on topic-partition topic-0, retrying (9 attempts left). Error: NETWORK_EXCEPTION|Col A2|

I currently have 5000 topics with a retention period of 1 hour. The maximum data size during peak load is 3-4 GB per machine, and I have 6 Kafka broker machines, each with 6 cores and 16 GB RAM.

Can someone please point out if there is something wrong in my approach? Do I need to upgrade to the latest version?

was (Author: void.aman93):
Hi,

I am using Kafka version 0.10.0.2 in the production environment.
I am faci= ng issues in my kafka broker and consumer machines which is very similar to= issue described here. Controller logs are very similar to the one described above: =20 ||Heading 1||Heading 2|| |WARN [2017-08-26 19:19:27,204] [Controller-6-to-broker-5-send-thread][] ka= fka.controller.RequestSendThread - [Controller-6-to-broker-5-send-thread], = Controller 6 epoch 34 fails to send request {controller_id=3D6,controller_e= poch=3D34,partition_states=3D[{topic=3Dtopic-1,partition=3D0,controller_epo= ch=3D34,leader=3D3,leader_epoch=3D0,isr=3D[3,4,5],zk_version=3D0,replicas= =3D[3,4,5]}],live_leaders=3D[{id=3D3,host=3Dhost-1,port=3D9092}]} to broker= host-2:9092 (id: 5 rack: null). Reconnecting to broker. WARN [2017-08-26 19:19:27,204] [Controller-6-to-broker-4-send-thread][] ka= fka.controller.RequestSendThread - [Controller-6-to-broker-4-send-thread], = Controller 6 epoch 34 fails to send request {controller_id=3D6,controller_e= poch=3D34,partition_states=3D[{topic=3Dtopic-1,partition=3D0,controller_epo= ch=3D34,leader=3D3,leader_epoch=3D0,isr=3D[3,4,5],zk_version=3D0,replicas= =3D[3,4,5]}],live_leaders=3D[{id=3D3,host=3Dhost-1,port=3D9092}]} to broker= host-3:9092 (id: 4 rack: null). Reconnecting to broker. 
WARN [2017-08-26 19:19:27,204] [Controller-6-to-broker-2-send-thread][] ka= fka.controller.RequestSendThread - [Controller-6-to-broker-2-send-thread], = Controller 6 epoch 34 fails to send request {controller_id=3D6,controller_e= poch=3D34,partition_states=3D[{topic=3Dtopic-1,partition=3D0,controller_epo= ch=3D34,leader=3D3,leader_epoch=3D0,isr=3D[3,4,5],zk_version=3D0,replicas= =3D[3,4,5]},{topic=3Dtopic-4,partition=3D0,controller_epoch=3D34,leader=3D-= 2,leader_epoch=3D0,isr=3D[],zk_version=3D0,replicas=3D[0]}],live_brokers=3D= [{id=3D5,end_points=3D[{port=3D9092,host=3Dhost-2,security_protocol_type=3D= 0}],rack=3Dnull},{id=3D1,end_points=3D[{port=3D9092,host=3Dhost-4,security_= protocol_type=3D0}],rack=3Dnull},{id=3D4,end_points=3D[{port=3D9092,host=3D= host-3,security_protocol_type=3D0}],rack=3Dnull},{id=3D6,end_points=3D[{por= t=3D9092,host=3Dhost-6,security_protocol_type=3D0}],rack=3Dnull},{id=3D2,en= d_points=3D[{port=3D9092,host=3Dhost-5,security_protocol_type=3D0}],rack=3D= null},{id=3D3,end_points=3D[{port=3D9092,host=3Dhost-1,security_protocol_ty= pe=3D0}],rack=3Dnull}]} to broker host-5:9092 (id: 2 rack: null). Reconnect= ing to broker. WARN [2017-08-26 19:19:27,205] [Controller-6-to-broker-3-send-thread][] ka= fka.controller.RequestSendThread - [Controller-6-to-broker-3-send-thread], = Controller 6 epoch 34 fails to send request {controller_id=3D6,controller_e= poch=3D34,partition_states=3D[{topic=3Dtopic-1,partition=3D0,controller_epo= ch=3D34,leader=3D3,leader_epoch=3D0,isr=3D[3,4,5],zk_version=3D0,replicas= =3D[3,4,5]}],live_leaders=3D[{id=3D3,host=3Dhost-1,port=3D9092}]} to broker= host-1:9092 (id: 3 rack: null). Reconnecting to broker. 
WARN [2017-08-26 19:19:27,205] [Controller-6-to-broker-1-send-thread][] ka= fka.controller.RequestSendThread - [Controller-6-to-broker-1-send-thread], = Controller 6 epoch 34 fails to send request {controller_id=3D6,controller_e= poch=3D34,partition_states=3D[{topic=3Dtopic-1,partition=3D0,controller_epo= ch=3D34,leader=3D3,leader_epoch=3D0,isr=3D[3,4,5],zk_version=3D0,replicas= =3D[3,4,5]},{topic=3Dtopic-4,partition=3D0,controller_epoch=3D34,leader=3D-= 2,leader_epoch=3D0,isr=3D[],zk_version=3D0,replicas=3D[0]}],live_brokers=3D= [{id=3D2,end_points=3D[{port=3D9092,host=3Dhost-5,security_protocol_type=3D= 0}],rack=3Dnull},{id=3D6,end_points=3D[{port=3D9092,host=3Dhost-6,security_= protocol_type=3D0}],rack=3Dnull},{id=3D5,end_points=3D[{port=3D9092,host=3D= host-2,security_protocol_type=3D0}],rack=3Dnull},{id=3D3,end_points=3D[{por= t=3D9092,host=3Dhost-1,security_protocol_type=3D0}],rack=3Dnull},{id=3D1,en= d_points=3D[{port=3D9092,host=3Dhost-4,security_protocol_type=3D0}],rack=3D= null},{id=3D4,end_points=3D[{port=3D9092,host=3Dhost-3,security_protocol_ty= pe=3D0}],rack=3Dnull}]} to broker host-4:9092 (id: 1 rack: null). Reconnect= ing to broker. 
WARN [2017-08-26 19:19:27,205] [Controller-6-to-broker-6-send-thread][] ka= fka.controller.RequestSendThread - [Controller-6-to-broker-6-send-thread], = Controller 6 epoch 34 fails to send request {controller_id=3D6,controller_e= poch=3D34,partition_states=3D[{topic=3Dtopic-1,partition=3D0,controller_epo= ch=3D34,leader=3D3,leader_epoch=3D0,isr=3D[3,4,5],zk_version=3D0,replicas= =3D[3,4,5]},{topic=3Dtopic-4,partition=3D0,controller_epoch=3D34,leader=3D-= 2,leader_epoch=3D0,isr=3D[],zk_version=3D0,replicas=3D[0]}],live_brokers=3D= [{id=3D5,end_points=3D[{port=3D9092,host=3Dhost-2,security_protocol_type=3D= 0}],rack=3Dnull},{id=3D6,end_points=3D[{port=3D9092,host=3Dhost-6,security_= protocol_type=3D0}],rack=3Dnull},{id=3D2,end_points=3D[{port=3D9092,host=3D= host-5,security_protocol_type=3D0}],rack=3Dnull},{id=3D3,end_points=3D[{por= t=3D9092,host=3Dhost-1,security_protocol_type=3D0}],rack=3Dnull},{id=3D4,en= d_points=3D[{port=3D9092,host=3Dhost-3,security_protocol_type=3D0}],rack=3D= null},{id=3D1,end_points=3D[{port=3D9092,host=3Dhost-4,security_protocol_ty= pe=3D0}],rack=3Dnull}]} to broker host-6:9092 (id: 6 rack: null). Reconnect= ing to broker. WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-2-send-thread][] ka= fka.controller.RequestSendThread - [Controller-6-to-broker-2-send-thread], = Controller 6 epoch 34 fails to send request {controller_id=3D6,controller_e= poch=3D34,partition_states=3D[{topic=3Dtopic-2.2604144,partition=3D0,contro= ller_epoch=3D34,leader=3D1,leader_epoch=3D0,isr=3D[1,2,3],zk_version=3D0,re= plicas=3D[1,2,3]}],live_leaders=3D[{id=3D1,host=3Dhost-4,port=3D9092}]} to = broker host-5:9092 (id: 2 rack: null). Reconnecting to broker. 
WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-3-send-thread][] ka= fka.controller.RequestSendThread - [Controller-6-to-broker-3-send-thread], = Controller 6 epoch 34 fails to send request {controller_id=3D6,controller_e= poch=3D34,partition_states=3D[{topic=3Dtopic-2.2604144,partition=3D0,contro= ller_epoch=3D34,leader=3D1,leader_epoch=3D0,isr=3D[1,2,3],zk_version=3D0,re= plicas=3D[1,2,3]}],live_leaders=3D[{id=3D1,host=3Dhost-4,port=3D9092}]} to = broker host-1:9092 (id: 3 rack: null). Reconnecting to broker. WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-4-send-thread][] ka= fka.controller.RequestSendThread - [Controller-6-to-broker-4-send-thread], = Controller 6 epoch 34 fails to send request {controller_id=3D6,controller_e= poch=3D34,partition_states=3D[{topic=3Dtopic-2.2604144,partition=3D0,contro= ller_epoch=3D34,leader=3D1,leader_epoch=3D0,isr=3D[1,2,3],zk_version=3D0,re= plicas=3D[1,2,3]},{topic=3Dtopic-4,partition=3D0,controller_epoch=3D34,lead= er=3D-2,leader_epoch=3D0,isr=3D[],zk_version=3D0,replicas=3D[0]}],live_brok= ers=3D[{id=3D4,end_points=3D[{port=3D9092,host=3Dhost-3,security_protocol_t= ype=3D0}],rack=3Dnull},{id=3D3,end_points=3D[{port=3D9092,host=3Dhost-1,sec= urity_protocol_type=3D0}],rack=3Dnull},{id=3D5,end_points=3D[{port=3D9092,h= ost=3Dhost-2,security_protocol_type=3D0}],rack=3Dnull},{id=3D6,end_points= =3D[{port=3D9092,host=3Dhost-6,security_protocol_type=3D0}],rack=3Dnull},{i= d=3D1,end_points=3D[{port=3D9092,host=3Dhost-4,security_protocol_type=3D0}]= ,rack=3Dnull},{id=3D2,end_points=3D[{port=3D9092,host=3Dhost-5,security_pro= tocol_type=3D0}],rack=3Dnull}]} to broker host-3:9092 (id: 4 rack: null). R= econnecting to broker. 
WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-5-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-5-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2604144,partition=0,controller_epoch=34,leader=1,leader_epoch=0,isr=[1,2,3],zk_version=0,replicas=[1,2,3]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null}]} to broker host-2:9092 (id: 5 rack: null). Reconnecting to broker.
WARN [2017-08-26 20:46:41,010] [Controller-6-to-broker-6-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-6-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2604144,partition=0,controller_epoch=34,leader=1,leader_epoch=0,isr=[1,2,3],zk_version=0,replicas=[1,2,3]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null}]} to broker host-6:9092 (id: 6 rack: null). Reconnecting to broker.
WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-1-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-1-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2604144,partition=0,controller_epoch=34,leader=1,leader_epoch=0,isr=[1,2,3],zk_version=0,replicas=[1,2,3]}],live_leaders=[{id=1,host=host-4,port=9092}]} to broker host-4:9092 (id: 1 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,288] [Controller-6-to-broker-2-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-2-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]}],live_leaders=[{id=3,host=host-1,port=9092}]} to broker host-5:9092 (id: 2 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,288] [Controller-6-to-broker-1-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-1-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]}],live_leaders=[{id=3,host=host-1,port=9092}]} to broker host-4:9092 (id: 1 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,288] [Controller-6-to-broker-3-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-3-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]}],live_leaders=[{id=3,host=host-1,port=9092}]} to broker host-1:9092 (id: 3 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,288] [Controller-6-to-broker-5-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-5-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null}]} to broker host-2:9092 (id: 5 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,289] [Controller-6-to-broker-4-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-4-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null}]} to broker host-3:9092 (id: 4 rack: null). Reconnecting to broker.
WARN [2017-08-26 23:44:28,289] [Controller-6-to-broker-6-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-6-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-3,partition=0,controller_epoch=34,leader=3,leader_epoch=0,isr=[3,1,2],zk_version=0,replicas=[3,1,2]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null}]} to broker host-6:9092 (id: 6 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,050] [Controller-6-to-broker-6-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-6-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]}],live_leaders=[{id=2,host=host-5,port=9092}]} to broker host-6:9092 (id: 6 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,050] [Controller-6-to-broker-2-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-2-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]}],live_leaders=[{id=2,host=host-5,port=9092}]} to broker host-5:9092 (id: 2 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,059] [Controller-6-to-broker-4-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-4-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null}]} to broker host-3:9092 (id: 4 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,059] [Controller-6-to-broker-3-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-3-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null}]} to broker host-1:9092 (id: 3 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,060] [Controller-6-to-broker-5-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-5-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]}],live_leaders=[{id=2,host=host-5,port=9092}]} to broker host-2:9092 (id: 5 rack: null). Reconnecting to broker.
WARN [2017-08-27 00:10:59,059] [Controller-6-to-broker-1-send-thread][] kafka.controller.RequestSendThread - [Controller-6-to-broker-1-send-thread], Controller 6 epoch 34 fails to send request {controller_id=6,controller_epoch=34,partition_states=[{topic=topic-2.2606546,partition=0,controller_epoch=34,leader=2,leader_epoch=0,isr=[2,5,6],zk_version=0,replicas=[2,5,6]},{topic=topic-4,partition=0,controller_epoch=34,leader=-2,leader_epoch=0,isr=[],zk_version=0,replicas=[0]}],live_brokers=[{id=5,end_points=[{port=9092,host=host-2,security_protocol_type=0}],rack=null},{id=6,end_points=[{port=9092,host=host-6,security_protocol_type=0}],rack=null},{id=2,end_points=[{port=9092,host=host-5,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9092,host=host-4,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9092,host=host-1,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9092,host=host-3,security_protocol_type=0}],rack=null}]} to broker host-4:9092 (id: 1 rack: null). Reconnecting to broker.
|Col A2|

Consumer error log:
||Heading 1||Heading 2||
|2017-08-26 23:47:54,738 WARN [kafka-producer-network-thread | producer-8] o.a.k.c.p.i.Sender [kafka-producer-network-thread | (consumer group)] Got error produce response with correlation id 10322 on topic-partition topic-0, retrying (9 attempts left). Error: NETWORK_EXCEPTION|Col A2|

I currently have 5000 topics with a retention period of 1 hour. The maximum size of data during peak load is 3-4 GB per machine, and I have 6 Kafka broker machines, each with 6 cores and 16 GB RAM.

Can someone please point out if there is something wrong with my approach? Do I need to upgrade to the latest version?
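One way to check whether these warnings match the pattern in the original report (the controller losing its connection to every broker at the same moment) is to group them by timestamp. The following is a minimal sketch, not part of Kafka: the regular expression is an assumption derived from the log lines quoted above, `group_disconnects` is a hypothetical helper name, and the sample lines are shortened copies of the warnings with the request payload replaced by `{...}`.

```python
import re
from collections import defaultdict

# Assumed pattern, based on the controller warnings quoted in this comment:
# "WARN [<date> <time>,<ms>] [Controller-<c>-to-broker-<b>-send-thread]..."
WARN_RE = re.compile(
    r"WARN \[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\] "
    r"\[Controller-(?P<controller>\d+)-to-broker-(?P<broker>\d+)-send-thread\]"
)


def group_disconnects(lines):
    """Map each timestamp (truncated to the second) to the set of broker ids
    for which the controller logged a failed send in that second."""
    events = defaultdict(set)
    for line in lines:
        match = WARN_RE.search(line)
        if match and "fails to send request" in line:
            events[match.group("ts")].add(int(match.group("broker")))
    return dict(events)


if __name__ == "__main__":
    # Shortened sample lines; the real payloads are elided as {...}.
    sample = [
        "WARN [2017-08-26 20:46:41,009] [Controller-6-to-broker-2-send-thread][] "
        "kafka.controller.RequestSendThread - Controller 6 epoch 34 "
        "fails to send request {...} to broker host-5:9092",
        "WARN [2017-08-26 20:46:41,010] [Controller-6-to-broker-6-send-thread][] "
        "kafka.controller.RequestSendThread - Controller 6 epoch 34 "
        "fails to send request {...} to broker host-6:9092",
    ]
    for ts, brokers in sorted(group_disconnects(sample).items()):
        print(ts, sorted(brokers))
```

If every timestamp maps to the full set of broker ids (1 through 6 in this cluster), the disconnects are cluster-wide events rather than problems with individual brokers.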
> Connection from controller to broker disconnects
> ------------------------------------------------
>
>                 Key: KAFKA-3916
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3916
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.9.0.1
>            Reporter: Dave Powell
>            Assignee: Jason Gustafson
>             Fix For: 0.10.1.0
>
>
> We recently upgraded from 0.8.2.1 to 0.9.0.1. Since then, several times per day, the controllers in our clusters have their connection to all brokers disconnected, and then successfully reconnected a few hundred ms later. Each time this occurs we see a brief spike in our 99th percentile produce and consume times, reaching several hundred ms.
> Here is an example of what we're seeing in the controller.log:
> {code}
> [2016-06-28 14:15:35,416] WARN [Controller-151-to-broker-160-send-thread], Controller 151 epoch 106 fails to send request {…} to broker Node(160, broker.160.hostname, 9092). Reconnecting to broker. (kafka.controller.RequestSendThread)
> java.io.IOException: Connection to 160 was disconnected before the response was read
>         at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:87)
>         at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:84)
>         at scala.Option.foreach(Option.scala:236)
>         at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:84)
>         at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:80)
>         at kafka.utils.NetworkClientBlockingOps$.recurse$1(NetworkClientBlockingOps.scala:129)
>         at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollUntilFound$extension(NetworkClientBlockingOps.scala:139)
>         at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:80)
>         at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:180)
>         at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:171)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> ... one each for all brokers (including the controller) ...
> [2016-06-28 14:15:35,721] INFO [Controller-151-to-broker-160-send-thread], Controller 151 connected to Node(160, broker.160.hostname, 9092) for sending state change requests (kafka.controller.RequestSendThread)
> … one each for all brokers (including the controller) ...
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)