From: "Yu, Libo"
To: users@kafka.apache.org
Subject: RE: broker never comes back to ISR
Date: Wed, 21 Aug 2013 14:59:05 +0000

I checked the log from a normal restart. The replication manager should start to handle leader and ISR requests once the server is up. What could stop it from doing that? Could it be caused by the missing mx4j-tools.jar?

Regards,

Libo

From: Yu, Libo [ICG-IT]
Sent: Wednesday, August 21, 2013 10:51 AM
To: 'users@kafka.apache.org'
Subject: broker never comes back to ISR

Hi team,

We have three Kafka brokers in a production cluster, and we use replication factor 3 for all topics. Quite frequently we notice that one broker is not in the ISR. Sometimes it goes back into the ISR after it is restarted; sometimes it does not, even after a restart.

In today's case, after a broker was restarted, this is what we found in the log:

[2013-08-21 08:22:55,524] INFO [Kafka Server 2], started (kafka.server.KafkaServer)
[2013-08-21 08:25:06,621] INFO Closing socket connection to /xxx.xx.xx.xx. (kafka.network.Processor)
[2013-08-21 08:25:06,716] INFO Closing socket connection to /xxx.xx.xx.xx. (kafka.network.Processor)
[2013-08-21 08:27:19,824] INFO Closing socket connection to /xxx.xx.xx.xx. (kafka.network.Processor)
[2013-08-21 08:28:16,711] INFO Closing socket connection to /xxx.xx.xx.xx. (kafka.network.Processor)
[2013-08-21 08:28:17,978] INFO Closing socket connection to /xxx.xx.xx.xx. (kafka.network.Processor)
...

There are numerous "Closing socket connection" entries and nothing else.

Any guidance will be appreciated.

Regards,

Libo