Date: Wed, 3 May 2017 17:23:04 +0000 (UTC)
From: "Colin P. McCabe (JIRA)"
To: dev@kafka.apache.org
Subject: [jira] [Commented] (KAFKA-5004) poll() timeout not enforced when connecting to 0.10.0 broker

    [ https://issues.apache.org/jira/browse/KAFKA-5004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995257#comment-15995257 ]

Colin P. McCabe commented on KAFKA-5004:
----------------------------------------

Thanks for filing this, [~mjsax]. I think the severity is mitigated somewhat by the fact that there has to be a client-side bug (the polling thread dies) to trigger the bad behavior.

bq. IMHO, a "clean" solution would be to disable the heartbeat thread if the client connects to a 0.10.0 broker and send heartbeats on poll(), as the 0.10.0 consumer does. Not sure how complex this would be to do, though.

I think this would be a bit risky, since we'd be adding code that only ever gets used in a very obscure error path when talking to 0.10.0 brokers. It's not likely to be well-tested.

bq. [~cmccabe] had the idea to set a "flag" on the heartbeat thread each time poll() is called, and let the heartbeat thread stop if max.poll.interval.ms has passed and the flag was not "renewed".

Yeah, this might be a good option.

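To make the "flag" idea concrete, here is a minimal sketch of how it could look. It assumes the flag is simply the timestamp of the last poll() call. The class name PollLivenessFlag, its methods, and the injectable clock are all hypothetical names for illustration; none of this is taken from the Kafka client code.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical helper for illustration only; not part of the Kafka client.
final class PollLivenessFlag {
    private final long maxPollIntervalMs; // plays the role of max.poll.interval.ms
    private final LongSupplier clockMs;   // injectable clock (assumption, eases testing)
    private final AtomicLong lastPollMs;  // the "flag": time of the most recent poll()

    PollLivenessFlag(long maxPollIntervalMs, LongSupplier clockMs) {
        this.maxPollIntervalMs = maxPollIntervalMs;
        this.clockMs = clockMs;
        this.lastPollMs = new AtomicLong(clockMs.getAsLong());
    }

    // Called by the polling thread on every poll(): renews the flag.
    void recordPoll() {
        lastPollMs.set(clockMs.getAsLong());
    }

    // Called by the heartbeat thread: true while the last poll() is recent enough.
    boolean shouldSendHeartbeat() {
        return clockMs.getAsLong() - lastPollMs.get() <= maxPollIntervalMs;
    }

    // Heartbeat loop sketch: keeps sending until the flag expires or the
    // thread is interrupted. Returns the number of heartbeats sent.
    int runHeartbeatLoop(Runnable sendHeartbeat, long heartbeatIntervalMs) {
        int sent = 0;
        while (shouldSendHeartbeat()) {
            sendHeartbeat.run();
            sent++;
            try {
                Thread.sleep(heartbeatIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                break;
            }
        }
        return sent;
    }
}
```

With System::currentTimeMillis as the clock, a dead polling thread would stop renewing the flag, the loop would exit after max.poll.interval.ms, and the broker's session timeout would then expire as it does for a 0.10.0 consumer.
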
> poll() timeout not enforced when connecting to 0.10.0 broker
> ------------------------------------------------------------
>
>                 Key: KAFKA-5004
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5004
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer
>    Affects Versions: 0.10.2.0
>            Reporter: Matthias J. Sax
>
> In 0.10.1, the heartbeat thread and the new poll timeout {{max.poll.interval.ms}} were introduced via KIP-62. In 0.10.2, we added client-broker backward compatibility.
> Now, if a 0.10.2 client connects to a 0.10.0 broker, the broker only understands the heartbeat timeout but not the poll timeout, while the client is still using the background heartbeat thread. Thus, the new client config {{max.poll.interval.ms}} is ignored.
> In the worst case, the polling thread might die while the heartbeat thread is still up. The broker would then not time out the client and no rebalance would be triggered, while at the same time the client is effectively dead and makes no progress on its assigned partitions.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)