Date: Thu, 26 Sep 2019 23:42:00 +0000 (UTC)
From: "Sophie Blee-Goldman (Jira)"
To: dev@kafka.apache.org
Subject: [jira] [Created] (KAFKA-8951) Avoid unnecessary rebalances and downtime for "safe" partitions

Sophie Blee-Goldman created KAFKA-8951:
------------------------------------------

Summary: Avoid unnecessary rebalances and downtime for "safe" partitions
Key: KAFKA-8951
URL: https://issues.apache.org/jira/browse/KAFKA-8951
Project: Kafka
Issue Type: Improvement
Components: clients, streams
Reporter: Sophie Blee-Goldman

With cooperative rebalancing, any partition that is encoded in one consumer's Subscription cannot be reassigned to a different consumer during that rebalance. The partition must be removed from the assignment and revoked by its old owner before triggering a second rebalance, during which it can be assigned. This enforces a synchronization barrier so that no two consumers can ever own the same partition at the same time.

This leads to downtime for that partition plus a second rebalance, which may not always be necessary. In Streams, for example, the consumer will pause all partitions of an active task until it is running (i.e., has been initialized and restored). It should be safe to give these partitions away, provided they are not resumed between sending the joinGroup request and receiving the syncGroup response.

One proposal would be to modify two methods in the ConsumerPartitionAssignor interface:

1) ConsumerPartitionAssignor#subscriptionUserData would be passed the set of `ownedPartitions` that will be included in the subscription, allowing it to remove any that it knows are safe to give away.
2) ConsumerPartitionAssignor#onAssignment would be passed the set of revoked partitions, allowing it to remove any that it knows were already reassigned and should not trigger another rebalance.
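
To make the two proposed hooks concrete, here is a rough sketch of how they might behave. It is only an illustration under stated assumptions: `SketchAssignor` and `PausedAwareAssignor` are hypothetical names, the method signatures are simplified stand-ins rather than the real ConsumerPartitionAssignor signatures, and partitions are plain strings such as "input-0" so the example has no Kafka dependency.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for ConsumerPartitionAssignor; NOT the real Kafka interface.
// Partitions are plain strings such as "input-0" to avoid a Kafka dependency.
interface SketchAssignor {
    // Proposed change (1): receives the owned partitions about to be encoded in
    // the subscription and may remove any that are safe to give away.
    void subscriptionUserData(Set<String> ownedPartitions);

    // Proposed change (2): receives the revoked partitions and may remove any
    // that should not trigger another rebalance.
    void onAssignment(Set<String> revokedPartitions);
}

// Hypothetical Streams-like assignor that treats paused partitions as "safe".
class PausedAwareAssignor implements SketchAssignor {
    // Partitions whose task is not yet running (assumed to stay paused
    // between the joinGroup request and the syncGroup response).
    private final Set<String> paused = new HashSet<>();
    // Partitions left out of the most recent subscription.
    private final Set<String> givenAway = new HashSet<>();

    void pause(String partition) {
        paused.add(partition);
    }

    @Override
    public void subscriptionUserData(Set<String> ownedPartitions) {
        givenAway.clear();
        for (String partition : paused) {
            // Drop paused partitions from the owned set and remember them.
            if (ownedPartitions.remove(partition)) {
                givenAway.add(partition);
            }
        }
    }

    @Override
    public void onAssignment(Set<String> revokedPartitions) {
        // Partitions that were never advertised as owned could be reassigned
        // in the same rebalance, so they need no follow-up rebalance.
        revokedPartitions.removeAll(givenAway);
        givenAway.clear();
    }
}
```

In this sketch, a consumer that owns "input-0" and "input-1" with "input-1" paused would advertise only "input-0" as owned. If both partitions are later revoked, only "input-0" remains in the revoked set, so only it would call for a second rebalance.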