Date: Tue, 8 Jan 2013 20:00:12 +0000 (UTC)
From: "Neha Narkhede (JIRA)"
To: dev@kafka.apache.org
Reply-To: dev@kafka.apache.org
Subject: [jira] [Updated] (KAFKA-687) Rebalance algorithm should consider partitions from all topics

     [ https://issues.apache.org/jira/browse/KAFKA-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-687:
--------------------------------
    Affects Version/s: 0.8.1

> Rebalance algorithm should consider partitions from all topics
> --------------------------------------------------------------
>
>                 Key: KAFKA-687
>                 URL: https://issues.apache.org/jira/browse/KAFKA-687
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.8.1
>            Reporter: Pablo Barrera
>
> The current rebalance step, as stated in the original Kafka paper [1], splits the partitions per topic between all the consumers.
> So if you have 100 topics with 2 partitions each and 10 consumers, only two consumers will be used. That is, for each topic, all partitions are listed and shared among the consumers in the consumer group in order (not randomly).
> If the consumer group is reading from several topics at the same time, it makes sense to split all the partitions from all topics among all the consumers. Following the example, we would have 200 partitions in total, 20 per consumer, using all 10 consumers.
> The load per topic could be different, and the division should consider this. However, even a random division should be better than the current algorithm when reading from several topics, and should not harm reading from a few topics with several partitions.
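
To make the arithmetic in the example concrete, here is a rough sketch in Python. It is not the Kafka consumer code. The function names (assign_per_topic, assign_across_topics), the topic and consumer ids, and the round-robin dealing in the second function are all assumptions made for illustration. The issue only asks that partitions from all topics be considered together and does not prescribe a particular scheme.

```python
from collections import defaultdict


def assign_per_topic(topics, consumers):
    """Hypothetical model of the per-topic split: each topic's partitions are
    divided in order among the sorted consumers, independently of other topics."""
    assignment = defaultdict(list)
    ordered = sorted(consumers)
    for topic, num_partitions in sorted(topics.items()):
        base, extra = divmod(num_partitions, len(ordered))
        start = 0
        for i, consumer in enumerate(ordered):
            # The first `extra` consumers take one partition more than the rest.
            count = base + (1 if i < extra else 0)
            for partition in range(start, start + count):
                assignment[consumer].append((topic, partition))
            start += count
    return assignment


def assign_across_topics(topics, consumers):
    """Assumed alternative: pool the partitions of all topics first, then deal
    them out round-robin (one of several possible global schemes)."""
    assignment = defaultdict(list)
    ordered = sorted(consumers)
    pooled = [(t, p) for t, n in sorted(topics.items()) for p in range(n)]
    for i, topic_partition in enumerate(pooled):
        assignment[ordered[i % len(ordered)]].append(topic_partition)
    return assignment


# The scenario from the issue: 100 topics, 2 partitions each, 10 consumers.
topics = {f"topic-{i:03d}": 2 for i in range(100)}
consumers = [f"consumer-{i:02d}" for i in range(10)]

per_topic = assign_per_topic(topics, consumers)
pooled = assign_across_topics(topics, consumers)

# Print how many partitions each consumer that received any ends up owning.
print(sorted((c, len(p)) for c, p in per_topic.items()))
print(sorted((c, len(p)) for c, p in pooled.items()))
```

Under these assumptions the per-topic split leaves only two consumers with work (100 partitions each), while the pooled split gives each of the 10 consumers 20 partitions. Neither function accounts for differing load per topic, which the issue notes a better division should also consider.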