Subject: managing a Kafka consumer group using Helix
From: vlad.gm@gmail.com
To: user@helix.apache.org
Date: Wed, 3 Dec 2014 10:57:23 -0800

Dear all,

I am sure the following question has come up inside LinkedIn before :)

We would like to manage a Kafka consumer group using Helix, that is, have multiple consumer instances and assign topics and their partitions among the consumers automatically. The consumer group would use a whitelist to select the topics, which means that the topic/partition list is dynamic and can change quite frequently. I can see each topic mapping to a Helix resource or, alternatively, a single Helix resource handling all topics. We are most likely to use a custom rebalancer, so that traffic is balanced by throughput metrics rather than by partition count.

Here are a few questions:
1) If we use a resource per topic, would we later be able to rebalance multiple resources jointly? The current rebalancer callback seems to handle a single resource. Would we have to manage the multiple resources ourselves in the background and use the callback only to report the mapping when asked about a given resource?
2) If we put all topics and their partitions in a single resource, we are likely to quickly exceed the amount of data that can be stored in a single ZK node. I remember that buckets can help with that problem. Can the number of buckets grow dynamically with the number of partitions?
3) How big a problem would it be to have an environment in which the set of administered partitions changes quite often? I guess that with one resource per topic this would not be a big issue; however, it might be a problem with a single resource for all topics.
4) Is there a limit on the number of resources that can be stored in a single cluster, again because of the amount of data that fits in a single ZK node?

To make the questions concrete, I have put a few rough sketches of what we have in mind after my signature.

Regards,
Vlad
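P.S. For question 1, here is roughly the callback shape we are looking at, modeled on the user-defined Rebalancer interface as of Helix 0.6.x (the exact package and signature may differ between versions, and pickByThroughput is a hypothetical helper of ours, not a Helix API):

import java.util.ArrayList;
import java.util.List;

import org.apache.helix.HelixManager;
import org.apache.helix.controller.rebalancer.Rebalancer;
import org.apache.helix.controller.stages.ClusterDataCache;
import org.apache.helix.controller.stages.CurrentStateOutput;
import org.apache.helix.model.IdealState;

public class ThroughputRebalancer implements Rebalancer {
  private HelixManager manager;

  @Override
  public void init(HelixManager manager) {
    this.manager = manager;
  }

  @Override
  public IdealState computeNewIdealState(String resourceName,
      IdealState currentIdealState, CurrentStateOutput currentStateOutput,
      ClusterDataCache clusterData) {
    // The controller invokes this once per resource, so a joint,
    // cross-topic decision would have to be computed elsewhere and only
    // reported here for the one resource being asked about.
    List<String> liveInstances =
        new ArrayList<String>(clusterData.getLiveInstances().keySet());
    for (String partition : currentIdealState.getPartitionSet()) {
      List<String> preferenceList = pickByThroughput(partition, liveInstances);
      currentIdealState.setPreferenceList(partition, preferenceList);
    }
    return currentIdealState;
  }

  // Hypothetical metric-driven policy; trivial placeholder for the sketch.
  private List<String> pickByThroughput(String partition, List<String> live) {
    List<String> pref = new ArrayList<String>();
    pref.add(live.get((partition.hashCode() & 0x7fffffff) % live.size()));
    return pref;
  }
}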
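For question 2, this is how I understand bucketing would be switched on for a single all-topics resource (a minimal sketch; the ZK address, cluster name, resource name, partition count, and bucket size are made-up values):

import org.apache.helix.HelixAdmin;
import org.apache.helix.manager.zk.ZKHelixAdmin;
import org.apache.helix.model.IdealState;

public class BucketedResourceSetup {
  public static void main(String[] args) {
    HelixAdmin admin = new ZKHelixAdmin("localhost:2181");
    // One resource holding every whitelisted topic/partition.
    admin.addResource("KAFKA_CONSUMERS", "allTopics", 4096, "OnlineOffline");
    IdealState is = admin.getResourceIdealState("KAFKA_CONSUMERS", "allTopics");
    // Spread the ideal state over multiple ZNodes instead of one.
    is.setBucketSize(64);
    admin.setResourceIdealState("KAFKA_CONSUMERS", "allTopics", is);
  }
}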
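And for question 3, the kind of background whitelist-to-resource synchronization we would run with one resource per topic (partitionCount is a hypothetical helper that would read Kafka topic metadata):

import java.util.List;

import org.apache.helix.HelixAdmin;

public class TopicWhitelistSync {
  private static final String CLUSTER = "KAFKA_CONSUMERS";

  // Called whenever the whitelisted topic list is re-evaluated.
  public static void sync(HelixAdmin admin, List<String> whitelisted) {
    List<String> existing = admin.getResourcesInCluster(CLUSTER);
    for (String topic : whitelisted) {
      if (!existing.contains(topic)) {
        admin.addResource(CLUSTER, topic, partitionCount(topic), "OnlineOffline");
        admin.rebalance(CLUSTER, topic, 1); // one consumer per partition
      }
    }
    for (String resource : existing) {
      if (!whitelisted.contains(resource)) {
        admin.dropResource(CLUSTER, resource); // topic fell off the whitelist
      }
    }
  }

  private static int partitionCount(String topic) {
    return 8; // placeholder; the real value would come from Kafka metadata
  }
}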