Mailing-List: contact users-help@kafka.apache.org; run by ezmlm
Precedence: bulk
Reply-To: users@kafka.apache.org
Received-SPF: pass (nike.apache.org: domain of felix.giguere@mate1inc.com
 designates 74.125.245.72 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CABvsF6skLTYTq5G=C2uim3TXXeZ1awNjFpWEi+OPVQk0APiK7g@mail.gmail.com>
References: 
 <CAG2rwuPWF7NhoMAwed7KQvwKt1usoRWPAv7ox5FP6-9V-V_X=Q@mail.gmail.com>
 <CAECHK7tpzm=xjBrPzotz4g-QAF=O-Hg4_3mtoO3LjsFUvXkNgA@mail.gmail.com>
 <CABvsF6skLTYTq5G=C2uim3TXXeZ1awNjFpWEi+OPVQk0APiK7g@mail.gmail.com>
From: Felix GV <felix@mate1inc.com>
Date: Tue, 15 Jan 2013 00:24:17 -0500
Message-ID: 
 <CAECHK7sBFZEzX5N227=3EZcw-RfSb3NBb4DdwO=1Sy4JMCkSfQ@mail.gmail.com>
Subject: Re: Is this a good overview of kafka?
To: users@kafka.apache.org, stan.rosenberg@gmail.com
Content-Type: multipart/alternative; boundary=047d7b6d9624547f1704d34cfa6b

--047d7b6d9624547f1704d34cfa6b
Content-Type: text/plain; charset=ISO-8859-1

Sure, I'll try to give a better explanation :)

Little disclaimer though: My knowledge is based on my reading of the Kafka
design paper <http://kafka.apache.org/design.html> more than a year ago, so
right off the bat, it's possible that I may be forgetting or assuming
things which I shouldn't... Also, Kafka was pre-0.7 at the time, and we've
been running 0.7.0-ish in prod for a while now, so it's possible that some
of my understanding is outdated in the context of 0.7.2, and there are
definitely a fait bit of things that changed in 0.8 but I don't know what
changed well enough to make informed statements about 0.8. All that to say
that you should take your version of Kafka into account. And it certainly
doesn't hurt to read the design paper either ;)

So, my understanding is that when a Kafka broker comes online:

   - The broker contacts the ZK ensemble and registers itself. It also
   registers partitions for each of to the topics that exist in ZK (and
   according to the settings its own broker config file).
   - Producers are watching the online partitions in ZK, and when it
   changes, ZK fires off an event to them so that they can update their
   partition count. This partition count is used as a modulo on the hash
   returned by the brokers' partitioning function. So even if you have a
   custom partitioning function that deterministically gives out the same hash
   for a given bucket of messages, if you apply a different modulo to that
   hash, then on course it's going to make the messages of that bucket go to a
   different partition. This is done so that all online partitions get to have
   some data.
   - Consumers are also watching the online partitions in ZK. When it
   changes, ZK fires off an event to them, and they start re-balancing, so
   that the partitions are spread as fairly as possible between the consumers.
   In that process, partitions are assigned to consumers, and those
   partition-assignments could (and may very well) be different than the ones
   that were in place before the re-balance.

When a Kafka broker goes offline, if also affects the online partition
count, so the producers will again send their messages to different
partitions (so that all messages have somewhere to go) and the consumers
will re-balance again (to prevent starving a consumer whose partitions
became unavailable).

When a consumer goes online:

   - The consumer registers itself in ZK using its consumer group.
   - If there are other consumers watching that consumer group, then they
   will get notified and a re-balance of the whole group will be triggered,
   just like in the above case.

When a consumer goes offline, a re-balance is triggered as well for the
same reasons.

In the case of consumers going online or offline, this does not change the
ordering guarantees within the partitions per say. BUT, if your consumers
were keeping any sort of internal state in relation to the ordered data
they were consuming, then that state won't be relevant anymore, because
they will start consuming form different partitions after the rebalance.
Depending on the type of processing you're doing, that may or may not break
the work your consumer is doing.

Thus, the only event that does has no chance of affecting the stickiness of
a (data bucket ==> consumer process), is producers going online or offline.
Broker changes definitely alter which message buckets go into which
partitions. Consumer changes don't affect the content of partitions, but it
does change which consumer is consuming which partition.

If ordering guarantees are important to you, then I guess the best thing to
do might be to add watches on the same type of stuff that triggers the
changes described above, and to act accordingly when those changes happen
(by flushing the internal state, restarting the consumers, rolling back the
ZK offset to some checkpoint in the past, or whatever else is relevant in
your use case...)

Hopefully that was clear (and accurate) enough...!

--
Felix


On Mon, Jan 14, 2013 at 9:38 PM, Stan Rosenberg <stan.rosenberg@gmail.com>wrote:

> Hi Felix,
>
> Would you mind elaborating on what you said regarding the ordering
> guaranteed; inlined below.
>
> Thanks,
>
> stan
>
> On Mon, Jan 14, 2013 at 6:08 PM, Felix GV <felix@mate1inc.com> wrote:
>
> >
> > For example if you partitioned using a User ID field within the messages,
> > you would be
> > guaranteed that all messages pertaining to a certain user would end up in
> > the same partition, and that they would be correctly ordered. You should
> be
> > aware, however, that this guarantee is only maintained as long as there
> are
> > no consumer re-balance (which happens when adding or removing a consumer
> or
> > a broker).
> >
>
> Why would consumer re-balance or broker failure alter the above partition
> invariant?
>

--047d7b6d9624547f1704d34cfa6b--