kafka-users mailing list archives

From Stan Rosenberg <stan.rosenb...@gmail.com>
Subject Re: resilient producer
Date Tue, 15 Jan 2013 22:30:14 GMT

Thanks for your insight! More comments are below.

On Tue, Jan 15, 2013 at 3:18 PM, Jay Kreps <jay.kreps@gmail.com> wrote:

> I can't speak for all users, but at LinkedIn we don't do this. We just run
> Kafka as a high-availability system (i.e. something not allowed to be
> down). These kinds of systems require more care, but we already have a
> number of such data systems. We chose this approach because local queuing
> leads to disk/data management problems on all producers (and we have
> thousands) and also late data. Late data makes aggregation very hard since
> there will always be more data coming so the aggregate ends up not matching
> the base data.

Yep, we're facing the same problem with respect to late data.  I'd like to
see alternative solutions to this problem, but I am afraid it's an
undecidable problem in general.

> This has led us to a path of working on reliability of the service itself
> rather than a store-and-forward model.
> Likewise the model itself doesn't necessarily work--as you get to thousands
> of producers, then some of those will likely go hard down if the cluster
> has non-trivial periods of non-availability, and the data you queued
> locally is gone since you have no fault-tolerance for that.

Right.  So, you're essentially trading late data for potentially lost data?

> So that was our rationale, but you could easily go the other way. There is
> nothing in kafka that prevents producer-side queueing. I could imagine three
> possible implementations:
> 1. Many people who want this are basically doing log aggregation. If this
> is the case the collector process on the machine would just pause its
> collecting if the cluster is unavailable.
> 2. Alternately it would be possible to embed the kafka log (which is a
> standalone system) in the producer and use it for journalling in the case
> of errors. Then there could be a background thread that tries to push these
> stored messages out.
> 3. One could just catch any exceptions the producer throws and implement
> (2) external to the Kafka client.

Option 2 sounds promising.
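
To make the idea concrete, here is a rough sketch of the simpler variant
(option 3): catch whatever the producer throws, append the message to a local
journal file, and retry later. It does not use the Kafka client or the embedded
Kafka log at all. The `send` callable, the `JournalingProducer` class and the
journal file format are all hypothetical names made up for illustration, and a
real version would call `flush()` from a background thread.

```python
import json
import os
from typing import Callable, List


class JournalingProducer:
    """Wraps a send function and journals messages locally when sending fails.

    `send` is a hypothetical stand-in for the real producer call; it is
    assumed to raise an exception when the cluster is unavailable.
    """

    def __init__(self, send: Callable[[str], None], journal_path: str) -> None:
        self._send = send
        self._journal_path = journal_path

    def produce(self, message: str) -> bool:
        """Try to send; on any exception, append the message to the journal.

        Returns True if the message was sent immediately.
        """
        try:
            self._send(message)
            return True
        except Exception:
            # One JSON-encoded message per line keeps embedded newlines safe.
            with open(self._journal_path, "a", encoding="utf-8") as journal:
                journal.write(json.dumps(message) + "\n")
            return False

    def pending(self) -> List[str]:
        """Return journaled messages in the order they were written."""
        if not os.path.exists(self._journal_path):
            return []
        with open(self._journal_path, encoding="utf-8") as journal:
            return [json.loads(line) for line in journal if line.strip()]

    def flush(self) -> int:
        """Re-send journaled messages in order, stopping at the first failure.

        Returns the number of messages successfully re-sent.
        """
        messages = self.pending()
        if not messages:
            return 0
        sent = 0
        for message in messages:
            try:
                self._send(message)
            except Exception:
                break  # still unavailable; keep the rest for the next attempt
            sent += 1
        # Rewrite the journal with only the unsent messages, via a temp file
        # so a crash mid-write does not truncate the journal.
        tmp_path = self._journal_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as journal:
            for message in messages[sent:]:
                journal.write(json.dumps(message) + "\n")
        os.replace(tmp_path, self._journal_path)
        return sent
```

This still has both problems you mention: the journal lives on a single disk,
so it is lost if the producer machine goes hard down, and everything replayed by
`flush()` arrives late.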
