kafka-users mailing list archives

From Bob Jervis <bjer...@visibletechnologies.com>
Subject Producers errors when failing a broker in a replicated 0.8 cluster.
Date Mon, 11 Feb 2013 17:27:23 GMT
We are in final testing of Kafka, and so far the fail-over tests have been encouraging.
If we kill (-9) one of two Kafka brokers, with replication factor 2, we see a flurry of activity
as the producer fails and retries its writes (we use a bulk, synchronous send of 1000 messages
at a time, each message about 1 KB long). Sometimes the library finds the newly elected leader before
returning to the application, and sometimes it doesn't. We added retry/backoff logic to our
code, and we don't seem to be losing content.
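
For illustration, retry/backoff logic of this kind might look roughly like the sketch below. It is not the code from our application and it does not use the Kafka client: `send_batch` is a hypothetical callable standing in for the synchronous bulk send, assumed to raise an exception when the send fails, and the attempt count and delays are arbitrary placeholder values.

```python
import time


def send_with_retry(send_batch, batch, max_attempts=5, base_delay=0.1,
                    sleep=time.sleep):
    """Call send_batch(batch) until it succeeds or attempts run out.

    send_batch is a hypothetical stand-in for a synchronous bulk send
    that raises on failure. Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send_batch(batch)
            return attempt
        except Exception:
            if attempt == max_attempts:
                # Out of attempts: let the caller see the last error.
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            # This gives a newly elected leader time to become visible.
            sleep(base_delay * 2 ** (attempt - 1))
```

Note that the whole batch is re-sent on every attempt, so any messages that were accepted before the failure may be written more than once.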

However, we have another app in the pipeline that does a fan-out from one Kafka topic to dozens
of topics.  We still use a single, synchronous, bulk send.

My question is: what are the semantics of a bulk send like that when one broker dies but
the topic leaders have been spread across both brokers? Do we get any feedback on which messages
went through and which were dropped because their leader just died? For our own transaction handling
we can mark messages as 'retries' if we suspect that a send may not have completed cleanly, but
if we can reliably tell which messages have been delivered, we can skip re-sending them
and avoid the extra work on the client side.
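
To make the question concrete, one client-side workaround would be to split the fan-out batch by topic and send each group separately, so that a failure is at least known per topic. The sketch below assumes the send reports only success or failure for a whole call. `send_batch(topic, payloads, is_retry)` is a hypothetical callable, not a Kafka API, and the `is_retry` flag is just our own 'retries' marker.

```python
from collections import defaultdict


def fan_out(send_batch, messages, max_attempts=3):
    """Send (topic, payload) pairs grouped by topic, retrying failed topics.

    send_batch(topic, payloads, is_retry) is a hypothetical stand-in for
    a synchronous send to one topic; it raises if the send fails.
    Returns (undelivered, retried): a dict of topic -> payloads that
    never succeeded, and the set of topics that needed a retry.
    """
    pending = defaultdict(list)
    for topic, payload in messages:
        pending[topic].append(payload)

    retried = set()
    for _ in range(max_attempts):
        failed = {}
        for topic, payloads in pending.items():
            try:
                # Flag re-sends so downstream can check for duplicates.
                send_batch(topic, payloads, is_retry=topic in retried)
            except Exception:
                failed[topic] = payloads
        retried.update(failed)
        pending = failed
        if not pending:
            break
    return dict(pending), retried
```

Topics whose send succeeded are never re-sent, which is the saving we are after. A failed send may still have been partially written, which is why the re-send is flagged. Backoff between passes is left out to keep the sketch short.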

Thanks for any insight,

Bob Jervis | Senior Architect

Seattle | Boston | New York | London
Phone: 425.957.6075 | Fax: 781.404.5711
