kafka-dev mailing list archives

From "Emanuele Cesena (JIRA)" <j...@apache.org>
Subject [jira] [Created] (KAFKA-4666) Failure test for Kafka configured for consistency vs availability
Date Tue, 17 Jan 2017 20:15:26 GMT
Emanuele Cesena created KAFKA-4666:
--------------------------------------

             Summary: Failure test for Kafka configured for consistency vs availability
                 Key: KAFKA-4666
                 URL: https://issues.apache.org/jira/browse/KAFKA-4666
             Project: Kafka
          Issue Type: Improvement
            Reporter: Emanuele Cesena
         Attachments: consistency_test.py

We recently had an issue with our Kafka setup because of a misconfiguration.

In short, we thought we had configured Kafka for durability, but we hadn't set the producers
to acks=all. During a full outage, some partitions ended up "partitioned": the followers
started without properly waiting for the right leader, and so we lost data. Again, this is
not an issue with Kafka, but a misconfiguration on our side.

I think we have reproduced the issue, and we built a Docker test showing that, if the producer
isn't set with acks=all, data can be lost during an almost full outage. The test is attached.

I was thinking of sending a PR, but wanted to run this by you first, as the test doesn't
necessarily prove that a feature works as expected.

In addition, I think the documentation could be slightly improved, for instance in the section:
http://kafka.apache.org/documentation/#design_ha
by clearly stating that there are three steps required to configure Kafka for consistency,
the third being that producers must be set with acks=all (which is currently part of the 2nd point).
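
To make the three steps concrete, here is a minimal sketch of a check over plain config
dictionaries. The function name and the dictionaries are hypothetical (they are not part of
Kafka or of the attached consistency_test.py); only the setting names
unclean.leader.election.enable, min.insync.replicas and acks are real Kafka settings. The
threshold of 2 for min.insync.replicas, and treating a missing setting as a problem, are
assumptions made for this illustration.

```python
# Hypothetical helper for illustration only; not part of Kafka or of consistency_test.py.
def check_consistency_config(topic_config, producer_config):
    """Return a list of problems; an empty list means all three checks pass."""
    problems = []

    # Step 1: unclean leader election must be explicitly disabled.
    # Assumption: a missing setting is treated as "not disabled".
    unclean = str(topic_config.get("unclean.leader.election.enable", "true")).lower()
    if unclean != "false":
        problems.append("unclean.leader.election.enable should be false")

    # Step 2: a minimum in-sync replica count must be set.
    # Assumption: we require at least 2, so one acknowledged copy is never the only copy.
    min_isr = int(topic_config.get("min.insync.replicas", 1))
    if min_isr < 2:
        problems.append("min.insync.replicas should be at least 2")

    # Step 3: the producer must wait for all in-sync replicas ("-1" is equivalent to "all").
    acks = str(producer_config.get("acks", "1")).lower()
    if acks not in ("all", "-1"):
        problems.append("producer acks should be all")

    return problems


if __name__ == "__main__":
    topic = {"unclean.leader.election.enable": "false", "min.insync.replicas": "2"}
    # The misconfiguration described above: topic settings are fine, producer is not.
    for problem in check_consistency_config(topic, {"acks": "1"}):
        print(problem)
```

With the topic settings in place but acks=1 on the producer, only the third check fails,
which matches the situation we ran into.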

Please let me know what you think, and I can send a PR if you agree.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
