kafka-dev mailing list archives

From "Jun Rao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
Date Wed, 08 Oct 2014 00:07:35 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162833#comment-14162833 ]

Jun Rao commented on KAFKA-1555:

The patch that Gwen provided (using a min.isr topic level config) looks good to me (other
than a few minor comments). If anyone else is interested in reviewing, please take another
look. If there is no objection, I will most likely commit the patch once the remaining minor
comments are resolved.
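
For reviewers who want a quick mental model of the min.isr idea, here is a minimal Python sketch. It is not the patch itself: the class, method, and exception names are hypothetical, and it assumes the check applies only when acks=-1 and rejects the write when the ISR is smaller than the configured minimum.

```python
class NotEnoughReplicas(Exception):
    """Hypothetical error: the ISR is smaller than the configured min.isr."""


class Partition:
    """Toy model of one partition; not Kafka's actual implementation."""

    def __init__(self, replicas, min_isr):
        self.isr = set(replicas)              # replicas currently in sync
        self.min_isr = min_isr                # models the topic-level min.isr
        self.log = {r: [] for r in replicas}  # per-replica message log

    def append(self, message, acks):
        # Assumption: only acks=-1 (wait for the whole ISR) is guarded.
        if acks == -1 and len(self.isr) < self.min_isr:
            raise NotEnoughReplicas(
                f"ISR size {len(self.isr)} < min.isr {self.min_isr}")
        for replica in self.isr:
            self.log[replica].append(message)
        return len(self.isr)                  # copies that hold the message
```

With min_isr=2, a write with acks=-1 fails as soon as the ISR shrinks to a single broker, so an acknowledged message always has at least two copies. The cost is that producers are blocked while only one replica is in sync, which matches the second requirement in the issue description.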

> provide strong consistency with reasonable availability
> -------------------------------------------------------
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions:
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>         Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch,
KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch
> In a mission-critical application, we expect that a Kafka cluster with 3 brokers can satisfy
two requirements:
> 1. When 1 broker is down, no message loss or service blocking happens.
> 2. In worse cases, such as when two brokers are down, service can be blocked, but no message
loss happens.
> We found that the current Kafka version ( cannot achieve the requirements due to
three of its behaviors:
> 1. When choosing a new leader from 2 followers in ISR, the one with fewer messages may
be chosen as the leader.
> 2. Even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer
messages than the leader.
> 3. ISR can contain only 1 broker, therefore acknowledged messages may be stored in only
1 broker.
> The following is an analytical proof. 
> We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at
the beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e., they have the
same messages and are all in ISR.
> According to the value of request.required.acks (acks for short), there are the following cases:
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
> 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although
C hasn't received m, C is still in ISR. If A is killed, C can be elected as the new leader,
and consumers will miss m.
> 3. acks=-1. B and C restart and are removed from ISR. Producer sends a message m to A,
and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message
m is lost.
> In summary, no existing configuration can satisfy the requirements.
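
The acks=2 and acks=-1 cases in the quoted description can be replayed with a small Python simulation. This is only a sketch of the argument, not broker code: the function names are hypothetical, and the election deliberately picks the unlucky survivor (C) to show the worst case the report describes.

```python
def elect_leader(isr, failed):
    """Return a surviving ISR member, or None if there is none.

    Any survivor is a legal choice in the report's model; this sketch
    takes the last one by name so that C is chosen over B.
    """
    survivors = sorted(r for r in isr if r not in failed)
    return survivors[-1] if survivors else None


def acks_2_scenario():
    logs = {"A": [], "B": [], "C": []}
    isr = {"A", "B", "C"}
    # m is acknowledged by A and B; C has not fetched it but stays in ISR.
    logs["A"].append("m")
    logs["B"].append("m")
    leader = elect_leader(isr, failed={"A"})  # A is killed
    return leader, "m" in logs[leader]


def acks_all_scenario():
    logs = {"A": [], "B": [], "C": []}
    isr = {"A"}  # B and C restarted and were removed from ISR
    for replica in isr:
        logs[replica].append("m")  # acks=-1 is satisfied by A alone
    # Disk failure in A: only the logs of B and C survive.
    surviving = {r: log for r, log in logs.items() if r != "A"}
    return any("m" in log for log in surviving.values())
```

In this model, acks_2_scenario() ends with C as leader and m missing from its log, and acks_all_scenario() finds no surviving copy of m. Both outcomes correspond to the message loss described in cases 2 and 3.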

This message was sent by Atlassian JIRA
