Date: Tue, 4 Nov 2014 21:33:35 +0000 (UTC)
From: "Gwen Shapira (JIRA)"
To: dev@kafka.apache.org
Reply-To: dev@kafka.apache.org
Subject: [jira] [Updated] (KAFKA-1555) provide strong consistency with reasonable availability


     [ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gwen Shapira updated KAFKA-1555:
--------------------------------
    Attachment: KAFKA-1555-DOCS.4.patch

Integrated [~jjkoshy]'s comment.

Regarding the correct behavior when acks={0,1}, I suggest taking this to another JIRA or to the mailing list. I think you have a good point, but I'd like to hear from others in the community too, since it may lead to unexpected behavior.

> provide strong consistency with reasonable availability
> -------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, KAFKA-1555-DOCS.2.patch, KAFKA-1555-DOCS.3.patch, KAFKA-1555-DOCS.4.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch
>
>
> In a mission-critical application, we expect a Kafka cluster with 3 brokers to satisfy two requirements:
> 1. When 1 broker is down, no message loss or service blocking happens.
> 2. In worse cases, such as when two brokers are down, service can be blocked, but no message loss happens.
> We found that the current Kafka version (0.8.1.1) cannot meet these requirements because of three behaviors:
> 1. When choosing a new leader from 2 followers in ISR, the one with fewer messages may be chosen as the leader.
> 2. Even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer messages than the leader.
> 3. ISR can contain only 1 broker, so acknowledged messages may be stored on only 1 broker.
> The following is an analytical proof.
> We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning all 3 replicas (leader A, followers B and C) are in sync, i.e., they have the same messages and are all in ISR.
> Depending on the value of request.required.acks (acks for short), there are the following cases.
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirements.
> 2. acks=2. The producer sends a message m, which is acknowledged by A and B. At this point C has not received m, but C is still in ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
> 3. acks=-1. B and C restart and are removed from ISR. The producer sends a message m to A and receives an acknowledgement. A disk failure occurs on A before B and C replicate m. Message m is lost.
> In summary, no existing configuration can satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
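
For illustration only (this is not part of the original report), the acks=2 and acks=-1 scenarios from the quoted description can be replayed with a toy in-memory model. The `Replica` class and the `produce`, `elect_leader`, `case_acks_2` and `case_acks_all` helpers are hypothetical names invented for this sketch; none of this is Kafka code or a Kafka API, and the election deliberately picks the worst-case candidate to show what the 0.8.1.1 behavior allows rather than what it always does.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica: a name, an in-memory log, and a liveness flag."""
    name: str
    log: list = field(default_factory=list)
    alive: bool = True


def produce(message, leader, acked_followers):
    """Append to the leader and to the followers that replicate before the ack."""
    leader.log.append(message)
    for follower in acked_followers:
        follower.log.append(message)


def elect_leader(isr, pick):
    """Toy election: any live ISR member is eligible, regardless of log length."""
    candidates = [r for r in isr if r.alive]
    return pick(candidates)


def case_acks_2():
    """acks=2: A and B acknowledge m, C lags but is still in ISR, then A dies."""
    a, b, c = Replica("A"), Replica("B"), Replica("C")
    isr = [a, b, c]
    produce("m", leader=a, acked_followers=[b])
    a.alive = False
    # Assumption: worst case, the lagging replica C wins the election.
    new_leader = elect_leader(isr, pick=lambda cs: cs[-1])
    return new_leader.name, "m" in new_leader.log


def case_acks_all():
    """acks=-1: ISR has shrunk to {A}, so A alone acknowledges m, then A's disk fails."""
    a, b, c = Replica("A"), Replica("B"), Replica("C")
    isr = [a]  # B and C restarted and were removed from ISR
    produce("m", leader=isr[0], acked_followers=[])
    a.alive = False
    a.log.clear()  # disk failure before B and C replicate m
    return any("m" in r.log for r in (b, c))


print(case_acks_2())    # ('C', False): C leads without the acknowledged message
print(case_acks_all())  # False: no surviving replica holds m
```

In both runs the producer has already received its acknowledgement when the failure happens, which is why the loss is invisible to the producer.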