kafka-dev mailing list archives

From "Flavio Junqueira (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (KAFKA-1211) Hold the produce request with ack > 1 in purgatory until replicas' HW has larger than the produce offset
Date Tue, 02 Aug 2016 16:15:20 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404245#comment-15404245 ]

Flavio Junqueira edited comment on KAFKA-1211 at 8/2/16 4:14 PM:
-----------------------------------------------------------------

[~junrao] let me ask a few clarification questions.

# Is it right that the scenarios described here do not affect the cases in which min isr >
1 and unclean leader election is disabled? If min isr is greater than 1 and the leader is
always coming from the latest isr, then the leader can either truncate the followers or have
them fetch the missing log suffix.
# The main goal of the proposal is to have a leader and a follower in a lossy configuration
(e.g. min isr = 1, unclean leader election enabled) converge to a common prefix by choosing
an offset based on a common generation. The chosen generation is the largest generation
that the two replicas have in common. Is that right?
# How do we guarantee that the generation id is unique? By using ZooKeeper versions?
# I think there is a potential race between updating the leader-generation-checkpoint file
and appending the first message of the generation. We might be better off rolling the log
segment file and making the generation part of the log segment file name. This way, when
we start a new generation, we also start a new file, and we know precisely when a message from
that generation has been appended.
# Let's consider a scenario with 3 servers A B C. I'm again assuming that it is ok to have
a single server up to ack requests. Say we have the following execution:

||Generation||A||B||C||
|1| |m1|m1|
| | |m2|m2|
|2|m3| | |
| |m4| | |

Say that now A and B start generation 3. They have no generation in common, so they start
from zero, dropping m1 and m2. Is that right? If later on C joins A and B, then it will also
drop m1 and m2, right? Given that the configuration is lossy, it doesn't seem wrong to do it, as
all we are trying to do is to converge to a consistent state.
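
To make the scenario above concrete, here is a minimal Python sketch of how I read the rule "converge on the largest generation in common". It is not taken from the proposal or from the Kafka code base. The log model (a list of `(generation, message)` pairs) and the names `end_offset_of` and `convergence_offset` are hypothetical, and taking the smaller of the two end offsets for the common generation is my assumption about how the offset would be chosen.

```python
from typing import List, Tuple

# Hypothetical log model: one (generation, message) pair per offset.
Entry = Tuple[int, str]


def end_offset_of(log: List[Entry], generation: int) -> int:
    """Offset just past the last entry written in `generation`."""
    return max(i for i, (g, _) in enumerate(log) if g == generation) + 1


def convergence_offset(a: List[Entry], b: List[Entry]) -> int:
    """Offset up to which two replicas are assumed to share a prefix.

    Assumption: pick the largest generation present in both logs and take
    the smaller of the two end offsets for it; with no generation in
    common, fall back to offset zero.
    """
    common = {g for g, _ in a} & {g for g, _ in b}
    if not common:
        return 0
    g = max(common)
    return min(end_offset_of(a, g), end_offset_of(b, g))


# The execution from the table: B and C hold generation 1, A holds generation 2.
log_a = [(2, "m3"), (2, "m4")]
log_b = [(1, "m1"), (1, "m2")]
log_c = [(1, "m1"), (1, "m2")]

print(convergence_offset(log_a, log_b))  # A and B share no generation
print(convergence_offset(log_b, log_c))  # B and C share generation 1
```

Under these assumptions, A and B converge at offset zero, which is the case in which m1 and m2 are dropped, while B and C on their own would keep both messages.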



> Hold the produce request with ack > 1 in purgatory until replicas' HW has larger than
the produce offset
> --------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1211
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1211
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>             Fix For: 0.11.0.0
>
>
> Today, during leader failover, there is a window of weakness while the followers truncate
their data before fetching from the new leader, i.e., the number of in-sync replicas is just 1.
If the leader also fails during this window, produce requests with ack > 1 that have already
been responded to will still be lost. To avoid this scenario, we would prefer to hold the produce
request in purgatory until the replicas' HW is larger than the produce offset, instead of just their
end-of-log offsets.
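
The holding behaviour described in the issue can be sketched roughly as follows. This is a toy Python model, not Kafka's purgatory implementation. The class `ProducePurgatory`, its methods, and the idea of passing in a single `min_replica_hw` value (the smallest high watermark reported by the replicas that must acknowledge) are all assumptions made for illustration.

```python
import heapq
from typing import List, Tuple


class ProducePurgatory:
    """Toy model: hold produce requests until the replicas' HW passes them."""

    def __init__(self) -> None:
        # Min-heap of (produce offset, request id); hypothetical structure.
        self._pending: List[Tuple[int, str]] = []

    def hold(self, request_id: str, offset: int) -> None:
        """Park a request whose last message was appended at `offset`."""
        heapq.heappush(self._pending, (offset, request_id))

    def on_replica_hw(self, min_replica_hw: int) -> List[str]:
        """Release every request whose offset is below the given HW.

        Assumption: the caller passes the smallest HW among the replicas
        that must acknowledge, rather than their end-of-log offsets.
        """
        released = []
        while self._pending and self._pending[0][0] < min_replica_hw:
            released.append(heapq.heappop(self._pending)[1])
        return released


purgatory = ProducePurgatory()
purgatory.hold("req-1", offset=5)
purgatory.hold("req-2", offset=9)

first = purgatory.on_replica_hw(6)    # HW 6 is larger than offset 5 only
second = purgatory.on_replica_hw(6)   # nothing new to release
third = purgatory.on_replica_hw(10)   # HW 10 is larger than offset 9
print(first, second, third)
```

The difference from the behaviour the issue criticises is the release condition: a request is answered only once the HW has moved past its offset, so a follower that has truncated its log but not yet caught up does not count.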



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
