kafka-jira mailing list archives

From "Dong Lin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KAFKA-6262) KIP-232: Detect outdated metadata by adding ControllerMetadataEpoch field
Date Sat, 02 Dec 2017 02:29:01 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-6262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dong Lin updated KAFKA-6262:
----------------------------
    Summary: KIP-232: Detect outdated metadata by adding ControllerMetadataEpoch field  (was:
Consumer should not use metadata that is older than the existing metadata)

> KIP-232: Detect outdated metadata by adding ControllerMetadataEpoch field
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-6262
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6262
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>
> Currently, the following sequence of events can cause a consumer to rewind to the
earliest offset even though no log truncation has occurred in Kafka. This is a problem for
MirrorMaker (MM) because it forces MM to lag significantly and duplicate a large amount of data.
> - Say there are three brokers 1,2,3 for a given partition P. Broker 1 is the leader.
Initially they are all in ISR. HW and LEO are both 10.
> - An SRE performs a controlled shutdown of broker 1. The controller sends a LeaderAndIsrRequest
to all three brokers so that leader = broker 2 and isr_set = [broker 2, broker 3].
> - Brokers 2 and 3 receive and process the LeaderAndIsrRequest almost instantaneously. Now
broker 2 and broker 3 can accept ProduceRequest and FetchRequest for partition P.
> However, broker 1 has not yet processed this LeaderAndIsrRequest because of a backlog in its
request queue, so broker 1 still thinks it is the leader for partition P.
> - Because leadership has moved, a consumer receives NotLeaderForPartitionException,
which triggers this consumer to send a MetadataRequest to a randomly selected broker, say broker
2. Broker 2 tells the consumer that it is the leader for partition P. The consumer fetches data
for partition P from broker 2. The latest data has offset 20.
> - Later, this consumer receives NotLeaderForPartitionException for another partition.
It sends a MetadataRequest to a randomly selected broker again. This time the request goes
to broker 1, which tells the consumer that it is the leader for partition P.
> - This consumer issues a FetchRequest for partition P at offset 21. Broker 1 returns
OffsetOutOfRangeException because it thinks the LogEndOffset for this partition is 10.
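
The sequence above can be reproduced with a toy model. This is only a sketch: `BrokerView`
and `fetch` are hypothetical names, not Kafka classes, and the real leader's log end offset
of 21 is an assumption derived from "the latest data has offset 20".

```python
from dataclasses import dataclass


@dataclass
class BrokerView:
    """What one broker believes about partition P (hypothetical model)."""
    broker_id: int
    believed_leader: int
    log_end_offset: int


def fetch(broker: BrokerView, offset: int) -> str:
    """Mimic, very roughly, how a broker answers a fetch for partition P."""
    if broker.believed_leader != broker.broker_id:
        return "NOT_LEADER_FOR_PARTITION"
    if offset > broker.log_end_offset:
        return "OFFSET_OUT_OF_RANGE"
    return "OK"


# Broker 1 has not processed the LeaderAndIsrRequest: it still believes it
# leads P, and its log ends at offset 10.
stale_broker_1 = BrokerView(broker_id=1, believed_leader=1, log_end_offset=10)
# Broker 2 is the real leader and holds data up to offset 20 (assumed LEO 21).
broker_2 = BrokerView(broker_id=2, believed_leader=2, log_end_offset=21)

result_from_leader = fetch(broker_2, 21)       # consumer follows fresh metadata
result_from_stale = fetch(stale_broker_1, 21)  # consumer follows stale metadata
```

The second fetch fails only because the consumer trusted broker 1's outdated view of the
leadership, not because any data was lost.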
> There are two possible solutions for this problem. The long term solution is probably
to include a version in the MetadataResponse so that the consumer knows whether the metadata is
outdated. This requires a KIP.
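
A minimal sketch of how a client might use such a version, assuming the response carries a
single monotonically increasing integer. `MetadataCache`, `update`, and the `epoch` argument
are hypothetical names for illustration; they are not the wire format or client API that the
KIP defines.

```python
class MetadataCache:
    """Client-side metadata cache that refuses to move backwards in time."""

    def __init__(self):
        self.epoch = -1    # no metadata seen yet
        self.leaders = {}  # partition name -> leader broker id

    def update(self, response_epoch, leaders):
        """Apply a metadata response; return False if it is older than the cache."""
        if response_epoch < self.epoch:
            return False   # outdated response, keep the current view
        self.epoch = response_epoch
        self.leaders = dict(leaders)
        return True


cache = MetadataCache()
accepted_new = cache.update(5, {"P": 2})  # fresh view from broker 2
accepted_old = cache.update(4, {"P": 1})  # stale view from broker 1
```

With this check in place, the stale response from broker 1 in the scenario above would be
dropped and the consumer would keep fetching from broker 2.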
> The short term solution, which should solve the problem in most cases, is to let the consumer
keep fetching metadata from the same (initially randomly picked) broker until the connection
to this broker is lost. The metadata version will not go back in time if the consumer
keeps fetching metadata from the same broker.
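
The short term approach can be sketched as a "sticky" broker selector. `StickyMetadataSource`
and its methods are hypothetical and only illustrate the selection policy; they do not
correspond to code in the Kafka client.

```python
import random


class StickyMetadataSource:
    """Pick one broker at random and reuse it until its connection is lost."""

    def __init__(self, broker_ids, rng=None):
        self._broker_ids = list(broker_ids)
        self._rng = rng or random.Random()
        self._current = None

    def pick(self):
        """Return the broker to send the next MetadataRequest to."""
        if self._current is None:
            self._current = self._rng.choice(self._broker_ids)
        return self._current

    def on_disconnect(self, broker_id):
        """Forget the sticky broker so the next pick() chooses again at random."""
        if broker_id == self._current:
            self._current = None


source = StickyMetadataSource([1, 2, 3], random.Random(0))
first = source.pick()
repeated = [source.pick() for _ in range(10)]  # same broker every time
```

Because every request goes to the same broker, the consumer never compares views from two
brokers that have processed different controller requests.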



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
