kafka-dev mailing list archives

From "Guozhang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-1001) Handle follower transition in batch
Date Mon, 21 Oct 2013 20:37:45 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801048#comment-13801048 ]

Guozhang Wang commented on KAFKA-1001:

Updated reviewboard https://reviews.apache.org/r/14730/

> Handle follower transition in batch
> -----------------------------------
>                 Key: KAFKA-1001
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1001
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jay Kreps
>            Assignee: Guozhang Wang
>             Fix For: 0.8.1
>         Attachments: KAFKA-1001_2013-10-21_13:35:41.patch, KAFKA-1001.patch
> In KAFKA-615 we made changes to avoid fsync'ing the active segment of the log on
> log roll while maintaining recovery semantics.
> One downside of the fix for that issue was that it required checkpointing the recovery
> point for the log many times, once for each partition that transitioned to follower state.
> In this ticket I aim to fix that issue by making the following changes:
> 1. Add a new API LogManager.truncateTo(m: Map[TopicAndPartition, Long]). This method
> will first checkpoint the recovery point, then truncate each of the given logs to the given
> offset. This method will have to ensure these two things happen atomically.
> 2. Change ReplicaManager to first stop fetching for all partitions changing to follower
> state, then call LogManager.truncateTo, then complete the existing logic.
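
A minimal sketch of what change 1 could look like, assuming a toy in-memory model. SketchLog, SketchLogManager, checkpointWrites and the checkpointed map are hypothetical stand-ins invented for illustration and are not the real Kafka classes; only the truncateTo(Map[TopicAndPartition, Long]) shape comes from the description above. The sketch reads "checkpoint, then truncate, atomically" as: under one lock, write a single checkpoint whose recovery points are capped at the truncation targets, then truncate each log.

```scala
import scala.collection.mutable

// Hypothetical stand-in for Kafka's TopicAndPartition.
final case class TopicAndPartition(topic: String, partition: Int)

// Toy log: tracks only an end offset and a recovery point.
final class SketchLog(var logEndOffset: Long, var recoveryPoint: Long) {
  def truncateTo(offset: Long): Unit = {
    if (offset < logEndOffset) logEndOffset = offset
    if (offset < recoveryPoint) recoveryPoint = offset
  }
}

final class SketchLogManager(val logs: Map[TopicAndPartition, SketchLog]) {
  private val lock = new Object
  // Counts checkpoint writes so the effect of batching is observable.
  var checkpointWrites: Int = 0
  // Stand-in for the on-disk recovery point checkpoint.
  val checkpointed = mutable.Map.empty[TopicAndPartition, Long]

  def truncateTo(offsets: Map[TopicAndPartition, Long]): Unit = lock.synchronized {
    // One checkpoint for the whole batch: each recovery point is capped
    // at the offset its log is about to be truncated to.
    checkpointWrites += 1
    logs.foreach { case (tp, log) =>
      val target = offsets.getOrElse(tp, Long.MaxValue)
      checkpointed(tp) = math.min(log.recoveryPoint, target)
    }
    // Then truncate each requested log; unknown partitions are skipped.
    offsets.foreach { case (tp, offset) =>
      logs.get(tp).foreach(_.truncateTo(offset))
    }
  }
}
```

With this shape, a batch of N follower transitions costs one checkpoint write instead of N, and holding the same lock across both steps is what keeps a concurrent caller from observing a checkpoint without the matching truncation.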
> We think this will, overall, be a good thing. The reason is that the fetching thread
> currently does something like (a) acquire lock, (b) fetch partitions, (c) write data to logs,
> (d) release lock. Since we currently remove fetchers one at a time, this requires acquiring
> the fetcher lock, and hence generally blocking for half of the read/write cycle, for each partition.
> By doing this in bulk we will avoid reacquiring the lock over and over for each change.
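
To illustrate the locking argument, here is a rough sketch that counts how often the fetcher lock is taken when partitions are removed one at a time versus in bulk. SketchFetcher, removePartition, removePartitions and removalLockAcquisitions are hypothetical names made up for this example and do not reflect the actual fetcher code.

```scala
import java.util.concurrent.locks.ReentrantLock
import scala.collection.mutable

// Hypothetical stand-in for Kafka's TopicAndPartition.
final case class TopicAndPartition(topic: String, partition: Int)

final class SketchFetcher {
  // The same lock the fetch loop would hold across fetch + write.
  private val lock = new ReentrantLock()
  private val partitions = mutable.Set.empty[TopicAndPartition]
  // How many times a removal had to take the fetcher lock.
  var removalLockAcquisitions: Int = 0

  def addPartitions(tps: Iterable[TopicAndPartition]): Unit = {
    lock.lock()
    try { partitions ++= tps } finally { lock.unlock() }
  }

  // One-at-a-time removal: one lock acquisition per partition.
  def removePartition(tp: TopicAndPartition): Unit = {
    lock.lock()
    try {
      removalLockAcquisitions += 1
      partitions -= tp
    } finally { lock.unlock() }
  }

  // Bulk removal: one lock acquisition for the whole batch.
  def removePartitions(tps: Set[TopicAndPartition]): Unit = {
    lock.lock()
    try {
      removalLockAcquisitions += 1
      partitions --= tps
    } finally { lock.unlock() }
  }

  def partitionCount: Int = {
    lock.lock()
    try { partitions.size } finally { lock.unlock() }
  }
}
```

Since each acquisition may have to wait for an in-flight fetch and write to finish, removing N partitions individually can block up to N times, while the bulk call blocks at most once.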

This message was sent by Atlassian JIRA
