kafka-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5036) Followups from KIP-101
Date Mon, 10 Apr 2017 13:04:41 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962819#comment-15962819 ]

ASF GitHub Bot commented on KAFKA-5036:
---------------------------------------

GitHub user benstopford opened a pull request:

    https://github.com/apache/kafka/pull/2831

    MINOR: KAFKA-5036 (points 2, 5): Refactor caching of Latest Epoch

    This PR covers point (2) and point (5) from KAFKA-5036:
    2. Currently, we update the leader epoch in epochCache after log append in the follower but before log append in the leader. It would be more consistent to always do this after log append. This also avoids issues related to failure in log append.
    5. The constructor of LeaderEpochFileCache has the following:
    lock synchronized { ListBuffer(checkpoint.read(): _*) }
    But everywhere else uses a read or write lock. We should use consistent locking.
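
For illustration, the consistent-locking idea in point (5) could look roughly like the sketch below. It is not the code in this PR: `EpochCacheSketch`, `EpochEntry`, `readCheckpoint` and the `inReadLock`/`inWriteLock` helpers are hypothetical names defined here so the example is self-contained. The point is only that the initial load from the checkpoint takes the same read/write lock that guards every later access, instead of `synchronized`.

```scala
import java.util.concurrent.locks.ReentrantReadWriteLock
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for an (epoch, start offset) entry; not the real Kafka class.
final case class EpochEntry(epoch: Int, startOffset: Long)

// Hypothetical cache; readCheckpoint stands in for checkpoint.read().
class EpochCacheSketch(readCheckpoint: () => Seq[EpochEntry]) {
  private val lock = new ReentrantReadWriteLock()

  // Run body while holding the shared read lock.
  private def inReadLock[T](body: => T): T = {
    val l = lock.readLock()
    l.lock()
    try body finally l.unlock()
  }

  // Run body while holding the exclusive write lock.
  private def inWriteLock[T](body: => T): T = {
    val l = lock.writeLock()
    l.lock()
    try body finally l.unlock()
  }

  // The initial load uses the write lock, like every later mutation,
  // rather than a separate `synchronized` monitor.
  private val epochs: ListBuffer[EpochEntry] =
    inWriteLock { ListBuffer(readCheckpoint(): _*) }

  // Record a new epoch only if it is greater than the latest one seen.
  def assign(epoch: Int, startOffset: Long): Unit = inWriteLock {
    if (epochs.lastOption.forall(_.epoch < epoch))
      epochs += EpochEntry(epoch, startOffset)
  }

  def latestEpoch: Option[Int] = inReadLock { epochs.lastOption.map(_.epoch) }

  def entries: List[EpochEntry] = inReadLock { epochs.toList }
}
```

With a single lock object there is only one locking discipline to reason about; mixing `synchronized` with a read/write lock would guard the same buffer with two unrelated monitors.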
    
    This refactors how epochs are cached: instead of caching the latest epoch separately in LeaderEpochFileCache, the code reuses the value already cached in Partition. There is no functional change.
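
As a rough sketch of the ordering described in point (2), the example below records an epoch's start offset only after the append has succeeded, and takes the epoch from the value held by the partition rather than from a second cached copy. `SimpleLog` and `PartitionSketch` are hypothetical, heavily simplified stand-ins written for this example; they are not the classes touched by this PR, and the empty-batch check merely simulates an append failure.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical in-memory log; stands in for the real log plus its epoch cache.
class SimpleLog {
  private val records = ListBuffer.empty[String]
  // (epoch, first offset written under that epoch)
  private val epochStartOffsets = ListBuffer.empty[(Int, Long)]

  def logEndOffset: Long = records.size.toLong
  def epochs: List[(Int, Long)] = epochStartOffsets.toList

  // Append first; record the epoch only once the append has succeeded.
  def append(batch: Seq[String], leaderEpoch: Int): Long = {
    require(batch.nonEmpty, "empty batch") // simulated append failure
    val firstOffset = logEndOffset
    records ++= batch
    if (epochStartOffsets.lastOption.forall(_._1 < leaderEpoch))
      epochStartOffsets += ((leaderEpoch, firstOffset))
    firstOffset
  }
}

// Hypothetical partition; it owns the only cached copy of the leader epoch.
class PartitionSketch(log: SimpleLog) {
  @volatile private var leaderEpoch: Int = 0

  def makeLeader(newEpoch: Int): Unit = {
    leaderEpoch = newEpoch
  }

  // The epoch is passed along with the append instead of being cached twice.
  def appendAsLeader(batch: Seq[String]): Long = log.append(batch, leaderEpoch)
}
```

Because the epoch entry is written after the records, a failed append leaves no epoch entry behind, which is the failure case point (2) refers to.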


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/benstopford/kafka KAFKA-5036-part2-second-try

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2831.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2831
    
----
commit 3e9c130672824070968173b2991a43eb9fa139b6
Author: Ben Stopford <benstopford@gmail.com>
Date:   2017-04-10T12:56:48Z

    KAFKA-5036: Refactor the caching of the latest epoch. Workflow is simpler if we reuse the value cached in partition.

----


> Followups from KIP-101
> ----------------------
>
>                 Key: KAFKA-5036
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5036
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.11.0.0
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>             Fix For: 0.11.0.0
>
>
> 1. It would be safer to hold onto the leader lock in Partition while serving an OffsetForLeaderEpoch request.
> 2. Currently, we update the leader epoch in epochCache after log append in the follower but before log append in the leader. It would be more consistent to always do this after log append. This also avoids issues related to failure in log append.
> 3. OffsetsForLeaderEpochRequest/OffsetsForLeaderEpochResponse:
> The code that does grouping can probably be replaced by calling CollectionUtils.groupDataByTopic().
> Done: https://github.com/apache/kafka/commit/359a68510801a22630a7af275c9935fb2d4c8dbf
> 4. The following line in LeaderEpochFileCache is hit several times when LogTest is executed:
> {code}
>        if (cachedLatestEpoch == None) error("Attempt to assign log end offset to epoch before epoch has been set. This should never happen.")
> {code}
> 5. The constructor of LeaderEpochFileCache has the following:
> {code}
> lock synchronized { ListBuffer(checkpoint.read(): _*) }
> {code}
> But everywhere else uses a read or write lock. We should use consistent locking.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
