Date: Mon, 10 Apr 2017 13:04:41 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: dev@kafka.apache.org
Reply-To: dev@kafka.apache.org
Subject: [jira] [Commented] (KAFKA-5036) Followups from KIP-101

    [ https://issues.apache.org/jira/browse/KAFKA-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962819#comment-15962819 ]

ASF GitHub Bot commented on KAFKA-5036:
---------------------------------------

GitHub user benstopford opened a pull request:

    https://github.com/apache/kafka/pull/2831

    MINOR: KAFKA-5036 (points 2, 5): Refactor caching of Latest Epoch

    This PR covers point (2) and point (5) from KAFKA-5036:

    2. Currently, we update the leader epoch in epochCache after log append in the follower but before log append in the leader. It would be more consistent to always do this after log append, which also avoids issues when the append itself fails. (See the first sketch below.)

    5. The constructor of LeaderEpochFileCache has the following:

       lock synchronized { ListBuffer(checkpoint.read(): _*) }

       But everywhere else a read or write lock is used. We should use consistent locking. (See the second sketch below.)

    This PR refactors the way epochs are cached: rather than caching the latest epoch inside LeaderEpochFileCache, it reuses the value already cached in Partition. There is no functional change.
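For point (2), here is a minimal, self-contained sketch of the intended ordering. The names (EpochAfterAppendSketch, EpochCache, Log, appendAsLeader) are illustrative stand-ins, not the actual Kafka classes:

{code}
// Illustrative stand-ins only -- not the real Kafka code.
object EpochAfterAppendSketch {

  class EpochCache {
    private var entries = List.empty[(Int, Long)] // (epoch, startOffset)
    def assign(epoch: Int, startOffset: Long): Unit =
      if (!entries.headOption.exists(_._1 == epoch))
        entries = (epoch, startOffset) :: entries
    def latestEpoch: Option[Int] = entries.headOption.map(_._1)
  }

  class Log {
    var logEndOffset: Long = 0L
    def append(recordCount: Int): Unit = {
      if (recordCount <= 0) throw new IllegalArgumentException("empty batch")
      logEndOffset += recordCount
    }
  }

  // Previously the leader assigned the epoch *before* appending, so a failed
  // append could leave a cache entry for offsets that never materialised.
  // Assigning after the append (as the follower already did) avoids that.
  def appendAsLeader(log: Log, cache: EpochCache, epoch: Int, n: Int): Unit = {
    log.append(n)                              // may throw; cache untouched
    cache.assign(epoch, log.logEndOffset - n)  // base offset of this batch
  }
}
{code}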
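For point (5), a sketch of the uniform locking pattern using java.util.concurrent.locks directly; CheckpointedCache and readCheckpoint are hypothetical names, not the real LeaderEpochFileCache API:

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock
import scala.collection.mutable.ListBuffer

// Hypothetical cache: every access, including the initial checkpoint load,
// goes through the same read/write lock.
class CheckpointedCache(readCheckpoint: () => Seq[(Int, Long)]) {
  private val lock = new ReentrantReadWriteLock()

  private def inWriteLock[A](body: => A): A = {
    lock.writeLock().lock()
    try body finally lock.writeLock().unlock()
  }

  private def inReadLock[A](body: => A): A = {
    lock.readLock().lock()
    try body finally lock.readLock().unlock()
  }

  // `lock synchronized { ... }` only takes the monitor of the lock *object*,
  // which holders of readLock()/writeLock() never touch, so it excludes
  // nothing; the initial load should take the write lock like any mutation.
  private val epochs: ListBuffer[(Int, Long)] =
    inWriteLock(ListBuffer(readCheckpoint(): _*))

  def assign(epoch: Int, startOffset: Long): Unit =
    inWriteLock(epochs += (epoch -> startOffset))

  def latestEntry: Option[(Int, Long)] =
    inReadLock(epochs.lastOption)
}
{code}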
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/benstopford/kafka KAFKA-5036-part2-second-try

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2831.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2831

----
commit 3e9c130672824070968173b2991a43eb9fa139b6
Author: Ben Stopford
Date:   2017-04-10T12:56:48Z

    KAFKA-5036: Refactor the caching of the latest epoch. The workflow is
    simpler if we reuse the value cached in Partition.

----


> Followups from KIP-101
> ----------------------
>
>                 Key: KAFKA-5036
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5036
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.11.0.0
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>             Fix For: 0.11.0.0
>
>
> 1. It would be safer to hold onto the leader lock in Partition while serving an OffsetForLeaderEpoch request.
> 2. Currently, we update the leader epoch in epochCache after log append in the follower but before log append in the leader. It would be more consistent to always do this after log append, which also avoids issues when the append itself fails.
> 3. OffsetsForLeaderEpochRequest/OffsetsForLeaderEpochResponse: the code that does grouping can probably be replaced by calling CollectionUtils.groupDataByTopic(). Done: https://github.com/apache/kafka/commit/359a68510801a22630a7af275c9935fb2d4c8dbf
> 4. The following line in LeaderEpochFileCache is hit several times when LogTest is executed:
> {code}
> if (cachedLatestEpoch == None)
>   error("Attempt to assign log end offset to epoch before epoch has been set. This should never happen.")
> {code}
> 5. The constructor of LeaderEpochFileCache has the following:
> {code}
> lock synchronized { ListBuffer(checkpoint.read(): _*) }
> {code}
> But everywhere else a read or write lock is used. We should use consistent locking.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)