kafka-jira mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-6172) Cache lastEntry in TimeIndex to avoid unnecessary disk access
Date Sun, 05 Nov 2017 07:22:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239428#comment-16239428 ]

ASF GitHub Bot commented on KAFKA-6172:
---------------------------------------

GitHub user lindong28 opened a pull request:

    https://github.com/apache/kafka/pull/4177

    KAFKA-6172; Cache lastEntry in TimeIndex to avoid unnecessary disk access

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lindong28/kafka KAFKA-6172

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/4177.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4177
    
----
commit 6a413ce9f233c3554450e006da885e8435e56502
Author: Dong Lin <lindong28@gmail.com>
Date:   2017-11-05T07:20:35Z

    KAFKA-6172; Cache lastEntry in TimeIndex to avoid unnecessary disk access

----


> Cache lastEntry in TimeIndex to avoid unnecessary disk access
> -------------------------------------------------------------
>
>                 Key: KAFKA-6172
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6172
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>
> LogSegment.close() calls timeIndex.maybeAppend(...), which in turn makes a number of
> calls to timeIndex.lastEntry(). Currently timeIndex.lastEntry() involves a disk seek
> because it reads the last few bytes of the index file on disk.
> This slows down the broker shutdown process.
>
> For a given broker with 6k partitions and 19k segments, we find that LogManager.shutdown()
> takes 15 minutes. The broker is configured to use 10 threads to close logs in parallel. According
> to a thread dump taken while the broker was in LogManager.shutdown(), roughly
> 5 of the 10 threads were in RUNNABLE state at TimeIndex.lastEntry(). This suggests that
> TimeIndex.lastEntry() very likely accounts for much of the shutdown time.
>
> This patch intends to reduce the broker shutdown time by caching the lastEntry in memory
> so that the broker does not always have to read from disk to get the lastEntry.
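
The caching idea described in the issue can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the code from the pull request: the class CachedTimeIndexSketch, its method names, and the 12-byte entry layout used here are hypothetical, and a heap ByteBuffer stands in for the memory-mapped index file.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of an index that caches its last entry; not Kafka's TimeIndex. */
final class CachedTimeIndexSketch {
    // Assumed entry layout for this example: 8-byte timestamp + 4-byte relative offset.
    private static final int ENTRY_SIZE = 12;

    private final ByteBuffer buffer;       // stands in for the memory-mapped index file
    private int entries = 0;
    private long cachedLastTimestamp = -1; // in-memory copy of the last entry
    private int cachedLastOffset = 0;
    private int bufferReads = 0;           // counts reads that would touch the file

    CachedTimeIndexSketch(int maxEntries) {
        this.buffer = ByteBuffer.allocate(maxEntries * ENTRY_SIZE);
    }

    /** Appends only if the timestamp advances; the check uses the cache, not the buffer. */
    void maybeAppend(long timestamp, int relativeOffset) {
        boolean full = entries * ENTRY_SIZE >= buffer.capacity();
        if (full || timestamp <= cachedLastTimestamp) {
            return;
        }
        int position = entries * ENTRY_SIZE;
        buffer.putLong(position, timestamp);
        buffer.putInt(position + 8, relativeOffset);
        entries++;
        // Refresh the cache on every successful append so it never goes stale.
        cachedLastTimestamp = timestamp;
        cachedLastOffset = relativeOffset;
    }

    /** Cached path: no buffer access. */
    long lastTimestamp() {
        return cachedLastTimestamp;
    }

    int lastOffset() {
        return cachedLastOffset;
    }

    /** Uncached path, kept for comparison: reads the last entry back from the buffer. */
    long readLastTimestampFromBuffer() {
        bufferReads++;
        return entries == 0 ? -1 : buffer.getLong((entries - 1) * ENTRY_SIZE);
    }

    int entryCount() {
        return entries;
    }

    int bufferReads() {
        return bufferReads;
    }

    public static void main(String[] args) {
        CachedTimeIndexSketch index = new CachedTimeIndexSketch(4);
        index.maybeAppend(100L, 1);
        index.maybeAppend(200L, 3);
        System.out.println("cached last timestamp: " + index.lastTimestamp());
        System.out.println("buffer reads so far: " + index.bufferReads());
    }
}
```

The point of the sketch is that the cache has to be updated in the same place the entry is written, so repeated lastTimestamp() calls during close never need to go back to the underlying file.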



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
