kafka-dev mailing list archives

From "Jun Rao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4099) Change the time based log rolling to only based on the message timestamp.
Date Fri, 21 Oct 2016 05:15:59 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15594095#comment-15594095
] 

Jun Rao commented on KAFKA-4099:
--------------------------------

[~becket_qin], Ewen brought up another example that may still lead to undesirable log rolling
behavior. Suppose that you have 2 producers, one producing data with the current timestamp
and another producing data with timestamps that are 7 days old (e.g., if some data is delayed
or some old data is replayed). This will still cause the log segments to roll frequently. This
may not be common, but it can definitely happen. So, it seems we will still need to improve how
the log rolls.

> Change the time based log rolling to only based on the message timestamp.
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-4099
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4099
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>             Fix For: 0.10.1.0
>
>
> This is an issue introduced in KAFKA-3163. When partition relocation occurs, the newly
> created replica may have messages with old timestamps, causing the log segment to roll for
> each message. The fix is to base log rolling only on the message timestamp
> when the messages are in message format 0.10.0 or above. If the first message in the segment
> does not have a timestamp, we fall back to the wall clock time for log rolling.
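
The rule in the description above can be sketched roughly as follows. This is only an illustration, not Kafka's actual code: the names `Segment`, `should_roll`, and `roll_ms` are hypothetical, and the sketch assumes that "time waited" is measured as the incoming message's timestamp minus the timestamp of the first message in the segment.

```python
from dataclasses import dataclass
from typing import Optional

DAY_MS = 24 * 60 * 60 * 1000


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment (not a Kafka class)."""
    created_ms: int                           # wall-clock time the segment was created
    first_timestamp_ms: Optional[int] = None  # None if the first message has no timestamp


def should_roll(segment: Segment, message_timestamp_ms: int,
                now_ms: int, roll_ms: int) -> bool:
    """Decide whether appending a message should roll the segment.

    Assumption: the time waited is the incoming message timestamp minus
    the first message timestamp. If the first message carries no
    timestamp, fall back to wall-clock time since segment creation.
    """
    if segment.first_timestamp_ms is not None:
        waited_ms = message_timestamp_ms - segment.first_timestamp_ms
    else:
        waited_ms = now_ms - segment.created_ms
    return waited_ms > roll_ms


now = 1_477_026_959_000  # arbitrary wall-clock time in milliseconds

# A segment created just now whose first message is 8 days old.
stale_start = Segment(created_ms=now, first_timestamp_ms=now - 8 * DAY_MS)
rolls_on_current_message = should_roll(stale_start, now, now, 7 * DAY_MS)

# A segment whose first message has no timestamp uses the wall clock instead.
no_timestamp = Segment(created_ms=now - 8 * DAY_MS)
rolls_on_wall_clock = should_roll(no_timestamp, now, now, 7 * DAY_MS)
```

Under these assumptions, a segment that starts with an old message rolls as soon as a message with a current timestamp arrives, even if the segment itself was only just created.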



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
