kafka-dev mailing list archives

From "huxi (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (KAFKA-5155) Messages can be deleted prematurely when some producers use timestamps and some not
Date Wed, 03 May 2017 02:28:04 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994177#comment-15994177 ]

huxi edited comment on KAFKA-5155 at 5/3/17 2:27 AM:
-----------------------------------------------------

This is very similar to a JIRA issue ([KAFKA-4398|https://issues.apache.org/jira/browse/KAFKA-4398])
I reported, which complains that the broker side cannot honor the order of timestamps.

It sounds like you cannot mix messages that carry timestamps with messages that do not, given the current design.


> Messages can be deleted prematurely when some producers use timestamps and some not
> -----------------------------------------------------------------------------------
>
>                 Key: KAFKA-5155
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5155
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.10.2.0
>            Reporter: Petr Plavjaník
>
> Some messages can be deleted prematurely and never read in the following scenario. One producer
> uses timestamps and produces messages that are appended to the beginning of a log segment.
> Another producer produces messages without a timestamp. In that case the largest timestamp of
> the segment is determined by the old messages that carry a timestamp, the new messages without
> a timestamp do not influence it, and the segment containing both old and new messages can be
> deleted immediately after the last new message with no timestamp is appended. When all appended
> messages have no timestamp, they are not deleted, because the {{lastModified}} attribute of the
> {{LogSegment}} is used instead.
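> For context, here is a simplified sketch of the retention logic involved. The names follow the
> 0.10.x {{Log}}/{{LogSegment}} code, but this is an illustration of the behavior, not the exact source:
> {code}
> object RetentionSketch {
>   // If any message in the segment ever carried a timestamp, maxTimestampSoFar
>   // is >= 0 and the file's lastModified time is ignored from then on.
>   def largestTimestamp(maxTimestampSoFar: Long, lastModified: Long): Long =
>     if (maxTimestampSoFar >= 0) maxTimestampSoFar else lastModified
>
>   // Time-based retention: a segment becomes deletable once its largest
>   // timestamp falls outside the retention window.
>   def shouldDelete(now: Long, retentionMs: Long, segmentLargestTimestamp: Long): Boolean =
>     now - segmentLargestTimestamp > retentionMs
> }
> {code}
> With a single old timestamped message in the segment, {{largestTimestamp}} stays pinned to that
> old value no matter how many untimestamped messages follow, so {{shouldDelete}} fires immediately.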
> A new test case for {{kafka.log.LogTest}} that fails:
> {code}
>   @Test
>   def shouldNotDeleteTimeBasedSegmentsWhenTimestampIsNotProvidedForSomeMessages() {
>     val retentionMs = 10000000
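>     // a message with a real timestamp (0), which pins maxTimestampSoFar at 0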
>     val old = TestUtils.singletonRecords("test".getBytes, timestamp = 0)
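>     // magic v0 records carry no timestamp (timestamp = -1)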
>     val set = TestUtils.singletonRecords("test".getBytes, timestamp = -1, magicValue = 0)
>     val log = createLog(set.sizeInBytes, retentionMs = retentionMs)
>     // append some messages to create some segments
>     log.append(old)
>     for (_ <- 0 until 12)
>       log.append(set)
>     assertEquals("No segment should be deleted", 0, log.deleteOldSegments())
>   }
> {code}
> It can be prevented by using {{def largestTimestamp = Math.max(maxTimestampSoFar, lastModified)}}
> in {{LogSegment}}, or by using the current timestamp when messages with timestamp {{-1}} are appended.
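> A minimal sketch of the first proposed fix (illustrative only, not a patch against the real {{LogSegment}}):
> {code}
> object RetentionFixSketch {
>   // Proposed: never report a largest timestamp older than the file's
>   // modification time, so appending untimestamped messages (which touches
>   // the file) keeps the segment inside the retention window.
>   def largestTimestamp(maxTimestampSoFar: Long, lastModified: Long): Long =
>     Math.max(maxTimestampSoFar, lastModified)
> }
> {code}
> With this variant, every append refreshes {{lastModified}} and therefore the value the retention
> check sees, so the segment in the test above would no longer be deleted prematurely.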



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
