kafka-dev mailing list archives

From "Jay Kreps (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-727) broker can still expose uncommitted data to a consumer
Date Wed, 23 Jan 2013 05:42:14 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560410#comment-13560410 ]

Jay Kreps commented on KAFKA-727:
---------------------------------

Fantastic catch.

I think another fix is to save the size of the log before translating the HW mark and
use that saved value rather than checking log.sizeInBytes dynamically later in the method.
The saved size will effectively act as a valid lower bound, since anything appended after
it was taken falls outside it.
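
The idea above can be sketched roughly as follows. This is a toy model in Python, not Kafka's Scala code: ToyLog, translate_offset, read_with_snapshot, and the between_steps hook are hypothetical names, offsets are plain list indices, and the list length stands in for log.sizeInBytes.

```python
class ToyLog:
    """Toy append-only log; offsets are list indices. Not Kafka's Log class."""

    def __init__(self, messages):
        self.messages = list(messages)

    def translate_offset(self, offset):
        # Return the position of the offset, or None if it does not exist yet.
        return offset if offset < len(self.messages) else None


def read_with_snapshot(log, start_offset, max_offset, between_steps=lambda: None):
    size_snapshot = len(log.messages)        # saved BEFORE translating the HW
    end = log.translate_offset(max_offset)
    between_steps()                          # a concurrent append may land here
    if end is None:
        end = size_snapshot                  # use the saved size, not the live one
    return log.messages[start_offset:end]


log = ToyLog(["m0", "m1"])                   # both committed, so HW = 2
result = read_with_snapshot(
    log, 0, 2, between_steps=lambda: log.messages.append("m2-uncommitted")
)
print(result)
```

Because the size is captured before the translation, a message appended in the window afterwards cannot be included in the returned range.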

It might also be worthwhile to write a throwaway torture test that has one thread do appends
and another thread do reads, and checks that this condition is not violated, in case there are
any more of these subtleties.
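
A minimal sketch of what such a torture test could look like, again as a Python toy model rather than a test against the real broker. All names here (ToyLog, read_with_snapshot, torture) are hypothetical; each message value is its own offset, and the appender commits each message right after appending it so that the HW repeatedly equals the log end, which is the window the issue describes.

```python
import threading


class ToyLog:
    def __init__(self):
        self.messages = []   # messages[i] == i, so each value is its own offset
        self.hw = 0          # high watermark: offsets below it are committed


def read_with_snapshot(log, start_offset, max_offset):
    size_snapshot = len(log.messages)          # saved before translating
    exists = max_offset < len(log.messages)    # stand-in for translateOffset
    end = max_offset if exists else size_snapshot
    return log.messages[start_offset:end]


def torture(read_fn, n_appends=20000):
    """Run one appender and one reader; return every read that exposed
    an offset at or above the HW the reader observed."""
    log = ToyLog()
    done = threading.Event()
    violations = []

    def appender():
        for offset in range(n_appends):
            log.messages.append(offset)   # appended but not yet committed
            log.hw = offset + 1           # now committed
        done.set()

    def reader():
        while not done.is_set():
            hw = log.hw
            batch = read_fn(log, max(0, hw - 8), hw)
            if any(offset >= hw for offset in batch):
                violations.append((hw, batch))

    threads = [threading.Thread(target=appender), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return violations


violations = torture(read_with_snapshot, n_appends=5000)
print(len(violations))
```

Swapping in a read function that looks up the live size after the translation should make violations show up on some runs, though how often depends on thread scheduling.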

Happy to take this one on since it is my bad.
                
> broker can still expose uncommitted data to a consumer
> ------------------------------------------------------
>
>                 Key: KAFKA-727
>                 URL: https://issues.apache.org/jira/browse/KAFKA-727
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: Jun Rao
>            Priority: Blocker
>
> Even after KAFKA-698 is fixed, consumer clients still occasionally see uncommitted
> data. The following is how this can happen.
> 1. In Log.read(), we pass in startOffset < HW and maxOffset = HW.
> 2. Then we call LogSegment.read(), in which we call translateOffset on the maxOffset.
> The offset doesn't exist and translateOffset returns null.
> 3. Continuing in LogSegment.read(), we then call messageSet.sizeInBytes() to fetch and
> return the data.
> What can happen is that between step 2 and step 3, a new message is appended to the log
> and is not committed yet. Now, we have exposed uncommitted data to the client.
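
The interleaving in the steps above can be reproduced deterministically in a toy model. This is a Python sketch under stated assumptions, not the actual Log or LogSegment code: ToyLog, buggy_read, and the between_steps hook are hypothetical, offsets are list indices, and the list length stands in for messageSet.sizeInBytes().

```python
class ToyLog:
    """Toy append-only log; offsets are list indices. Not Kafka's Log class."""

    def __init__(self, messages):
        self.messages = list(messages)

    def translate_offset(self, offset):
        # Return the position of the offset, or None if it does not exist yet.
        return offset if offset < len(self.messages) else None


def buggy_read(log, start_offset, max_offset, between_steps):
    end = log.translate_offset(max_offset)   # step 2: None when max_offset is the log end
    between_steps()                          # window between step 2 and step 3
    if end is None:
        end = len(log.messages)              # step 3: size is looked up too late
    return log.messages[start_offset:end]


log = ToyLog(["m0", "m1"])                   # both committed, so HW = 2
exposed = buggy_read(
    log, 0, 2, between_steps=lambda: log.messages.append("m2-uncommitted")
)
print(exposed)   # prints ['m0', 'm1', 'm2-uncommitted']
```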
