kafka-dev mailing list archives

From "Ismael Juma (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KAFKA-4298) LogCleaner does not convert compressed message sets properly
Date Fri, 14 Oct 2016 03:58:20 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma updated KAFKA-4298:
-------------------------------
    Description: 
Because of KAFKA-3915, we don't want the log cleaner to convert messages to the format
configured for the topic. However, the cleaner logic for writing compressed messages (used
when some messages in the message set were not retained) writes the topic message format
version into the magic field of the outer message instead of the actual message format. The
choice between absolute and relative offsets for the inner messages is also based on the
topic message format version.

For example, if there is an old compressed message set with magic=0 in the log and the topic
is configured for magic=1, then after cleaning, the new message set will have a wrapper with
magic=1, the nested messages will still have magic=0, but the message offsets will be relative.
If this happens, there does not seem to be an easy way to recover without manually fixing
up the log.
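
To make the example concrete, here is a minimal sketch of the two behaviours in Scala. It is
a simplified model, not the actual LogCleaner code: the `Inner` and `Wrapper` case classes and
both `rewrap*` helpers are hypothetical names, and the model assumes that relative inner
offsets are computed against the first retained message and that the wrapper carries the
absolute offset of the last one.

```scala
object CleanerSketch {
  // Hypothetical, simplified stand-ins for a compressed message set.
  final case class Inner(magic: Int, offset: Long)
  final case class Wrapper(magic: Int, offset: Long, inner: Seq[Inner])

  // Re-wrap the retained messages. Assumption: with magic > 0 the inner offsets
  // become relative to the first retained message; with magic 0 they stay absolute.
  // The wrapper takes the absolute offset of the last retained message.
  private def wrap(wrapperMagic: Int, retained: Seq[Inner]): Wrapper = {
    require(retained.nonEmpty, "at least one message must be retained")
    val first = retained.head.offset
    val inner =
      if (wrapperMagic > 0) retained.map(m => m.copy(offset = m.offset - first))
      else retained
    Wrapper(wrapperMagic, retained.last.offset, inner)
  }

  // Behaviour described in this issue: the topic's configured magic drives both
  // the wrapper magic and the offset style, while inner magic is left untouched.
  def rewrapWithTopicMagic(retained: Seq[Inner], topicMagic: Int): Wrapper =
    wrap(topicMagic, retained)

  // One possible alternative: keep the magic the retained messages already have.
  def rewrapPreservingMagic(retained: Seq[Inner]): Wrapper =
    wrap(retained.head.magic, retained)
}
```

With two retained magic=0 messages at offsets 100 and 102 and a topic configured for magic=1,
`rewrapWithTopicMagic` yields a magic=1 wrapper around magic=0 messages with relative offsets,
which is the mixed state described above. `rewrapPreservingMagic` leaves everything at magic=0.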

The offsets still resolve correctly, because both the clients and the broker use the outer
message format version to decide whether a relative offset needs to be converted to an
absolute offset. So the main problem turns out to be that `ByteBufferMessageSet.deepIterator`
throws an exception if the outer and inner message format versions do not match.

{code}
if (newMessage.magic != wrapperMessage.magic)
  throw new IllegalStateException(s"Compressed message has magic value ${wrapperMessage.magic} " +
    s"but inner message has magic value ${newMessage.magic}")
{code}
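
The two observations above (offsets resolve correctly, but iteration fails) can be sketched
as follows. This is an illustrative model only, not the `ByteBufferMessageSet` implementation:
`InnerMsg`, `WrapperMsg`, `absoluteOffsets` and `checkMagic` are hypothetical names, and the
offset arithmetic assumes the wrapper carries the absolute offset of the last inner message.

```scala
object DeepIterationSketch {
  // Hypothetical, simplified stand-ins for a compressed message set.
  final case class InnerMsg(magic: Int, offset: Long)
  final case class WrapperMsg(magic: Int, offset: Long, inner: Seq[InnerMsg])

  // Resolve absolute offsets by looking only at the wrapper's magic, so the
  // inner magic does not influence the result. Assumes a non-empty inner set.
  def absoluteOffsets(w: WrapperMsg): Seq[Long] =
    if (w.magic > 0) {
      val lastInner = w.inner.last.offset
      // The wrapper offset belongs to the last inner message; shift the rest back.
      w.inner.map(m => w.offset + (m.offset - lastInner))
    } else {
      // With magic 0 the inner offsets are already absolute.
      w.inner.map(_.offset)
    }

  // Same condition as the check quoted above, applied to every inner message.
  def checkMagic(w: WrapperMsg): Unit =
    w.inner.foreach { m =>
      if (m.magic != w.magic)
        throw new IllegalStateException(
          s"Compressed message has magic value ${w.magic} " +
            s"but inner message has magic value ${m.magic}")
    }
}
```

For a magic=1 wrapper at offset 102 holding magic=0 messages with relative offsets 0 and 2,
`absoluteOffsets` returns 100 and 102, while `checkMagic` throws `IllegalStateException`.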

  was: When cleaning the log, we attempt to write the cleaned messages using the message format
configured for the topic, but as far as I can tell, we do not convert the wrapped messages
in compressed message sets. For example, if there is an old compressed message set with magic=0
in the log and the topic is configured for magic=1, then after cleaning, the new message set
will have a wrapper with magic=1, but the nested messages will still have magic=0. If this
happens, there does not seem to be an easy way to recover without manually fixing up the log.


> LogCleaner does not convert compressed message sets properly
> ------------------------------------------------------------
>
>                 Key: KAFKA-4298
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4298
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.0.1
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Critical
>             Fix For: 0.10.1.0, 0.10.0.2
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
