kafka-dev mailing list archives

From "Joel Koshy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-1499) Broker-side compression configuration
Date Wed, 01 Oct 2014 01:49:34 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154209#comment-14154209 ]

Joel Koshy commented on KAFKA-1499:

If we provide a broker-compression-enabled config: I think the problem with compaction is
less of an issue than forgetting to enable the config. That is, I agree that if an admin forgets
to enable it and a user's topic has a compression.type override, it is confusing to find
messages with some other compression type on the broker. With log compaction, though, I think
that if there are heterogeneous codecs in the log then in a sense all bets are off: we can
pick and choose whatever codec we want (say, the last codec in a batch other than NoCompressionCodec)
and not bother with preserving the retained message's compression codec. Besides, there is
no guarantee that a specific producer's message is the one that will be retained.
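
As a rough illustration of the "last codec other than NoCompressionCodec" idea, here is a minimal Python sketch. It is not Kafka code: the function name pick_batch_codec and the string codec names are hypothetical, and "none" merely stands in for NoCompressionCodec.

```python
from typing import Iterable

# Hypothetical stand-in for NoCompressionCodec; not a Kafka identifier.
NO_COMPRESSION = "none"


def pick_batch_codec(codecs: Iterable[str]) -> str:
    """Return the last codec in the batch that is not NO_COMPRESSION.

    Falls back to NO_COMPRESSION when the batch is empty or every
    message in it is uncompressed.
    """
    chosen = NO_COMPRESSION
    for codec in codecs:
        if codec != NO_COMPRESSION:
            chosen = codec  # a later compressed message overrides an earlier one
    return chosen
```

For a batch whose messages arrived as gzip, uncompressed, snappy, uncompressed, this picks snappy, regardless of which producer's message ends up being retained.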

If we do not provide a broker-compression-enabled config: The main concern I have with this
is that the most likely default is going to be NoCompressionCodec. Most people will forget
to set this when upgrading and end up with uncompressed data, which could be an issue for users
with a lot of data. Even if people have alerts on disk usage and such, there will most likely
be only a moderate margin (with respect to typical alert thresholds), and it may not be an option to just turn
on the config at that point without first doing a difficult (manual) cleanup to free up space.
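
To make the two options concrete, here is a small Python sketch of how the codec used on the broker might be resolved under each one. The semantics are my assumptions, not something the patch confirms: with the enable flag off, the producer's codec is kept as-is (even if the topic has an override); otherwise a per-topic override wins over the broker-wide codec. All function and parameter names are hypothetical.

```python
from typing import Optional

# Hypothetical stand-in for NoCompressionCodec; not a Kafka identifier.
NO_COMPRESSION = "none"


def codec_with_enable_flag(producer_codec: str,
                           broker_enabled: bool,
                           broker_codec: str,
                           topic_override: Optional[str] = None) -> str:
    """Option 1 (assumed semantics): recompress only when the flag is on."""
    if not broker_enabled:
        # Flag forgotten: messages stay in whatever codec the producer used,
        # even if the topic has a compression.type override.
        return producer_codec
    return topic_override if topic_override is not None else broker_codec


def codec_without_enable_flag(broker_codec: str = NO_COMPRESSION,
                              topic_override: Optional[str] = None) -> str:
    """Option 2 (assumed semantics): always recompress; default is uncompressed."""
    return topic_override if topic_override is not None else broker_codec
```

Under these assumptions, codec_with_enable_flag("snappy", False, "gzip", "gzip") keeps snappy (the confusing case in the first option), while codec_without_enable_flag() with nothing set yields uncompressed data (the upgrade risk in the second).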

So I guess we are down to picking the lesser of two evils - I'm not sure which one is less
evil though :)

Anyone have any strong preference/further critique on the pros/cons of one over the other?

> Broker-side compression configuration
> -------------------------------------
>                 Key: KAFKA-1499
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1499
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Joel Koshy
>            Assignee: Manikumar Reddy
>              Labels: newbie++
>             Fix For: 0.8.2
>         Attachments: KAFKA-1499.patch, KAFKA-1499.patch, KAFKA-1499_2014-08-15_14:20:27.patch,
> KAFKA-1499_2014-08-21_21:44:27.patch, KAFKA-1499_2014-09-21_15:57:23.patch, KAFKA-1499_2014-09-23_14:45:38.patch,
> KAFKA-1499_2014-09-24_14:20:33.patch, KAFKA-1499_2014-09-24_14:24:54.patch, KAFKA-1499_2014-09-25_11:05:57.patch
>   Original Estimate: 72h
>  Remaining Estimate: 72h
> A given topic can have messages in mixed compression codecs. i.e., it can
> also have a mix of uncompressed/compressed messages.
> It will be useful to support a broker-side configuration to recompress
> messages to a specific compression codec. i.e., all messages (for all
> topics) on the broker will be compressed to this codec. We could have
> per-topic overrides as well.

This message was sent by Atlassian JIRA
