kafka-dev mailing list archives

From "Eric Wasserman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-1981) Make log compaction point configurable
Date Mon, 09 May 2016 04:24:13 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275920#comment-15275920 ]

Eric Wasserman commented on KAFKA-1981:

That did it. Thanks. I created https://cwiki.apache.org/confluence/display/KAFKA/KIP-58+-+Make+Log+Compaction+Point+Configurable

> Make log compaction point configurable
> --------------------------------------
>                 Key: KAFKA-1981
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1981
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions:
>            Reporter: Jay Kreps
>              Labels: newbie++
>         Attachments: KIP for Kafka Compaction Patch.md
> Currently, if you enable log compaction, the compactor kicks in whenever you hit a
> certain "dirty ratio", i.e. when 50% of your data is uncompacted. In addition, we never
> compact the active segment (since it is still being written to).
> Other than this, we don't give you fine-grained control over when compaction happens.
> The result is that you can't guarantee that a consumer will get every update to a compacted
> topic: if the consumer falls behind a bit, it might just get the compacted version.
> This is usually fine, but it would be nice to make this more configurable so that you could
> set a bound on compaction by number of messages, size, or time.
> This would let you say, for example, "any consumer that is no more than 1 hour behind
> will get every message."
> This should be relatively easy to implement, since it only affects the end point that the
> compactor considers available for compaction. I think we already have that concept, so this
> would just be some additional overrides to add in when calculating it.
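
To make the "end point" idea in the description concrete, here is a minimal sketch of how a time bound could be layered on top of the existing "never compact the active segment" rule. It is not Kafka's actual code: the class CompactionPointSketch, the Segment type, and the parameter minLagMs are all hypothetical names chosen for illustration, and only the time bound is shown (a message-count or size bound would be a similar extra override).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; these are not Kafka's real classes. */
final class CompactionPointSketch {

    /** Stand-in for a log segment: its first offset and the timestamp of its newest message. */
    static final class Segment {
        final long baseOffset;
        final long largestTimestampMs;

        Segment(long baseOffset, long largestTimestampMs) {
            this.baseOffset = baseOffset;
            this.largestTimestampMs = largestTimestampMs;
        }
    }

    /**
     * Returns the first offset the compactor must leave untouched.
     * Assumes segments are ordered by baseOffset and the last one is the active segment.
     */
    static long firstUncleanableOffset(List<Segment> segments, long nowMs, long minLagMs) {
        if (segments.isEmpty()) {
            throw new IllegalArgumentException("at least the active segment is required");
        }
        // Existing rule from the description: the active segment is never compacted.
        long bound = segments.get(segments.size() - 1).baseOffset;

        // Extra override (assumed): stop at the first segment that still holds
        // a message younger than minLagMs.
        for (Segment s : segments) {
            if (nowMs - s.largestTimestampMs < minLagMs) {
                bound = Math.min(bound, s.baseOffset);
                break;
            }
        }
        return bound;
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
            new Segment(0L, 1_000L),
            new Segment(100L, 5_000L),
            new Segment(200L, 9_000L),
            new Segment(300L, 9_500L)); // active segment
        System.out.println(firstUncleanableOffset(segments, 10_000L, 2_000L));
    }
}
```

With minLagMs set to zero the sketch falls back to today's behaviour (everything before the active segment is eligible); a larger value pulls the compaction end point back so that consumers within that lag still see every message.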

This message was sent by Atlassian JIRA
