kafka-dev mailing list archives

From "Michal Turek (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-3806) Adjust default values of log.retention.hours and offsets.retention.minutes
Date Thu, 09 Jun 2016 07:46:21 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15322097#comment-15322097 ]

Michal Turek commented on KAFKA-3806:
-------------------------------------

Hi Jun and James,

I probably don't see all the assumptions and consequences, but I feel something is wrong here.
I consider the committed offsets to be only tiny metadata for huge log data. Each committed
offset is by nature a single number plus the identification of a topic-partition and of a
consumer group. That is exactly what it was when it was stored in ZooKeeper. The new approach,
storing the offsets by writing to a special Kafka topic, is "only" an implementation detail
:-). The topic may store many subsequent commits, but each offset commit invalidates and
fully overwrites all previous ones. I thought Kafka's log compaction feature (https://cwiki.apache.org/confluence/display/KAFKA/Log+Compaction)
applies here and only the last commit survives compaction, so there is nearly no long-term
storage overhead. Am I wrong? Please correct me if so.
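
To make the assumption above concrete, here is a minimal Python sketch of what I understand
by "only the last commit survives". It is not Kafka code. The OffsetCommit type, the compact()
function and the group/topic names are hypothetical and only model a compacted log keyed by
(group, topic, partition).

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class OffsetCommit(NamedTuple):
    """Hypothetical model of one offset commit record (not a Kafka type)."""
    group: str       # consumer group id
    topic: str
    partition: int
    offset: int      # the committed position


def compact(commits: Iterable[OffsetCommit]) -> Dict[Tuple[str, str, int], int]:
    """Keep only the most recent offset for each (group, topic, partition) key."""
    latest: Dict[Tuple[str, str, int], int] = {}
    for commit in commits:
        # A later commit for the same key fully overwrites the earlier one.
        latest[(commit.group, commit.topic, commit.partition)] = commit.offset
    return latest


# Five commits in log order, but only three distinct keys.
commit_log = [
    OffsetCommit("etl", "events", 0, 100),
    OffsetCommit("etl", "events", 0, 250),
    OffsetCommit("etl", "events", 1, 40),
    OffsetCommit("reports", "events", 0, 10),
    OffsetCommit("etl", "events", 0, 300),
]

compacted = compact(commit_log)
```

If this model is right, the size of the compacted state depends on the number of
(group, topic, partition) keys and not on how often the consumers commit.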

The default log.retention.hours of 168 hours (7 days) is fine, but I would expect the default
for offsets.retention.minutes to be half a year or so. Remember that it is basically only a
reasonably small group of tiny numbers. Can you explain to me why offsets.retention.minutes is
so small, only 1 day? What would the consequences be if we configured it to one month or one
year? Would something go wrong? I feel it's obvious to you, but I don't see it.

I fully agree there should be some TTL expiration for very old "dead" values so that they can
be GC'd and their resources freed. Even tiny metadata may grow over time. But "one day" doesn't
belong to the "very old" category for me at all.

If prolonging offsets.retention.minutes were dangerous, would it be possible to prevent deletion
of the committed offsets while the topic still exists and the consumer group is active
or was active within the offsets.retention.minutes timeout? I don't know the Kafka code, but I
would expect behavior like this to reliably prevent any (meta)data loss.
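
The rule I have in mind could be sketched roughly as follows. This is only an illustration in
Python of the behavior I am asking about, not how Kafka expires offsets today. OffsetEntry,
should_expire() and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OffsetEntry:
    """Hypothetical view of one committed offset and its surroundings."""
    commit_age_minutes: int   # time since this offset was committed
    topic_exists: bool        # the topic has not been deleted
    group_active: bool        # the consumer group currently has live members
    group_idle_minutes: int   # time since the group was last active


def should_expire(entry: OffsetEntry, retention_minutes: int) -> bool:
    """Return True only for offsets that are old AND no longer needed."""
    if entry.commit_age_minutes <= retention_minutes:
        return False  # not old enough to be considered at all
    group_recently_active = (
        entry.group_active or entry.group_idle_minutes <= retention_minutes
    )
    if entry.topic_exists and group_recently_active:
        return False  # topic is alive and the group still cares about it
    return True


ONE_DAY = 1440  # offsets.retention.minutes default mentioned in this issue

# Our scenario: offset committed two days ago, topic exists, consumer still running.
running_consumer = OffsetEntry(2880, topic_exists=True, group_active=True, group_idle_minutes=0)
# A group that disappeared a week ago.
dead_group = OffsetEntry(10080, topic_exists=True, group_active=False, group_idle_minutes=10080)
# A topic that has been deleted.
deleted_topic = OffsetEntry(2880, topic_exists=False, group_active=True, group_idle_minutes=0)
```

With such a rule the offsets of our long-running consumer would be kept, while offsets of
dead groups and deleted topics would still be cleaned up after the timeout.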

Thanks for the explanation!
Michal

> Adjust default values of log.retention.hours and offsets.retention.minutes
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-3806
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3806
>             Project: Kafka
>          Issue Type: Improvement
>          Components: config
>    Affects Versions: 0.9.0.1, 0.10.0.0
>            Reporter: Michal Turek
>            Priority: Minor
>
> The combination of the default values of log.retention.hours (168 hours = 7 days) and
> offsets.retention.minutes (1440 minutes = 1 day) may be dangerous in special cases. Offset
> retention should always be greater than log retention.
> We have observed the following scenario and issue:
> - Producing data to a topic was disabled two days ago by a producer update; the topic wasn't
>   deleted.
> - The consumer consumed all data and properly committed offsets to Kafka.
> - The consumer made no more offset commits for that topic because there was no more incoming
>   data and nothing to confirm. (We have auto-commit disabled; I'm not sure how enabled
>   auto-commit behaves.)
> - After one day: Kafka cleared the expired offsets according to offsets.retention.minutes.
> - After two days: The long-running consumer was restarted after an update. It didn't find
>   any committed offsets for that topic since they had been deleted by offsets.retention.minutes,
>   so it started consuming from the beginning.
> - The messages were still in Kafka due to the larger log.retention.hours; about 5 days of
>   messages were read again.
> Known workaround to solve this issue:
> - Explicitly configure log.retention.hours and offsets.retention.minutes; don't use the defaults.
> Proposals:
> - Prolong the default value of offsets.retention.minutes to be at least twice as large as
>   log.retention.hours.
> - Check these values during Kafka startup and log a warning if offsets.retention.minutes
>   is smaller than log.retention.hours.
> - Add a note to the migration guide about the differences between storing offsets in ZooKeeper
>   and Kafka (http://kafka.apache.org/documentation.html#upgrade).
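
The startup check proposed in the issue description, and the size of the re-read window in
the observed scenario, can be sketched in Python as below. This is an illustration under
assumptions, not Kafka code: check_retention() and replay_window_hours() are hypothetical
helpers, and only the two configuration names and their default values come from the issue.

```python
from typing import Optional


def check_retention(log_retention_hours: int,
                    offsets_retention_minutes: int) -> Optional[str]:
    """Return a warning text if offsets expire before the log data, else None."""
    log_retention_minutes = log_retention_hours * 60
    if offsets_retention_minutes < log_retention_minutes:
        return (
            "offsets.retention.minutes (%d) is smaller than "
            "log.retention.hours (%d h = %d min); a consumer that stops "
            "committing may re-read retained messages after a restart"
            % (offsets_retention_minutes, log_retention_hours, log_retention_minutes)
        )
    return None


def replay_window_hours(log_retention_hours: int,
                        hours_since_last_produce: int) -> int:
    """Hours of messages still retained, and so re-read, once offsets are lost."""
    return max(0, log_retention_hours - hours_since_last_produce)


# Defaults from the issue: 168 hours of log retention, 1440 minutes of offset retention.
default_warning = check_retention(168, 1440)

# First proposal: offsets retention at least twice the log retention (2 * 168 h = 20160 min).
proposed_warning = check_retention(168, 2 * 168 * 60)

# Scenario: producing stopped 48 hours before the consumer restart.
replayed_hours = replay_window_hours(168, 48)
```

With the defaults the check produces a warning, with the proposed doubling it stays silent,
and the scenario gives 120 hours, which matches the "about 5 days" of messages read again.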



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
