kafka-dev mailing list archives

From Gwen Shapira <g...@confluent.io>
Subject Re: [DISCUSS] KIP-58 - Make Log Compaction Point Configurable
Date Tue, 17 May 2016 04:47:44 GMT
I see what you mean, Eric.

I was unclear on the specifics of your architecture. It sounds like
you have a table somewhere that maps checkpoints to lists of
<topicPartition, offset>.
In that case it is indeed useful to know that if the checkpoint was
written N ms ago, you will be able to find the exact offsets by
looking at the log.
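
To make that concrete, here is a minimal sketch of how I picture it. Every
name in it (Checkpoint, within_compaction_lag, partitions_at_risk, the
cleaner_point map) is made up for illustration and is not part of Kafka or
of Eric's system. It assumes a checkpoint stores one target offset per
topic-partition using the "next offset to read" convention, and that the
records following each target offset were written no earlier than the
checkpoint itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (topic, partition); a hypothetical stand-in for Kafka's TopicPartition.
TopicPartition = Tuple[str, int]


@dataclass
class Checkpoint:
    written_at_ms: int                   # wall-clock time the checkpoint was recorded
    offsets: Dict[TopicPartition, int]   # target offset (exclusive) per partition


def within_compaction_lag(cp: Checkpoint, now_ms: int,
                          min_compaction_lag_ms: int) -> bool:
    """True if the checkpoint is younger than the configured lag, i.e. the
    proposed guarantee would cover every offset it refers to."""
    return now_ms - cp.written_at_ms < min_compaction_lag_ms


def partitions_at_risk(cp: Checkpoint,
                       cleaner_point: Dict[TopicPartition, int]) -> List[TopicPartition]:
    """Partitions where compaction may already have passed the target offset.

    cleaner_point maps each partition to the first offset the cleaner has NOT
    yet been allowed to compact (an assumed model, not a Kafka API).
    """
    return sorted(tp for tp, target in cp.offsets.items()
                  if cleaner_point.get(tp, 0) > target)
```

With 1000 tables of 8 partitions each, cp.offsets would hold 8000 entries,
and a single entry returned by partitions_at_risk is enough to make the
checkpoint unrecoverable. Without a lag guarantee the only option is that
after-the-fact check; with one, within_compaction_lag answers the question
up front.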

Reading ahead won't really help in that case, since it sounds like the
state is too large to maintain in memory while reading ahead to a
future checkpoint.
(Different from Jay's abstract case in that regard).

Gwen


On Mon, May 16, 2016 at 9:21 PM, Eric Wasserman
<eric.wasserman@gmail.com> wrote:
> Gwen,
>
> For simplicity, the example I gave in the gist is for a single table with a single partition.
> The salient point is that even for a single topic with one partition there is no guarantee,
> without the feature, that one will be able to restore some particular checkpoint, as the offset
> indicated by that checkpoint may have been compacted away.
>
> The practical reality is we are trying to restore the state of a database with nearly
> 1000 tables, each of which has 8 partitions. In this real case there are 8000 offsets indicated
> in each checkpoint. If even a single one of those 8000 is compacted, the checkpointed state
> cannot be reconstructed.
>
> Additionally, we don't really intend to have the consumers of the table topics try to
> keep current. Rather, they will occasionally (say at 1AM each day) try to build the state of
> the database at a recent checkpoint (say from midnight). Suppose it takes a bit of time
> (tens of minutes to hours) to read all the partitions of all the table topics, each up to its
> target offset indicated in the midnight checkpoint. By the time all the consumers have arrived
> at the designated offset, perhaps one of them will have had its target offset compacted away.
> We would then need to select a new target checkpoint, a bit later, with its own offsets for
> each topic and partition. How much later? It might well be around the tens of minutes to hours
> it took to read through to the offsets of the original target checkpoint, as the compaction
> that foiled us may have occurred just before we reached the goal.
>
> Really the issue is that, without the feature, while we could eventually restore
> _some_ consistent state, we couldn't be assured of being able to restore any
> particular (recent) one. My comment about never being assured of the process terminating
> is just acknowledging the perhaps small but nonetheless finite possibility that the process
> of chasing the checkpoints, looking for one for which no partition has yet had its target
> offset compacted away, could continue indefinitely. There is really no condition in which one
> could be absolutely guaranteed this process would terminate.
>
> The feature addresses this by providing a guarantee that _any_ checkpoint can be reconstructed
> as long as it is within the compaction lag. I would love to be convinced that I am in error,
> but short of that I frankly would never turn on compaction for a CDC use case without it.
>
> As to reducing the number of parameters, I personally only see the min.compaction.lag.ms
> as being truly essential. Even the existing ratio setting is secondary in my mind.
>
> Eric
>
>> On May 16, 2016, at 6:42 PM, Gwen Shapira <gwen@confluent.io> wrote:
>>
>> Hi Eric,
>>
>> Thank you for submitting this improvement suggestion.
>>
>> Do you mind clarifying the use-case for me?
>>
>> Looking at your gist: https://gist.github.com/ewasserman/f8c892c2e7a9cf26ee46
>>
>> If my consumer started reading all the CDC topics from the very
>> beginning in which they were created, without ever stopping, it is
>> obviously guaranteed to see every single consistent state of the
>> database.
>> If my consumer joined late (let's say after Tq got clobbered by Tr) it
>> will get a mixed state, but if it continues listening on those
>> topics, always following the logs to their end, it is guaranteed to
>> see a consistent state as soon as a new transaction commits. Am I missing
>> anything?
>>
>> Basically, I do not understand why you claim: "However, to recover all
>> the tables at the same checkpoint, with each independently compacting,
>> one may need to move to an even more recent checkpoint when a
>> different table had the same read issue with the new checkpoint. Thus
>> one could never be assured of this process terminating."
>>
>> I mean, it is true that you need to continuously read forward in order
>> to get to a consistent state, but why can't you be assured of getting
>> there?
>>
>> We are doing something very similar in KafkaConnect, where we need a
>> consistent view of our configuration. We make sure that if the current
>> state is inconsistent (i.e., there is data that is not "committed"
>> yet), we continue reading to the log end until we get to a consistent
>> state.
>>
>> I am not convinced the new functionality is necessary, or even helpful.
>>
>> Gwen
>>
>> On Mon, May 16, 2016 at 4:07 PM, Eric Wasserman
>> <eric.wasserman@gmail.com> wrote:
>>> I would like to begin discussion on KIP-58
>>>
>>> The KIP is here:
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-58+-+Make+Log+Compaction+Point+Configurable
>>>
>>> Jira: https://issues.apache.org/jira/browse/KAFKA-1981
>>>
>>> Pull Request: https://github.com/apache/kafka/pull/1168
>>>
>>> Thanks,
>>>
>>> Eric
>
