kafka-dev mailing list archives

From radai <radai.rosenbl...@gmail.com>
Subject Re: [DISCUSS] KIP-107: Add purgeDataBefore() API in AdminClient
Date Wed, 04 Jan 2017 01:29:15 GMT
the issue with tracking committed offsets is: whose offsets do you track?

1. some topics have multiple groups
2. some "groups" are really one-offs, like developers spinning up a console
consumer "just to see if there's data"
3. there are use cases where you want to deliberately "wipe" data EVEN IF
it's still being consumed

#1 is a configuration mess, since there are multiple possible strategies.
#2 is problematic without a definition of "liveness" or special handling
for the console consumer, and #3 is flat-out impossible with
committed-offset tracking.
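
To make #1 through #3 concrete, here is a minimal sketch in Python. It is
not Kafka code: the group names, the offsets, and both helper functions
are made up for illustration, and it assumes the broker somehow has a
per-partition map of group id to committed offset.

```python
from typing import Dict, Optional

# Hypothetical snapshot for one partition: group id -> committed offset.
CommittedOffsets = Dict[str, int]


def purge_point_min(committed: CommittedOffsets) -> Optional[int]:
    """Strategy A: purge only what every known group has consumed."""
    return min(committed.values()) if committed else None


def purge_point_max(committed: CommittedOffsets) -> Optional[int]:
    """Strategy B: purge as soon as any one group has consumed it."""
    return max(committed.values()) if committed else None


committed = {
    "stream-job": 9_500,
    "audit-job": 9_000,
    "console-consumer-12345": 40,  # one-off group that never comes back
}

# #1: two plausible strategies give very different answers.
# #2: under strategy A the abandoned console consumer pins the purge point.
print(purge_point_min(committed))  # 40
print(purge_point_max(committed))  # 9500

# #3: a deliberate wipe up to an offset nobody has committed yet cannot be
# derived from the committed offsets by either strategy.
requested_purge = 10_000
print(requested_purge > purge_point_max(committed))  # True
```

An explicit purge call sidesteps all three: the caller passes the offset
and the broker never has to guess which groups matter.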

On Tue, Jan 3, 2017 at 3:56 PM, Ewen Cheslack-Postava <ewen@confluent.io>
wrote:

> Dong,
>
> Looks like that's an internal link,
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-107%3A+Add+purgeDataBefore%28%29+API+in+AdminClient
> is the right one.
>
> I have a question about one of the rejected alternatives:
>
> > Using committed offset instead of an extra API to trigger data purge
> > operation.
>
> The KIP says this would be more complicated to implement. Why is that? I
> think brokers would have to consume the entire offsets topic, but the data
> stored in memory doesn't seem to change and applying this when updated
> offsets are seen seems basically the same. It might also be possible to
> make it work even with multiple consumer groups if that was desired
> (although that'd require tracking more data in memory) as a generalization
> without requiring coordination between the consumer groups. Given the
> motivation, I'm assuming this was considered unnecessary since this
> specifically targets intermediate stream processing topics.
>
> Another question is why expose this via AdminClient (which isn't public API
> afaik)? Why not, for example, expose it on the Consumer, which is
> presumably where you'd want access to it since the functionality depends on
> the consumer actually having consumed the data?
>
> -Ewen
>
> On Tue, Jan 3, 2017 at 2:45 PM, Dong Lin <lindong28@gmail.com> wrote:
>
> > Hi all,
> >
> > We created KIP-107 to propose addition of purgeDataBefore() API in
> > AdminClient.
> >
> > Please find the KIP wiki in the link
> > https://iwww.corp.linkedin.com/wiki/cf/display/ENGS/Kafka+purgeDataBefore%28%29+API+design+proposal.
> > We would love to hear your comments and suggestions.
> >
> > Thanks,
> > Dong
> >
>
