kafka-dev mailing list archives

From Dong Lin <lindon...@gmail.com>
Subject Re: [DISCUSS] KIP-68 Add a consumed log retention before log retention
Date Mon, 10 Oct 2016 08:05:53 GMT
Hey David,

Thanks for the reply. Please see my comments inline.

On Mon, Oct 10, 2016 at 12:40 AM, Pengwei (L) <pengwei.li@huawei.com> wrote:

> Hi Dong
>    Thanks for the questions:
>
> 1.  For now we don't distinguish inactive from active groups, because in some
> cases an inactive group may become active again and resume from its previous
> committed offset.
>
> So the consumed retention will not delete a log segment if there are some
> groups that consume but do not commit, but the log segment can still be
> deleted by the force retention.
>

So in the example I provided, the consumed log retention will be
effectively disabled, right? This seems to be a real operational problem:
we don't want log retention to be unintentionally disabled simply
because someone starts a tool to consume from that topic. Either this KIP
should provide a way to handle this, or there should be a way for the operator
to be aware of such a case and be able to re-enable consumed log retention
for the topic. What do you think?
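
To make the concern concrete, here is a minimal sketch of the check as I understand it from the thread. It is not the KIP's actual implementation: the function names and group names are hypothetical, and it assumes a committed offset means "next offset to read" and that a topic with no committed groups is never treated as consumed.

```python
from typing import Dict, Optional


def min_committed_offset(committed: Dict[str, int]) -> Optional[int]:
    """Smallest committed offset across all groups, or None if no group has committed."""
    return min(committed.values()) if committed else None


def segment_fully_consumed(segment_last_offset: int, committed: Dict[str, int]) -> bool:
    """True if every known group has committed past the segment's last offset."""
    floor = min_committed_offset(committed)
    # Assumption: with no committed groups we cannot say the segment is consumed.
    return floor is not None and floor > segment_last_offset


# Two active groups have both read well past a segment ending at offset 999.
groups = {"etl": 5000, "audit": 5200}
print(segment_fully_consumed(999, groups))

# A one-off debugging consumer commits once at offset 10 and never returns.
groups["console-consumer-12345"] = 10
print(segment_fully_consumed(999, groups))
```

Once the abandoned group's offset is in the map, the minimum stays at 10, so no segment beyond it is ever eligible for consumed retention until the force retention takes over.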



> 2.  These configs are used to determine the expiry time of the
> consumed retention, like the parameters of the force retention
> (log.retention.hours, log.retention.minutes, log.retention.ms). For
> example, if users want to keep the log for 3 days, then after 3 days Kafka
> will delete the log segments which have been consumed by all the consumer
> groups.  The log retention thread needs these parameters.
>

It makes sense to have configs such as log.retention.ms, which is used to
make data available for up to a configured amount of time before it is
deleted. My question is what the use case is for keeping the log available
for another, e.g., 3 days after it has been consumed by all consumer groups.
The purpose of this KIP is to allow the log to be deleted as soon as all
interested consumer groups have consumed it. Can you provide a use case for
keeping the log available for a longer time after it has been consumed by all
groups?
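
For reference, the two-timer rule under discussion can be sketched roughly as below. This is an illustration only, not the KIP's code: the function and parameter names are hypothetical, and it assumes segment age is measured the same way for both timers.

```python
DAY_MS = 24 * 60 * 60 * 1000


def should_delete(segment_age_ms: int,
                  fully_consumed: bool,
                  consumed_retention_ms: int,
                  forced_retention_ms: int) -> bool:
    """Decide whether a log segment is eligible for deletion."""
    # Force retention (e.g. log.retention.ms) applies regardless of consumption.
    if segment_age_ms >= forced_retention_ms:
        return True
    # Consumed retention applies only once all groups have read the segment.
    return fully_consumed and segment_age_ms >= consumed_retention_ms


# A fully consumed, 1-day-old segment with 3-day consumed and 7-day force retention.
print(should_delete(1 * DAY_MS, True, 3 * DAY_MS, 7 * DAY_MS))

# The same segment when the consumed retention is set to zero.
print(should_delete(1 * DAY_MS, True, 0, 7 * DAY_MS))
```

With the consumed retention set to zero, the rule reduces to what I am suggesting: delete as soon as all groups have consumed the segment. The question is which use case needs a non-zero value.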


>
> Thanks,
> David
>
>
> > Hey David,
> >
> > Thanks for the KIP. Can you help with the following two questions:
> >
> > 1) If someone starts a consumer (e.g. kafka-console-consumer) to consume a
> > topic for debug/validation purposes, a random consumer group may be created
> > and an offset may be committed for this consumer group. If no offset commit
> > is made for this consumer group in the future, will this effectively
> > disable consumed log retention for this topic? In other words, how does this
> > KIP distinguish active consumer groups from inactive ones?
> >
> > 2) Why do we need new configs such as log.retention.commitoffset.hours? Can
> > we simply delete log segments if consumed log retention is enabled for this
> > topic and all consumer groups have consumed messages in the log segment?
> >
> > Thanks,
> > Dong
> >
> >
> >
> > On Sat, Oct 8, 2016 at 2:15 AM, Pengwei (L) <pengwei.li@huawei.com> wrote:
> >
> > > Hi Becket,
> > >
> > >   Thanks for the feedback:
> > > 1.  We use the simple consumer api to query the commit offset, so we
> > > don't need to specify the consumer group.
> > > 2.  Every broker uses the simple consumer api (OffsetFetchKey) to query
> > > the commit offset in the log retention process.  The client can commit
> > > offset or not.
> > > 3.  It does not need to distinguish the follower brokers or leader
> > > brokers; every broker can query.
> > > 4.  We don't need to change the protocols; we mainly change the log
> > > retention process in the log manager.
> > >
> > >   One concern is that querying the min offset needs O(partitions * groups)
> > > time. An alternative is to build an internal topic to save every
> > > partition's min offset, which can reduce it to O(1).
> > > I will update the wiki for more details.
> > >
> > > Thanks,
> > > David
> > >
> > >
> > > > Hi Pengwei,
> > > >
> > > > Thanks for the KIP proposal. It is a very useful KIP. At a high level,
> > > > the proposed behavior looks reasonable to me.
> > > >
> > > > However, it seems that some of the details are not mentioned in the KIP.
> > > > For example,
> > > >
> > > > 1. How will the expected consumer group be specified? Is it through a
> > > > per topic dynamic configuration?
> > > > 2. How do the brokers detect the consumer offsets? Is it required for a
> > > > consumer to commit offsets?
> > > > 3. How do all the replicas know about the committed offsets? e.g. 1)
> > > > non-coordinator brokers which do not have the committed offsets, 2)
> > > > follower brokers which do not have consumers directly consuming from
> > > > them.
> > > > 4. Are there any other changes that need to be made (e.g. new protocols)
> > > > in addition to the configuration change?
> > > >
> > > > It would be great if you can update the wiki to have more details.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Wed, Sep 7, 2016 at 2:26 AM, Pengwei (L) <pengwei.li@huawei.com> wrote:
> > > >
> > > > > Hi All,
> > > > >    I have made a KIP to enhance the log retention, details as follows:
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-68+Add+a+consumed+log+retention+before+log+retention
> > > > >    Now start a discuss thread for this KIP, looking forward to the
> > > > > feedback.
> > > > >
> > > > > Thanks,
> > > > > David
> > > > >
> > > > >
>
>
