kafka-users mailing list archives

From Anandha L Ranganathan <analog.s...@gmail.com>
Subject Re: Log Retention: What gets deleted
Date Fri, 08 Apr 2016 21:56:13 GMT
Thanks.

I have seen this in our system and would like to understand how log
segment deletion behaves.

How will log segments get deleted when one of the ISR replicas moves to a
new node?
For example, say the ISR for partition-0 is currently {1,2,3}, and for
some reason, after 2 days, the new ISR is {2,3,4}.
Brokers {2,3} will contain log segments with creation date T1 for
partition-0.
Broker {4} has a different log segment creation date, T2, for
partition-0.

Will the deletion of log segments be based on broker {4} or on brokers
{2,3}? We noticed that the latest log segment timestamp applies, and it
sometimes requires more disk space than anticipated.
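
To make the scenario concrete, here is a minimal sketch in Python. It is not Kafka's actual code. It assumes that each broker applies retention to its own local copy of the partition and measures a segment's age from the newest timestamp that copy carries. `Segment`, `deletable`, and all the numbers are hypothetical.

```python
from dataclasses import dataclass

HOUR_MS = 60 * 60 * 1000


@dataclass
class Segment:
    base_offset: int
    # Assumption: the newest timestamp this broker sees for its copy of the segment.
    newest_timestamp_ms: int
    active: bool = False


def deletable(segments, now_ms, retention_ms):
    """Base offsets of inactive segments whose newest timestamp is past retention."""
    return [
        s.base_offset
        for s in segments
        if not s.active and now_ms - s.newest_timestamp_ms > retention_ms
    ]


retention_ms = 168 * HOUR_MS
now_ms = 200 * HOUR_MS

# Broker 2 has held the old segment since T1 (hour 10).
# Broker 4 received its copy two days later, at T2 (hour 58).
broker_2 = [Segment(0, 10 * HOUR_MS), Segment(1000, 190 * HOUR_MS, active=True)]
broker_4 = [Segment(0, 58 * HOUR_MS), Segment(1000, 190 * HOUR_MS, active=True)]

print(deletable(broker_2, now_ms, retention_ms))  # [0]
print(deletable(broker_4, now_ms, retention_ms))  # []
```

Under these assumptions, broker 4 would keep the same data about two days longer than brokers 2 and 3, which would be consistent with the extra disk usage described above.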

On Fri, Apr 8, 2016 at 1:07 PM Gwen Shapira <gwen@confluent.io> wrote:

> Yes. It is whichever is shorter :)
>
> Another clarification:
> A segment is deleted as a whole, based on the newest event in the segment.
> So if the newest event is too recent to delete, the older events in the
> segment will also be kept around.
>
> On Fri, Apr 8, 2016 at 12:52 PM, Anandha L Ranganathan <
> analog.sony@gmail.com> wrote:
>
> > Just a clarification based on Gwen's reply
> >
> > *log.segment.bytes* - by default this property is set to 1 GB.
> > If we haven't set any value for *log.roll.ms*, again by default it is
> > set to 168 hours. In that case, will it roll a new log segment file
> > after every 1 GB?
> >
> > On Fri, Apr 8, 2016 at 11:32 AM Heath Ivie <hivie@autoanything.com> wrote:
> >
> > > Gwen,
> > >
> > > Thanks for the detailed reply.
> > >
> > > That makes it more clear for me.
> > >
> > > Heath
> > >
> > > -----Original Message-----
> > > From: Gwen Shapira [mailto:gwen@confluent.io]
> > > Sent: Tuesday, April 05, 2016 6:13 PM
> > > To: users@kafka.apache.org
> > > Subject: Re: Log Retention: What gets deleted
> > >
> > > I think you got it almost right. The missing part is that we only
> > > delete whole partition segments, not individual messages.
> > >
> > > As you are writing messages, every X bytes or Y milliseconds, a new
> > > file gets created for the partition to store new messages in. Those
> > > files are called segments.
> > > The segment you are currently writing to is an active segment.
> > >
> > > We will never delete an active segment, so in order to delete old
> > > messages we will look for an inactive segment where the newest message
> > > is older than our retention and delete the entire segment.
> > >
> > > So there are several parameters controlling when data will get deleted
> > > (I'm looking at just the time-based, not the size-based):
> > > 1. log.retention.ms - how old messages should be before we consider
> > >    them for deletion
> > > 2. log.roll.ms - how frequently we roll new segments. Messages will
> > >    not get deleted before a new segment is rolled
> > > 3. log.retention.check.interval.ms - how frequently we check for
> > >    segments that we can delete.
> > >
> > > A message will be deleted if all 3 are true:
> > > 1. It is older than log.retention.ms
> > > 2. It is in an inactive segment, meaning enough time passed since the
> > >    message was written to roll a new segment
> > > 3. Kafka checked for segments that can be deleted, meaning that more
> > >    than check.interval.ms time passed since the segment was rolled.
> > >
> > > Hope this helps,
> > >
> > > Gwen
> > >
> > >
> > >
> > > On Fri, Apr 1, 2016 at 12:21 PM, Heath Ivie <hivie@autoanything.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I have some questions about the log retention and specifically what
> > > > gets deleted.
> > > >
> > > > I have a test app where I am writing 10 logs to the topic every
> > > > second.
> > > >
> > > > What I would expect is that the lag in a group would be somewhere
> > > > around 10 if I have retention.ms at 1000.
> > > >
> > > > What I am seeing is that the lag continues to grow, but then at some
> > > > point all messages are gone and the lag is at 0.
> > > >
> > > > I thought that the messages that are old would be deleted first.
> > > >
> > > > Am I misinterpreting how the log retention works?
> > > >
> > > > Heath Ivie
> > > > Solutions Architect
> > > >
> > >
> >
>
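
For reference, the "whichever is shorter" rolling rule from the quoted replies can be sketched as follows. This is an illustration in Python, not Kafka's implementation, and it simplifies the size check to a plain comparison. `should_roll` is a hypothetical helper, and its defaults are the 1 GB and 168 hours mentioned in the thread.

```python
GIB = 1024 ** 3
HOUR_MS = 60 * 60 * 1000


def should_roll(segment_bytes, segment_age_ms,
                max_bytes=GIB, roll_ms=168 * HOUR_MS):
    """Roll a new segment as soon as either the size or the time limit is reached."""
    return segment_bytes >= max_bytes or segment_age_ms >= roll_ms


print(should_roll(GIB, 1 * HOUR_MS))           # size limit reached first
print(should_roll(10 * 1024, 168 * HOUR_MS))   # time limit reached first
print(should_roll(10 * 1024, 1 * HOUR_MS))     # neither limit reached yet
```

So with the defaults, a busy partition would roll on size well before 168 hours, while a quiet one would roll on time.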
