kafka-dev mailing list archives

From Damian Guy <damian....@gmail.com>
Subject Re: [DISCUSS] KIP-116 - Add State Store Checkpoint Interval Configuration
Date Thu, 09 Feb 2017 10:03:33 GMT
I've run the SimpleBenchmark with checkpointing on and off to see what the
impact is. It appears that there is very little impact, if any. The numbers
with checkpointing on actually look better, but that is most likely due
to external influences.

In any case, I'm going to suggest we go with a default checkpoint interval
of 5 minutes. I've updated the KIP with this.
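
As a rough illustration of how a 5 minute checkpoint interval interacts with
the commit interval: the KIP text quoted further down this thread says the
actual checkpoint interval will in effect be a multiple of commit.interval.ms.
The Python sketch below is not the Streams implementation. The function name,
the round-up-to-the-next-commit behaviour, and the rejection of values <= 0
(the change discussed later in this thread) are assumptions made for
illustration only.

```python
def effective_checkpoint_interval_ms(checkpoint_interval_ms, commit_interval_ms):
    """Hypothetical helper: round the requested checkpoint interval up to the
    next whole multiple of the commit interval.

    Assumption: a checkpoint can only be written as part of a commit, and
    commits happen exactly every commit_interval_ms.
    """
    if commit_interval_ms <= 0:
        raise ValueError("commit interval must be > 0")
    if checkpoint_interval_ms <= 0:
        # Assumption: follows the suggestion in this thread that the
        # checkpoint interval must be > 0 rather than meaning "disabled".
        raise ValueError("checkpoint interval must be > 0")
    # Integer ceiling division: number of commits between two checkpoints.
    commits_per_checkpoint = -(-checkpoint_interval_ms // commit_interval_ms)
    return commits_per_checkpoint * commit_interval_ms


# 5 minute checkpoint interval with a 10 second commit interval.
print(effective_checkpoint_interval_ms(5 * 60 * 1000, 10 * 1000))
# A checkpoint interval shorter than the commit interval is raised to it.
print(effective_checkpoint_interval_ms(5 * 1000, 10 * 1000))
# A value that is not a whole multiple is rounded up to the next commit.
print(effective_checkpoint_interval_ms(25 * 1000, 10 * 1000))
```

Under these assumptions, a 5 minute setting with a 10 second commit interval
lines up with every 30th commit, and lowering the commit interval does not
change how often a checkpoint is written.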

commit every 10 seconds (no checkpoint)
Streams Performance [records/latency/rec-sec/MB-sec source+store]:
10000000/34798/287372.83751939767/29.570664980746017
Streams Performance [records/latency/rec-sec/MB-sec source+store]:
10000000/35942/278226.0308274442/28.62945857214401
Streams Performance [records/latency/rec-sec/MB-sec source+store]:
10000000/34677/288375.58035585546/29.673847218617528
Streams Performance [records/latency/rec-sec/MB-sec source+store]:
10000000/34677/288375.58035585546/29.673847218617528
Streams Performance [records/latency/rec-sec/MB-sec source+store]:
10000000/31192/320595.02436522185/32.98922800718133


checkpoint every 10 seconds (same as commit interval)
Streams Performance [records/latency/rec-sec/MB-sec source+store]:
10000000/36997/270292.185852907/27.81306592426413
Streams Performance [records/latency/rec-sec/MB-sec source+store]:
10000000/32087/311652.69423754164/32.069062237043035
Streams Performance [records/latency/rec-sec/MB-sec source+store]:
10000000/32895/303997.5680194558/31.281349749202004
Streams Performance [records/latency/rec-sec/MB-sec source+store]:
10000000/33476/298721.4720994145/30.738439479029754
Streams Performance [records/latency/rec-sec/MB-sec source+store]:
10000000/33196/301241.1133871551/30.99771056753826
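
To put the two sets of runs side by side, here is a small Python sketch that
averages the rec-sec field from the result lines above. The field layout is
taken from the header line. Reading the second field as elapsed milliseconds
is an inference from the numbers (records / latency * 1000 matches rec-sec),
not something the benchmark output states. The helper names are made up for
this example.

```python
from statistics import mean

# Result lines copied verbatim from this message. The third and fourth
# "no checkpoint" lines are identical in the original post and are kept
# as posted.
NO_CHECKPOINT = """
10000000/34798/287372.83751939767/29.570664980746017
10000000/35942/278226.0308274442/28.62945857214401
10000000/34677/288375.58035585546/29.673847218617528
10000000/34677/288375.58035585546/29.673847218617528
10000000/31192/320595.02436522185/32.98922800718133
"""

CHECKPOINT_10S = """
10000000/36997/270292.185852907/27.81306592426413
10000000/32087/311652.69423754164/32.069062237043035
10000000/32895/303997.5680194558/31.281349749202004
10000000/33476/298721.4720994145/30.738439479029754
10000000/33196/301241.1133871551/30.99771056753826
"""


def parse_runs(block):
    """Return (records, latency, rec_sec, mb_sec) tuples, one per result line."""
    runs = []
    for line in block.strip().splitlines():
        records, latency, rec_sec, mb_sec = line.split("/")
        runs.append((int(records), int(latency), float(rec_sec), float(mb_sec)))
    return runs


def summarize(block):
    """Mean, minimum and maximum throughput (records per second) for a set of runs."""
    rates = [run[2] for run in parse_runs(block)]
    return {"mean": mean(rates), "min": min(rates), "max": max(rates)}


off = summarize(NO_CHECKPOINT)
on = summarize(CHECKPOINT_10S)
relative_change = (on["mean"] - off["mean"]) / off["mean"]

print(f"no checkpoint : mean {off['mean']:.0f} rec/sec "
      f"(min {off['min']:.0f}, max {off['max']:.0f})")
print(f"checkpoint 10s: mean {on['mean']:.0f} rec/sec "
      f"(min {on['min']:.0f}, max {on['max']:.0f})")
print(f"relative change in mean: {relative_change:+.2%}")
```

The difference between the two means is much smaller than the gap between the
fastest and slowest run within either set, which fits the reading that any
impact is within the noise of five runs.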

On Wed, 8 Feb 2017 at 09:02 Damian Guy <damian.guy@gmail.com> wrote:

> Matthias,
>
> Fair point. I'll update the KIP.
> Thanks
>
> On Wed, 8 Feb 2017 at 05:49 Matthias J. Sax <matthias@confluent.io> wrote:
>
> Damian,
>
> I am not strict about it either. However, if there is no advantage in
> disabling it, we might not want to allow it. This would have the
> advantage of guarding users against accidentally switching it off.
>
> -Matthias
>
>
> On 2/3/17 2:03 AM, Damian Guy wrote:
> > Hi Matthias,
> >
> > It possibly doesn't make sense to disable it, but then I'm sure someone
> > will come up with a reason they don't want it!
> > I'm happy to change it such that the checkpoint interval must be > 0.
> >
> > Cheers,
> > Damian
> >
> > On Fri, 3 Feb 2017 at 01:29 Matthias J. Sax <matthias@confluent.io> wrote:
> >
> >> Thanks Damian.
> >>
> >> One more question: "Checkpointing is disabled if the checkpoint interval
> >> is set to a value <=0."
> >>
> >>
> >> Does it make sense to disable checkpointing? What's the tradeoff here?
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 2/2/17 1:51 AM, Damian Guy wrote:
> >>> Hi Matthias,
> >>>
> >>> Thanks for the comments.
> >>>
> >>> 1. TBD - I need to do some performance tests and try to work out a
> >>> sensible default.
> >>> 2. Yes, you are correct. It could be a multiple of the commit.interval.ms.
> >>> But that would also mean that if you change the commit interval (say you
> >>> lower it), you might also need to change the checkpoint setting (i.e., you
> >>> still only want to checkpoint every n minutes).
> >>>
> >>> On Wed, 1 Feb 2017 at 23:46 Matthias J. Sax <matthias@confluent.io> wrote:
> >>>
> >>>> Thanks for the KIP Damian.
> >>>>
> >>>> I am wondering about two things:
> >>>>
> >>>> 1. what should be the default value for the new parameter?
> >>>> 2. why is the new parameter provided in ms?
> >>>>
> >>>> About (2): because
> >>>>
> >>>> "the minimum checkpoint interval will be the value of
> >>>> commit.interval.ms. In effect the actual checkpoint interval will be a
> >>>> multiple of the commit interval"
> >>>>
> >>>> it might be easier to just use a parameter that is "number-of-commit
> >>>> intervals".
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>> On 2/1/17 7:29 AM, Damian Guy wrote:
> >>>>> Thanks for the comments Eno.
> >>>>> As for exactly once, I don't believe this matters as we are just restoring
> >>>>> the change-log, i.e., the result of the aggregations that previously ran
> >>>>> etc. So once initialized the state store will be in the same state as it
> >>>>> was before.
> >>>>> Having the checkpoint in a Kafka topic is not ideal as the state is per
> >>>>> Kafka Streams instance. So each instance would need to start with a unique
> >>>>> id that is persistent.
> >>>>>
> >>>>> Cheers,
> >>>>> Damian
> >>>>>
> >>>>> On Wed, 1 Feb 2017 at 13:20 Eno Thereska <eno.thereska@gmail.com> wrote:
> >>>>>
> >>>>>> As a follow up to my previous comment, have you thought about writing the
> >>>>>> checkpoint to a topic instead of a local file? That would have the
> >>>>>> advantage that all metadata continues to be managed by Kafka, as well as
> >>>>>> fit with EoS. The potential disadvantage would be higher latency; however,
> >>>>>> if it is periodic as you mention, I'm not sure that would be a
> >>>>>> showstopper.
> >>>>>>
> >>>>>> Thanks
> >>>>>> Eno
> >>>>>>> On 1 Feb 2017, at 12:58, Eno Thereska <eno.thereska@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Thanks Damian, this is a good idea and will reduce the restore time.
> >>>>>>> Looking forward, with exactly once and support for transactions in Kafka, I
> >>>>>>> believe we'll have to add some support for rolling back checkpoints, e.g.,
> >>>>>>> when a transaction is aborted. We need to be aware of that and ideally
> >>>>>>> anticipate those needs a bit in the KIP.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Eno
> >>>>>>>
> >>>>>>>
> >>>>>>>> On 1 Feb 2017, at 10:18, Damian Guy <damian.guy@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> I would like to start the discussion on KIP-116:
> >>>>>>>>
> >>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-116+-+Add+State+Store+Checkpoint+Interval+Configuration
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Damian
