kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Roesler <j...@confluent.io>
Subject Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics
Date Wed, 17 Jul 2019 18:16:30 GMT
Agreed. I think the names are actually not ambiguous once you recall
that the stats summarize measurements and each measurement is a
floating point number, but there's enough overlap that I also was
initially confused as well. I do plan to make this super clear in the
documentation.

Thanks,
-John

On Wed, Jul 17, 2019 at 1:08 PM Sophie Blee-Goldman <sophie@confluent.io> wrote:
>
> Sounds good to me
>
> By the way, while I agree that we can't really do better than Sum and Count
> I will say I also found the distinction a bit unclear at first glance. We
> should at least document clearly that "Sum" is a "sum of values" whereas
> "Count" is a "number of things" -- but that doesn't need to be part of the
> KIP
>
> On Wed, Jul 17, 2019 at 11:00 AM John Roesler <john@confluent.io> wrote:
>
> > Thanks for the replies.
> >
> > I guess that if we did add (e.g.) ExponentiallyWeightedWindowedX or
> > something, it should still be pretty obvious that WindowedX is the
> > unweighted version? In that case, I buy the argument that we don't
> > need "Simple" and we can just go with:
> >
> > WindowedSum, WindowedCount
> > CumulativeSum, CumulativeCount
> >
> > Sound good?
> > Thanks,
> > -John
> >
> > On Wed, Jul 17, 2019 at 11:53 AM Sophie Blee-Goldman
> > <sophie@confluent.io> wrote:
> > >
> > > Thanks for the crash course in statistical terms :)
> > >
> > > In light of this I definitely support Cumulative{Sum,Count}, but I'm
> > really
> > > not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so much
> > > because of its unfortunate length (although that is unfortunate it
> > > shouldn't be a deciding factor) but because it seems to have the
> > potential
> > > to confuse further. I'm not sure what we gain by adding "Simple" since to
> > > me at least, the unweighted-ness is obvious and the definition of simple
> > is
> > > not. To those who haven't been exposed to the finer details of
> > statistical
> > > definitions, I think they are more likely to read "SimpleXX" and wonder
> > "is
> > > there an 'advanced' or non-simple kind of Windowed?" than they are to
> > > wonder what is the weighting behind these metrics.
> > >
> > > Sophie
> > >
> > > On Wed, Jul 17, 2019 at 8:18 AM John Roesler <john@confluent.io> wrote:
> > >
> > > > Thanks for the discussion, all.
> > > >
> > > > I've done a little more research into the statistical terminology.
> > > > Matthias is correct, "running" and "moving" appear to be synonyms.
> > > > Unfortunately, both can be computed either over a window of the last N
> > > > measurements or over all prior measurements. "Moving" just signifies
> > > > that the statistic is computed over a "live" data set, i.e., a
> > > > continuous stream of measurements, and the expectation is that the
> > > > stat would be updated in response to new measurements.
> > > >
> > > > I found https://en.wikipedia.org/wiki/Moving_average to have a pretty
> > > > good overview of the whole picture.
> > > >
> > > > After considering the discussion so far and some light reading, it
> > > > seems like "Cumulative" is truly the correct term for the all-time
> > > > metrics:
> > > >
> > > > > In a cumulative moving average, the data arrive in an
> > > > > ordered datum stream, and the user would like to get
> > > > > the average of all of the data up until the current datum
> > > > > point.
> > > >
> > > > I know that we previously felt that "cumulative" was too much of a
> > > > mouthful, but it seems like our quest for a terser term led us into a
> > > > briar patch. Also, now there is an independent source (the wiki page)
> > > > indicating that this is indeed the correct term, and it doesn't offer
> > > > any synonyms to choose from. Maybe we can take comfort in the fact
> > > > that we'll rarely be saying the name of the classes out loud.
> > > >
> > > > As far as moving stats that operate over a window of the last N
> > > > measurements, there are multiple options, including Simple
> > > > (unweighted), Weighted, and Exponentially Weighted, and presumably
> > > > infinite variations with other weighting functions. In our domain,
> > > > there is only one weighting function available, but it's still more
> > > > self-documenting and future-proof to specify the type of windowed
> > > > statistic. Therefore, I'm proposing "Simple" as the term for the
> > > > windowed (aka sampled) stats, while keeping Windowed in the name to
> > > > distinguish it from the all-time metrics.
> > > >
> > > > > In financial applications a simple moving average (SMA)
> > > > > is the unweighted mean of the previous n data.
> > > >
> > > > Therefore, we would have the proposed matrix:
> > > >
> > > > SimpleWindowedSum, SimpleWindowedCount
> > > > CumulativeSum, CumulativeCount
> > > >
> > > > Again, all these proposed names are less pithy than we might wish, but
> > > > the whole point of this exercise is to demystify and disambiguate
> > > > them. It seems like the discussion so far illustrates the futility of
> > > > trying to find names that are both short and descriptive.
> > > >
> > > > How does that sound?
> > > > -John
> > > >
> > > > On Tue, Jul 16, 2019 at 4:43 PM Matthias J. Sax <matthias@confluent.io
> > >
> > > > wrote:
> > > > >
> > > > > It's a fair point that Ryanne raises. However, "running sum" is the
> > same
> > > > > as "moving sum" from my understanding.
> > > > >
> > > > > The issue is still, that `Sum` and `Count` which seem to be the
> > cleanest
> > > > > names cannot be used. While I agree that `TotalSum` and `TotalCount`
> > is
> > > > > somewhat redundant, I still think it the best suggestion so far.
> > > > >
> > > > > For the "sampled" version, I am personally fine with either
> > `MovingXxx`,
> > > > > `WindowedXxx`, or `RunningXxx` -- to me, that are all equally good
to
> > > > > describe the semantics.
> > > > >
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > >
> > > > >
> > > > > On 7/16/19 2:25 PM, Sophie Blee-Goldman wrote:
> > > > > > I'm +1 on Windowed, was about to suggest that as I was catching
up
> > on
> > > > the
> > > > > > discussion but Bill beat me to it :)
> > > > > >
> > > > > > On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck <bbejeck@gmail.com>
> > wrote:
> > > > > >
> > > > > >> Hi John,
> > > > > >>
> > > > > >> Thanks for the updates.
> > > > > >>
> > > > > >> I like RunningCount and RunningSum.
> > > > > >>
> > > > > >> What about WindowedCount, WindowedSum instead of Moving?
> > > > > >> I'm just throwing this out there as Windowed seems more
intuitive
> > to
> > > > me,
> > > > > >> but I'm not married to the idea.
> > > > > >>
> > > > > >> -Bill
> > > > > >>
> > > > > >> On Tue, Jul 16, 2019 at 5:09 PM John Roesler <john@confluent.io>
> > > > wrote:
> > > > > >>
> > > > > >>> No worries! Choosing good public API names is a high-impact
> > design
> > > > > >>> activity.
> > > > > >>>
> > > > > >>> Matthias, Bruno, Bill, and Stanislav,
> > > > > >>>
> > > > > >>> You've all contributed to this discussion or the vote
so far...
> > How
> > > > do
> > > > > >>> you feel about the proposed name change:
> > > > > >>>
> > > > > >>> MovingCount, MovingSum   (instead of Sampled)
> > > > > >>> RunningCount, RunningSum   (Instead of Total)
> > > > > >>>
> > > > > >>> Thanks,
> > > > > >>> -John
> > > > > >>>
> > > > > >>> On Tue, Jul 16, 2019 at 3:04 PM Ryanne Dolan <
> > ryannedolan@gmail.com>
> > > > > >>> wrote:
> > > > > >>>>
> > > > > >>>> John, that makes sense to me. Sorry for the bikeshedding.
> > > > > >>>>
> > > > > >>>> Ryanne
> > > > > >>>>
> > > > > >>>> On Tue, Jul 16, 2019 at 12:49 PM John Roesler <
> > john@confluent.io>
> > > > > >> wrote:
> > > > > >>>>
> > > > > >>>>> Thanks for the explanation and the suggestion,
Ryanne,
> > > > > >>>>>
> > > > > >>>>> I went with "sampled" just because these are
instances of
> > > > > >> SampledStat,
> > > > > >>>>> which in the Kafka Metrics ecosystem are computed
from a
> > window of
> > > > > >>>>> recent samples. Thinking more about it, the
fact that they are
> > > > > >> sampled
> > > > > >>>>> and the fact that they are windowed are orthogonal,
which is
> > what
> > > > > >>>>> you're pointing out... sampling by itself doesn't
indicate that
> > > > it's
> > > > > >> a
> > > > > >>>>> moving average.
> > > > > >>>>>
> > > > > >>>>> Since there is no way in Kafka Metrics for a
metric to be
> > sampled
> > > > and
> > > > > >>>>> not windowed/moving/decaying, calling them Sampled
would never
> > be
> > > > > >>>>> incorrect. But to someone unfamiliar with the
code, it wouldn't
> > > > > >>>>> immediately suggest the behavior of the metric
that actually
> > > > matters.
> > > > > >>>>> That is, the behavior that distinguishes the
two classes of
> > metrics
> > > > > >> we
> > > > > >>>>> want to disambiguate here.
> > > > > >>>>>
> > > > > >>>>> It sounds like you'd suggest a new matrix of
names:
> > > > > >>>>> MovingCount, MovingSum
> > > > > >>>>> RunningCount, RunningSum
> > > > > >>>>>
> > > > > >>>>> Are these names unambiguous and self explanatory?
> > > > > >>>>>
> > > > > >>>>> Thanks,
> > > > > >>>>> -John
> > > > > >>>>>
> > > > > >>>>> On Tue, Jul 16, 2019 at 12:32 PM Ryanne Dolan
<
> > > > ryannedolan@gmail.com
> > > > > >>>
> > > > > >>>>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> measurements, which decay/expire over
time
> > > > > >>>>>>
> > > > > >>>>>> Thanks John for the clarification. This
was my (re-)reading
> > of the
> > > > > >>> code,
> > > > > >>>>>> but this is not what I think of when I hear
"sampled". In
> > fact,
> > > > > >>> you'll
> > > > > >>>>>> notice that the Wikipedia pages for "Sample
(statistics)" and
> > > > > >> "Sample
> > > > > >>>>>> (signal processing)" do not contain the
words decay, expire,
> > > > > >> recent,
> > > > > >>>>>> history, or anything similar.
> > > > > >>>>>>
> > > > > >>>>>> Similar to "running", I'd suggest the more
correct "moving",
> > as in
> > > > > >>>>> "moving
> > > > > >>>>>> average" and "moving sum", which involve
looking back N
> > samples,
> > > > > >>>>> applying a
> > > > > >>>>>> sliding window, decaying over time etc.
> > > > > >>>>>>
> > > > > >>>>>> Ryanne
> > > > > >>>>>>
> > > > > >>>>>> On Tue, Jul 16, 2019, 11:58 AM John Roesler
<
> > john@confluent.io>
> > > > > >>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> Thanks for raising this concern, Ryanne,
> > > > > >>>>>>>
> > > > > >>>>>>> "Sampled" indicates that the metrics
is sampled, namely that
> > we
> > > > > >>>>>>> maintain a set of samples from recent
value measurements,
> > which
> > > > > >>>>>>> decay/expire over time. So, the metric
value is only
> > > > > >>> representative of
> > > > > >>>>>>> the recent past.
> > > > > >>>>>>>
> > > > > >>>>>>> "Total" indicates that the metric value
contains all the
> > > > > >>> information
> > > > > >>>>>>> from the creation of the metric. For
example., the total sum
> > > > > >> would
> > > > > >>>>>>> include all measurements since the app
started up.
> > > > > >>>>>>>
> > > > > >>>>>>> It seems like your concern is that the
word "total" doesn't
> > > > > >> really
> > > > > >>>>>>> pinpoint this meaning, which is true.
It's especially
> > confusing
> > > > > >>> that
> > > > > >>>>>>> another meaning of "total" is synonymous
with "sum",
> > rendering
> > > > > >> the
> > > > > >>>>>>> name "TotalSum" sort of absurd.
> > > > > >>>>>>>
> > > > > >>>>>>> We previously considered "cumulative",
which was rejected as
> > a
> > > > > >>>>>>> mouthful (it's four syllables) .
> > > > > >>>>>>>
> > > > > >>>>>>> You mentioned "running", which might
be a more appropriate
> > > > > >> modifier
> > > > > >>>>>>> (RunningSum and RunningCount).
> > > > > >>>>>>>
> > > > > >>>>>>> What would everyone think about that?
> > > > > >>>>>>>
> > > > > >>>>>>> Thanks,
> > > > > >>>>>>> -John
> > > > > >>>>>>>
> > > > > >>>>>>> On Tue, Jul 16, 2019 at 11:27 AM Ryanne
Dolan <
> > > > > >>> ryannedolan@gmail.com>
> > > > > >>>>>>> wrote:
> > > > > >>>>>>>>
> > > > > >>>>>>>> John, I mentioned on the VOTE thread
that the proposed names
> > > > > >> are
> > > > > >>> a
> > > > > >>>>> bit
> > > > > >>>>>>>> confusing,
> > > > > >>>>>>>>
> > > > > >>>>>>>>> given that "sum", "total", and
"count" are roughly
> > > > > >>> synonymous...
> > > > > >>>>>>>>
> > > > > >>>>>>>> In particular, TotalSum is, I think,
a "running total",
> > though
> > > > > >>> the
> > > > > >>>>> naming
> > > > > >>>>>>>> doesn't really capture that.
> > > > > >>>>>>>>
> > > > > >>>>>>>> I think to avoid confusion, we should
define exactly what
> > > > > >>> "total" and
> > > > > >>>>>>>> "sampled" are supposed to indicate,
and perhaps pick
> > > > > >> appropriate
> > > > > >>>>> naming
> > > > > >>>>>>>> from there.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Ryanne
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> On Fri, Jul 12, 2019 at 1:42 PM
John Roesler <
> > > > > >> john@confluent.io>
> > > > > >>>>> wrote:
> > > > > >>>>>>>>
> > > > > >>>>>>>>> Hey, thanks Matthias and Bruno,
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> I agree, "Cumulative" is a mouthful.
"TotalX" sounds fine
> > to
> > > > > >>> me.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Also, yes, I would have liked
to not have any modifier for
> > > > > >>>>>>>>> "non-sampled", but there is
a name conflict with Sum.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> I'll update the KIP to reflect
"TotalX" and then start the
> > > > > >> vote
> > > > > >>>>> thread.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Thanks again,
> > > > > >>>>>>>>> -John
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Fri, Jul 12, 2019 at 11:27
AM Bruno Cadonna <
> > > > > >>> bruno@confluent.io
> > > > > >>>>>>
> > > > > >>>>>>> wrote:
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> OK, makes sense. Then, I
am in favour of TotalCount and
> > > > > >>> TotalSum.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Best,
> > > > > >>>>>>>>>> Bruno
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> On Fri, Jul 12, 2019 at
12:57 AM Matthias J. Sax <
> > > > > >>>>>>> matthias@confluent.io>
> > > > > >>>>>>>>> wrote:
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> `Sum` is an existing
name, for the "sampled sum" metric,
> > > > > >>> that
> > > > > >>>>> gets
> > > > > >>>>>>>>>>> deprecated. Hence, we
cannot use it.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> If we cannot use `Sum`
and use `TotalSum`, we should also
> > > > > >>> not
> > > > > >>>>> use
> > > > > >>>>>>>>>>> `Count` but `TotalCount`
for consistency.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> -Matthias
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> On 7/11/19 12:58 PM,
Bruno Cadonna wrote:
> > > > > >>>>>>>>>>>> Hi John,
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Thank you for the
KIP.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> LGTM
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I also do not like
CumulativeSum/Count so much. I
> > > > > >>> propose to
> > > > > >>>>> just
> > > > > >>>>>>>>> call
> > > > > >>>>>>>>>>>> it Sum and Count.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I understand that
you want to unequivocally distinguish
> > > > > >>> the
> > > > > >>>>> two
> > > > > >>>>>>>>> metric
> > > > > >>>>>>>>>>>> functions by their
names, but I have the feeling the
> > > > > >>> names
> > > > > >>>>> become
> > > > > >>>>>>>>>>>> artificially complex.
The exact semantics can also be
> > > > > >>>>> documented
> > > > > >>>>>>> in
> > > > > >>>>>>>>>>>> the javadocs, which
btw could also be improved in those
> > > > > >>>>> classes.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Best,
> > > > > >>>>>>>>>>>> Bruno
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> On Thu, Jul 11,
2019 at 8:25 PM Matthias J. Sax <
> > > > > >>>>>>>>> matthias@confluent.io> wrote:
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Thanks for the
KIP. Overall LGTM.
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> The only though
I have is, if we may want to use
> > > > > >>> `TotalSum`
> > > > > >>>>> and
> > > > > >>>>>>>>>>>>> `TotalCount`
instead of `CumulativeSum/Count` as
> > > > > >> names?
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> -Matthias
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> On 7/11/19 9:31
AM, John Roesler wrote:
> > > > > >>>>>>>>>>>>>> Hi Kafka
devs,
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> I'd like
to propose KIP-488 as a minor cleanup of
> > > > > >> some
> > > > > >>> of
> > > > > >>>>> our
> > > > > >>>>>>>>> metric
> > > > > >>>>>>>>>>>>>> implementations.
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> KIP-488:
> > > > > >> https://cwiki.apache.org/confluence/x/kkAyBw
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Over time,
iterative updates to these metrics has
> > > > > >>> resulted
> > > > > >>>>> in a
> > > > > >>>>>>>>> pretty
> > > > > >>>>>>>>>>>>>> confusing
little collection of classes, and I've
> > > > > >>> personally
> > > > > >>>>>>> been
> > > > > >>>>>>>>>>>>>> involved
in three separate moderately time-consuming
> > > > > >>>>>>> iterations of
> > > > > >>>>>>>>> me
> > > > > >>>>>>>>>>>>>> or someone
else trying to work out which metrics are
> > > > > >>>>>>> available, and
> > > > > >>>>>>>>>>>>>> which ones
are desired for a given use case. One of
> > > > > >>> these
> > > > > >>>>> was
> > > > > >>>>>>>>> actually
> > > > > >>>>>>>>>>>>>> a long-running
bug in Kafka Streams' metrics, so not
> > > > > >>> only
> > > > > >>>>> has
> > > > > >>>>>>> this
> > > > > >>>>>>>>>>>>>> confusion
been a time sink, but it has also led to
> > > > > >>> bugs.
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> I'm hoping
this change won't be too controversial.
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Thanks,
> > > > > >>>>>>>>>>>>>> -John
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>
> > > > > >>>
> > > > > >>
> > > > > >
> > > > >
> > > >
> >

Mime
View raw message