From dev-return-105834-archive-asf-public=cust-asf.ponee.io@kafka.apache.org Wed Jul 17 19:57:46 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by mx-eu-01.ponee.io (Postfix) with SMTP id 487C918060E for ; Wed, 17 Jul 2019 21:57:46 +0200 (CEST) Received: (qmail 81896 invoked by uid 500); 17 Jul 2019 19:57:44 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 81872 invoked by uid 99); 17 Jul 2019 19:57:43 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 17 Jul 2019 19:57:43 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id D4D211A334F for ; Wed, 17 Jul 2019 19:57:42 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -0.199 X-Spam-Level: X-Spam-Status: No, score=-0.199 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=confluent.io Received: from mx1-he-de.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id E9RLjNLgN36Q for ; Wed, 17 Jul 2019 19:57:40 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=2a00:1450:4864:20::431; helo=mail-wr1-x431.google.com; envelope-from=john@confluent.io; receiver= Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by mx1-he-de.apache.org (ASF Mail Server at mx1-he-de.apache.org) with ESMTPS id 2EB537DC5D for ; Wed, 17 Jul 2019 19:57:40 +0000 (UTC) Received: by mail-wr1-x431.google.com with SMTP id f9so26089930wre.12 for ; Wed, 17 Jul 2019 12:57:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=confluent.io; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=45q1FC6tyW3iYFNPyQYSkt2LWZaKfcdk0sdgRkMSX2E=; b=jqxP5deP9JCWsP9NZic5D9EQPyX7fa5yCsOKZi/znK/u/Lj9q2+M24i9rM5Nx09Fz6 erO2uBjE7Tr2t2QUKFqO/KHIIWYgof6U5mvZna5/0lWQwxNWTkHmg/eKKj/ipIskk8d0 UqVLF7STsiZnE9PuCa9cJf5FTPeDH9ZQPqlMEkB6dNj7CA5Aftz617O16lvrRPCjfAKo iBw1xuKir0dikY507FzSSnilOJS/SBDIEIqfrd0RqF+r2r66cD8JEBB37QxihmQWEdUD jFuja3yNjzOxgJzA0RujNJMFONixGGa0Q0dJ9OcOvWdexU0FUXdot3S3AFq3KcWsTw8l ARbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=45q1FC6tyW3iYFNPyQYSkt2LWZaKfcdk0sdgRkMSX2E=; b=YI14atBLUJZtGZ+417fBxF+KSnt7BT8Zkm8/TjsCjUF8Q88CfVaApWc/fYR39XANOG EkBu2vEjXIAXeglyQYfIYOS4Dmv6byHa1bez85JoFEVAvQ1ebLyHGhzMP8oefbnnG3RC BQ1OliRw4wh4hYIdJRDqzrqhnImFJy2JJ74vW+HjNQSMMBm07uLTgDKouIRVQBIKUv/Q X+EHorT2Gw/HIMXNotZapAkWAB3gemyMVcUl5X0KqTLNVkythydpdZynEOhkYYAgMrjM /pvM5Oxd5giFbmpDqep5kOsK2J7uhmfDXzjclreGqkyF4H9Xj/iZB7XNPC7IWYjsDaoG 1KTQ== X-Gm-Message-State: APjAAAUtB80MRhcTDYh2IVO9v30R6+Piu3WqAvm2ew7aWtXqhZYOxoAN vi2tkV0ciBm65IO8xA/r8NEDGJqwKtgJDMIgcyriDB9d X-Google-Smtp-Source: APXvYqyfPQoenoTyXnRrj7xwxj7fQ13QaxQlvSKNphKdlqnmrgGHxkQxI8Af8U75X6vXuGX+3+EBQocpB4EMgb0+qNI= X-Received: by 2002:adf:dd51:: with SMTP id u17mr42712036wrm.218.1563393453393; Wed, 17 Jul 2019 12:57:33 -0700 (PDT) MIME-Version: 1.0 References: <6774e884-7165-0d20-b537-43ddf437657f@confluent.io> In-Reply-To: From: John Roesler Date: Wed, 17 Jul 2019 14:57:22 -0500 Message-ID: Subject: Re: [DISCUSS] KIP-488: Clean up Sum,Count,Total Metrics To: dev@kafka.apache.org Content-Type: text/plain; charset="UTF-8" Cool. I'll update the KIP, then, and we can re-start the vote. (looks like you've already cast a vote :) ) -John On Wed, Jul 17, 2019 at 1:23 PM Ryanne Dolan wrote: > > John, makes sense to me! Thanks. > > Ryanne > > On Wed, Jul 17, 2019, 1:16 PM John Roesler wrote: > > > Agreed. I think the names are actually not ambiguous once you recall > > that the stats summarize measurements and each measurement is a > > floating point number, but there's enough overlap that I also was > > initially confused as well. I do plan to make this super clear in the > > documentation. > > > > Thanks, > > -John > > > > On Wed, Jul 17, 2019 at 1:08 PM Sophie Blee-Goldman > > wrote: > > > > > > Sounds good to me > > > > > > By the way, while I agree that we can't really do better than Sum and > > Count > > > I will say I also found the distinction a bit unclear at first glance. We > > > should at least document clearly that "Sum" is a "sum of values" whereas > > > "Count" is a "number of things" -- but that doesn't need to be part of > > the > > > KIP > > > > > > On Wed, Jul 17, 2019 at 11:00 AM John Roesler wrote: > > > > > > > Thanks for the replies. > > > > > > > > I guess that if we did add (e.g.) ExponentiallyWeightedWindowedX or > > > > something, it should still be pretty obvious that WindowedX is the > > > > unweighted version? In that case, I buy the argument that we don't > > > > need "Simple" and we can just go with: > > > > > > > > WindowedSum, WindowedCount > > > > CumulativeSum, CumulativeCount > > > > > > > > Sound good? > > > > Thanks, > > > > -John > > > > > > > > On Wed, Jul 17, 2019 at 11:53 AM Sophie Blee-Goldman > > > > wrote: > > > > > > > > > > Thanks for the crash course in statistical terms :) > > > > > > > > > > In light of this I definitely support Cumulative{Sum,Count}, but I'm > > > > really > > > > > not crazy about SimpleWindowed{Sum,Count} (vs just Windowed). Not so > > much > > > > > because of its unfortunate length (although that is unfortunate it > > > > > shouldn't be a deciding factor) but because it seems to have the > > > > potential > > > > > to confuse further. I'm not sure what we gain by adding "Simple" > > since to > > > > > me at least, the unweighted-ness is obvious and the definition of > > simple > > > > is > > > > > not. To those who haven't been exposed to the finer details of > > > > statistical > > > > > definitions, I think they are more likely to read "SimpleXX" and > > wonder > > > > "is > > > > > there an 'advanced' or non-simple kind of Windowed?" than they are to > > > > > wonder what is the weighting behind these metrics. > > > > > > > > > > Sophie > > > > > > > > > > On Wed, Jul 17, 2019 at 8:18 AM John Roesler > > wrote: > > > > > > > > > > > Thanks for the discussion, all. > > > > > > > > > > > > I've done a little more research into the statistical terminology. > > > > > > Matthias is correct, "running" and "moving" appear to be synonyms. > > > > > > Unfortunately, both can be computed either over a window of the > > last N > > > > > > measurements or over all prior measurements. "Moving" just > > signifies > > > > > > that the statistic is computed over a "live" data set, i.e., a > > > > > > continuous stream of measurements, and the expectation is that the > > > > > > stat would be updated in response to new measurements. > > > > > > > > > > > > I found https://en.wikipedia.org/wiki/Moving_average to have a > > pretty > > > > > > good overview of the whole picture. > > > > > > > > > > > > After considering the discussion so far and some light reading, it > > > > > > seems like "Cumulative" is truly the correct term for the all-time > > > > > > metrics: > > > > > > > > > > > > > In a cumulative moving average, the data arrive in an > > > > > > > ordered datum stream, and the user would like to get > > > > > > > the average of all of the data up until the current datum > > > > > > > point. > > > > > > > > > > > > I know that we previously felt that "cumulative" was too much of a > > > > > > mouthful, but it seems like our quest for a terser term led us > > into a > > > > > > briar patch. Also, now there is an independent source (the wiki > > page) > > > > > > indicating that this is indeed the correct term, and it doesn't > > offer > > > > > > any synonyms to choose from. Maybe we can take comfort in the fact > > > > > > that we'll rarely be saying the name of the classes out loud. > > > > > > > > > > > > As far as moving stats that operate over a window of the last N > > > > > > measurements, there are multiple options, including Simple > > > > > > (unweighted), Weighted, and Exponentially Weighted, and presumably > > > > > > infinite variations with other weighting functions. In our domain, > > > > > > there is only one weighting function available, but it's still more > > > > > > self-documenting and future-proof to specify the type of windowed > > > > > > statistic. Therefore, I'm proposing "Simple" as the term for the > > > > > > windowed (aka sampled) stats, while keeping Windowed in the name to > > > > > > distinguish it from the all-time metrics. > > > > > > > > > > > > > In financial applications a simple moving average (SMA) > > > > > > > is the unweighted mean of the previous n data. > > > > > > > > > > > > Therefore, we would have the proposed matrix: > > > > > > > > > > > > SimpleWindowedSum, SimpleWindowedCount > > > > > > CumulativeSum, CumulativeCount > > > > > > > > > > > > Again, all these proposed names are less pithy than we might wish, > > but > > > > > > the whole point of this exercise is to demystify and disambiguate > > > > > > them. It seems like the discussion so far illustrates the futility > > of > > > > > > trying to find names that are both short and descriptive. > > > > > > > > > > > > How does that sound? > > > > > > -John > > > > > > > > > > > > On Tue, Jul 16, 2019 at 4:43 PM Matthias J. Sax < > > matthias@confluent.io > > > > > > > > > > > wrote: > > > > > > > > > > > > > > It's a fair point that Ryanne raises. However, "running sum" is > > the > > > > same > > > > > > > as "moving sum" from my understanding. > > > > > > > > > > > > > > The issue is still, that `Sum` and `Count` which seem to be the > > > > cleanest > > > > > > > names cannot be used. While I agree that `TotalSum` and > > `TotalCount` > > > > is > > > > > > > somewhat redundant, I still think it the best suggestion so far. > > > > > > > > > > > > > > For the "sampled" version, I am personally fine with either > > > > `MovingXxx`, > > > > > > > `WindowedXxx`, or `RunningXxx` -- to me, that are all equally > > good to > > > > > > > describe the semantics. > > > > > > > > > > > > > > > > > > > > > > > > > > > > -Matthias > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 7/16/19 2:25 PM, Sophie Blee-Goldman wrote: > > > > > > > > I'm +1 on Windowed, was about to suggest that as I was > > catching up > > > > on > > > > > > the > > > > > > > > discussion but Bill beat me to it :) > > > > > > > > > > > > > > > > On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck > > > > > > wrote: > > > > > > > > > > > > > > > >> Hi John, > > > > > > > >> > > > > > > > >> Thanks for the updates. > > > > > > > >> > > > > > > > >> I like RunningCount and RunningSum. > > > > > > > >> > > > > > > > >> What about WindowedCount, WindowedSum instead of Moving? > > > > > > > >> I'm just throwing this out there as Windowed seems more > > intuitive > > > > to > > > > > > me, > > > > > > > >> but I'm not married to the idea. > > > > > > > >> > > > > > > > >> -Bill > > > > > > > >> > > > > > > > >> On Tue, Jul 16, 2019 at 5:09 PM John Roesler < > > john@confluent.io> > > > > > > wrote: > > > > > > > >> > > > > > > > >>> No worries! Choosing good public API names is a high-impact > > > > design > > > > > > > >>> activity. > > > > > > > >>> > > > > > > > >>> Matthias, Bruno, Bill, and Stanislav, > > > > > > > >>> > > > > > > > >>> You've all contributed to this discussion or the vote so > > far... > > > > How > > > > > > do > > > > > > > >>> you feel about the proposed name change: > > > > > > > >>> > > > > > > > >>> MovingCount, MovingSum (instead of Sampled) > > > > > > > >>> RunningCount, RunningSum (Instead of Total) > > > > > > > >>> > > > > > > > >>> Thanks, > > > > > > > >>> -John > > > > > > > >>> > > > > > > > >>> On Tue, Jul 16, 2019 at 3:04 PM Ryanne Dolan < > > > > ryannedolan@gmail.com> > > > > > > > >>> wrote: > > > > > > > >>>> > > > > > > > >>>> John, that makes sense to me. Sorry for the bikeshedding. > > > > > > > >>>> > > > > > > > >>>> Ryanne > > > > > > > >>>> > > > > > > > >>>> On Tue, Jul 16, 2019 at 12:49 PM John Roesler < > > > > john@confluent.io> > > > > > > > >> wrote: > > > > > > > >>>> > > > > > > > >>>>> Thanks for the explanation and the suggestion, Ryanne, > > > > > > > >>>>> > > > > > > > >>>>> I went with "sampled" just because these are instances of > > > > > > > >> SampledStat, > > > > > > > >>>>> which in the Kafka Metrics ecosystem are computed from a > > > > window of > > > > > > > >>>>> recent samples. Thinking more about it, the fact that they > > are > > > > > > > >> sampled > > > > > > > >>>>> and the fact that they are windowed are orthogonal, which > > is > > > > what > > > > > > > >>>>> you're pointing out... sampling by itself doesn't indicate > > that > > > > > > it's > > > > > > > >> a > > > > > > > >>>>> moving average. > > > > > > > >>>>> > > > > > > > >>>>> Since there is no way in Kafka Metrics for a metric to be > > > > sampled > > > > > > and > > > > > > > >>>>> not windowed/moving/decaying, calling them Sampled would > > never > > > > be > > > > > > > >>>>> incorrect. But to someone unfamiliar with the code, it > > wouldn't > > > > > > > >>>>> immediately suggest the behavior of the metric that > > actually > > > > > > matters. > > > > > > > >>>>> That is, the behavior that distinguishes the two classes of > > > > metrics > > > > > > > >> we > > > > > > > >>>>> want to disambiguate here. > > > > > > > >>>>> > > > > > > > >>>>> It sounds like you'd suggest a new matrix of names: > > > > > > > >>>>> MovingCount, MovingSum > > > > > > > >>>>> RunningCount, RunningSum > > > > > > > >>>>> > > > > > > > >>>>> Are these names unambiguous and self explanatory? > > > > > > > >>>>> > > > > > > > >>>>> Thanks, > > > > > > > >>>>> -John > > > > > > > >>>>> > > > > > > > >>>>> On Tue, Jul 16, 2019 at 12:32 PM Ryanne Dolan < > > > > > > ryannedolan@gmail.com > > > > > > > >>> > > > > > > > >>>>> wrote: > > > > > > > >>>>>> > > > > > > > >>>>>>> measurements, which decay/expire over time > > > > > > > >>>>>> > > > > > > > >>>>>> Thanks John for the clarification. This was my > > (re-)reading > > > > of the > > > > > > > >>> code, > > > > > > > >>>>>> but this is not what I think of when I hear "sampled". In > > > > fact, > > > > > > > >>> you'll > > > > > > > >>>>>> notice that the Wikipedia pages for "Sample (statistics)" > > and > > > > > > > >> "Sample > > > > > > > >>>>>> (signal processing)" do not contain the words decay, > > expire, > > > > > > > >> recent, > > > > > > > >>>>>> history, or anything similar. > > > > > > > >>>>>> > > > > > > > >>>>>> Similar to "running", I'd suggest the more correct > > "moving", > > > > as in > > > > > > > >>>>> "moving > > > > > > > >>>>>> average" and "moving sum", which involve looking back N > > > > samples, > > > > > > > >>>>> applying a > > > > > > > >>>>>> sliding window, decaying over time etc. > > > > > > > >>>>>> > > > > > > > >>>>>> Ryanne > > > > > > > >>>>>> > > > > > > > >>>>>> On Tue, Jul 16, 2019, 11:58 AM John Roesler < > > > > john@confluent.io> > > > > > > > >>> wrote: > > > > > > > >>>>>> > > > > > > > >>>>>>> Thanks for raising this concern, Ryanne, > > > > > > > >>>>>>> > > > > > > > >>>>>>> "Sampled" indicates that the metrics is sampled, namely > > that > > > > we > > > > > > > >>>>>>> maintain a set of samples from recent value measurements, > > > > which > > > > > > > >>>>>>> decay/expire over time. So, the metric value is only > > > > > > > >>> representative of > > > > > > > >>>>>>> the recent past. > > > > > > > >>>>>>> > > > > > > > >>>>>>> "Total" indicates that the metric value contains all the > > > > > > > >>> information > > > > > > > >>>>>>> from the creation of the metric. For example., the total > > sum > > > > > > > >> would > > > > > > > >>>>>>> include all measurements since the app started up. > > > > > > > >>>>>>> > > > > > > > >>>>>>> It seems like your concern is that the word "total" > > doesn't > > > > > > > >> really > > > > > > > >>>>>>> pinpoint this meaning, which is true. It's especially > > > > confusing > > > > > > > >>> that > > > > > > > >>>>>>> another meaning of "total" is synonymous with "sum", > > > > rendering > > > > > > > >> the > > > > > > > >>>>>>> name "TotalSum" sort of absurd. > > > > > > > >>>>>>> > > > > > > > >>>>>>> We previously considered "cumulative", which was > > rejected as > > > > a > > > > > > > >>>>>>> mouthful (it's four syllables) . > > > > > > > >>>>>>> > > > > > > > >>>>>>> You mentioned "running", which might be a more > > appropriate > > > > > > > >> modifier > > > > > > > >>>>>>> (RunningSum and RunningCount). > > > > > > > >>>>>>> > > > > > > > >>>>>>> What would everyone think about that? > > > > > > > >>>>>>> > > > > > > > >>>>>>> Thanks, > > > > > > > >>>>>>> -John > > > > > > > >>>>>>> > > > > > > > >>>>>>> On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan < > > > > > > > >>> ryannedolan@gmail.com> > > > > > > > >>>>>>> wrote: > > > > > > > >>>>>>>> > > > > > > > >>>>>>>> John, I mentioned on the VOTE thread that the proposed > > names > > > > > > > >> are > > > > > > > >>> a > > > > > > > >>>>> bit > > > > > > > >>>>>>>> confusing, > > > > > > > >>>>>>>> > > > > > > > >>>>>>>>> given that "sum", "total", and "count" are roughly > > > > > > > >>> synonymous... > > > > > > > >>>>>>>> > > > > > > > >>>>>>>> In particular, TotalSum is, I think, a "running total", > > > > though > > > > > > > >>> the > > > > > > > >>>>> naming > > > > > > > >>>>>>>> doesn't really capture that. > > > > > > > >>>>>>>> > > > > > > > >>>>>>>> I think to avoid confusion, we should define exactly > > what > > > > > > > >>> "total" and > > > > > > > >>>>>>>> "sampled" are supposed to indicate, and perhaps pick > > > > > > > >> appropriate > > > > > > > >>>>> naming > > > > > > > >>>>>>>> from there. > > > > > > > >>>>>>>> > > > > > > > >>>>>>>> Ryanne > > > > > > > >>>>>>>> > > > > > > > >>>>>>>> > > > > > > > >>>>>>>> On Fri, Jul 12, 2019 at 1:42 PM John Roesler < > > > > > > > >> john@confluent.io> > > > > > > > >>>>> wrote: > > > > > > > >>>>>>>> > > > > > > > >>>>>>>>> Hey, thanks Matthias and Bruno, > > > > > > > >>>>>>>>> > > > > > > > >>>>>>>>> I agree, "Cumulative" is a mouthful. "TotalX" sounds > > fine > > > > to > > > > > > > >>> me. > > > > > > > >>>>>>>>> > > > > > > > >>>>>>>>> Also, yes, I would have liked to not have any modifier > > for > > > > > > > >>>>>>>>> "non-sampled", but there is a name conflict with Sum. > > > > > > > >>>>>>>>> > > > > > > > >>>>>>>>> I'll update the KIP to reflect "TotalX" and then start > > the > > > > > > > >> vote > > > > > > > >>>>> thread. > > > > > > > >>>>>>>>> > > > > > > > >>>>>>>>> Thanks again, > > > > > > > >>>>>>>>> -John > > > > > > > >>>>>>>>> > > > > > > > >>>>>>>>> On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna < > > > > > > > >>> bruno@confluent.io > > > > > > > >>>>>> > > > > > > > >>>>>>> wrote: > > > > > > > >>>>>>>>>> > > > > > > > >>>>>>>>>> OK, makes sense. Then, I am in favour of TotalCount > > and > > > > > > > >>> TotalSum. > > > > > > > >>>>>>>>>> > > > > > > > >>>>>>>>>> Best, > > > > > > > >>>>>>>>>> Bruno > > > > > > > >>>>>>>>>> > > > > > > > >>>>>>>>>> On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax < > > > > > > > >>>>>>> matthias@confluent.io> > > > > > > > >>>>>>>>> wrote: > > > > > > > >>>>>>>>>>> > > > > > > > >>>>>>>>>>> `Sum` is an existing name, for the "sampled sum" > > metric, > > > > > > > >>> that > > > > > > > >>>>> gets > > > > > > > >>>>>>>>>>> deprecated. Hence, we cannot use it. > > > > > > > >>>>>>>>>>> > > > > > > > >>>>>>>>>>> If we cannot use `Sum` and use `TotalSum`, we should > > also > > > > > > > >>> not > > > > > > > >>>>> use > > > > > > > >>>>>>>>>>> `Count` but `TotalCount` for consistency. > > > > > > > >>>>>>>>>>> > > > > > > > >>>>>>>>>>> > > > > > > > >>>>>>>>>>> -Matthias > > > > > > > >>>>>>>>>>> > > > > > > > >>>>>>>>>>> > > > > > > > >>>>>>>>>>> > > > > > > > >>>>>>>>>>> On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > > > > > > >>>>>>>>>>>> Hi John, > > > > > > > >>>>>>>>>>>> > > > > > > > >>>>>>>>>>>> Thank you for the KIP. > > > > > > > >>>>>>>>>>>> > > > > > > > >>>>>>>>>>>> LGTM > > > > > > > >>>>>>>>>>>> > > > > > > > >>>>>>>>>>>> I also do not like CumulativeSum/Count so much. I > > > > > > > >>> propose to > > > > > > > >>>>> just > > > > > > > >>>>>>>>> call > > > > > > > >>>>>>>>>>>> it Sum and Count. > > > > > > > >>>>>>>>>>>> > > > > > > > >>>>>>>>>>>> I understand that you want to unequivocally > > distinguish > > > > > > > >>> the > > > > > > > >>>>> two > > > > > > > >>>>>>>>> metric > > > > > > > >>>>>>>>>>>> functions by their names, but I have the feeling the > > > > > > > >>> names > > > > > > > >>>>> become > > > > > > > >>>>>>>>>>>> artificially complex. The exact semantics can also > > be > > > > > > > >>>>> documented > > > > > > > >>>>>>> in > > > > > > > >>>>>>>>>>>> the javadocs, which btw could also be improved in > > those > > > > > > > >>>>> classes. > > > > > > > >>>>>>>>>>>> > > > > > > > >>>>>>>>>>>> Best, > > > > > > > >>>>>>>>>>>> Bruno > > > > > > > >>>>>>>>>>>> > > > > > > > >>>>>>>>>>>> > > > > > > > >>>>>>>>>>>> > > > > > > > >>>>>>>>>>>> On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax < > > > > > > > >>>>>>>>> matthias@confluent.io> wrote: > > > > > > > >>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>> Thanks for the KIP. Overall LGTM. > > > > > > > >>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>> The only though I have is, if we may want to use > > > > > > > >>> `TotalSum` > > > > > > > >>>>> and > > > > > > > >>>>>>>>>>>>> `TotalCount` instead of `CumulativeSum/Count` as > > > > > > > >> names? > > > > > > > >>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>> -Matthias > > > > > > > >>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>> On 7/11/19 9:31 AM, John Roesler wrote: > > > > > > > >>>>>>>>>>>>>> Hi Kafka devs, > > > > > > > >>>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>>> I'd like to propose KIP-488 as a minor cleanup of > > > > > > > >> some > > > > > > > >>> of > > > > > > > >>>>> our > > > > > > > >>>>>>>>> metric > > > > > > > >>>>>>>>>>>>>> implementations. > > > > > > > >>>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>>> KIP-488: > > > > > > > >> https://cwiki.apache.org/confluence/x/kkAyBw > > > > > > > >>>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>>> Over time, iterative updates to these metrics has > > > > > > > >>> resulted > > > > > > > >>>>> in a > > > > > > > >>>>>>>>> pretty > > > > > > > >>>>>>>>>>>>>> confusing little collection of classes, and I've > > > > > > > >>> personally > > > > > > > >>>>>>> been > > > > > > > >>>>>>>>>>>>>> involved in three separate moderately > > time-consuming > > > > > > > >>>>>>> iterations of > > > > > > > >>>>>>>>> me > > > > > > > >>>>>>>>>>>>>> or someone else trying to work out which metrics > > are > > > > > > > >>>>>>> available, and > > > > > > > >>>>>>>>>>>>>> which ones are desired for a given use case. One > > of > > > > > > > >>> these > > > > > > > >>>>> was > > > > > > > >>>>>>>>> actually > > > > > > > >>>>>>>>>>>>>> a long-running bug in Kafka Streams' metrics, so > > not > > > > > > > >>> only > > > > > > > >>>>> has > > > > > > > >>>>>>> this > > > > > > > >>>>>>>>>>>>>> confusion been a time sink, but it has also led to > > > > > > > >>> bugs. > > > > > > > >>>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>>> I'm hoping this change won't be too controversial. > > > > > > > >>>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>>> Thanks, > > > > > > > >>>>>>>>>>>>>> -John > > > > > > > >>>>>>>>>>>>>> > > > > > > > >>>>>>>>>>>>> > > > > > > > >>>>>>>>>>> > > > > > > > >>>>>>>>> > > > > > > > >>>>>>> > > > > > > > >>>>>>> > > > > > > > >>>>> > > > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > >