kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Cyrus Vafadari <cy...@confluent.io>
Subject Re: [VOTE] KIP-475: New Metric to Measure Number of Tasks on a Connector
Date Wed, 03 Jul 2019 16:33:54 GMT
Thanks for the input, everyone! I'm marking this KIP as accepted with
3 binding votes (Gwen, Guozhang, Randall)
3 nonbinding votes (Andrew, Konstantine, Ryanne)

The PR is updated at https://github.com/apache/kafka/pull/6843/files


On Sun, Jun 23, 2019 at 10:52 AM Andrew Schofield <andrew_schofield@live.com>
wrote:

> +1 (non-binding). Nice KIP.
>
> ´╗┐On 23/06/2019, 17:27, "Gwen Shapira" <gwen@confluent.io> wrote:
>
>     +1 from me too
>
>     On Tue, Jun 18, 2019, 10:30 AM Guozhang Wang <wangguoz@gmail.com>
> wrote:
>
>     > Cyrus, thanks for the updates, +1.
>     >
>     > On Mon, Jun 17, 2019 at 10:58 PM Cyrus Vafadari <cyrus@confluent.io>
>     > wrote:
>     >
>     > > Many thanks for the feedback. Per Gwen's suggestion, I've updated
> the KIP
>     > > to specify that the task count will be per-worker (no additional
> MBean
>     > tag,
>     > > since each process is a worker) and per-connector (MBean tag).
>     > >
>     > > On Mon, Jun 17, 2019 at 8:24 PM Cyrus Vafadari <cyrus@confluent.io
> >
>     > wrote:
>     > >
>     > > > I meant to write:
>     > > > I've also updated the KIP to clarify that every task must have
> exactly
>     > > one
>     > > > non-null *status* at all times.
>     > > >
>     > > > On Mon, Jun 17, 2019 at 6:55 PM Cyrus Vafadari <
> cyrus@confluent.io>
>     > > wrote:
>     > > >
>     > > >> Guozhang,
>     > > >>
>     > > >> Both of Kafka's implementations of "StatusBackingStore"
> immediately
>     > > >> delete the task from the backign store when you try to set it
to
>     > > DESTROYED,
>     > > >> so we'd actually expect it to always be zero. A nonzero number
> of
>     > > destroyed
>     > > >> tasks would either indicate a new implementation of
>     > StatusBackingStore,
>     > > or
>     > > >> a malfunctioning StatusBackingStore (e.g. caches out of sync
> with
>     > > compacted
>     > > >> topic). This metric will usually be uninteresting, and was only
>     > included
>     > > >> for completeness. It could possibly catch a bug.
>     > > >>
>     > > >> Gwen,
>     > > >> I had not considered this option. I agree there is an advantage
> to
>     > > having
>     > > >> more granular data about both connector and worker. The main
>     > > disadvantage
>     > > >> would be that it increases the number of metrics by a factor of
>     > > >> num_workers, but I'd say this is an acceptable tradeoff. Another
>     > > advantage
>     > > >> of your suggestion is that the public interfaces for
> WorkerConnector
>     > > would
>     > > >> be unchanged, and the new metrics can be added within the Worker
>     > class.
>     > > >>
>     > > >> I've also updated the KIP to clarify that every task must have
> exactly
>     > > >> one non-null task at all times.
>     > > >>
>     > > >> On Mon, Jun 17, 2019 at 1:41 PM Guozhang Wang <
> wangguoz@gmail.com>
>     > > wrote:
>     > > >>
>     > > >>> Hello Cyrus,
>     > > >>>
>     > > >>> Thanks for the KIP. I just have one nit question about Connect
>     > > destroyed
>     > > >>> tasks: is it an ever-increasing number? If yes, the
> corresponding
>     > > metric
>     > > >>> value would be increasing indefinitely as well. Is that
> intentional?
>     > > >>>
>     > > >>> Otherwise, lgtm.
>     > > >>>
>     > > >>>
>     > > >>> Guozhang
>     > > >>>
>     > > >>> On Mon, Jun 17, 2019 at 1:14 PM Gwen Shapira <
> gwen@confluent.io>
>     > > wrote:
>     > > >>>
>     > > >>> > Sorry to join so late, but did we consider a single set
of
>     > task-count
>     > > >>> > metrics and using tags to scope each data point to a
specific
>     > > >>> > connector and worker (and in the future perhaps also
user)?
>     > > >>> >
>     > > >>> > It will make analysis of the data easier - someone may
want
> to
>     > > >>> > breakdown tasks by both worker and connector to detect
> imbalanced
>     > > >>> > assignments.
>     > > >>> >
>     > > >>> > Are there downsides to this approach?
>     > > >>> >
>     > > >>> > And a small nit: it will be good to capture in the KIP
what
> are the
>     > > >>> > expectations regarding overlap and disjointness of the
> proposed
>     > > >>> > metrics. For example, is running+paused+failed = total?
Can
> a task
>     > be
>     > > >>> > failed and destroyed and therefore count in 2 of those
> metrics?
>     > > >>> >
>     > > >>> > On Thu, Jun 6, 2019 at 12:29 PM Cyrus Vafadari <
> cyrus@confluent.io
>     > >
>     > > >>> wrote:
>     > > >>> > >
>     > > >>> > > Konstantine,
>     > > >>> > >
>     > > >>> > > This is a good suggestion. Since the suggestion
to add 2
>     > additional
>     > > >>> > > statuses analogous to the 3 proposed, it is a very
minor
> change
>     > of
>     > > no
>     > > >>> > > structural consequence to the KIP.
>     > > >>> > >
>     > > >>> > > I've updated the KIP to incorporate your suggestion,
and
> any
>     > voters
>     > > >>> who
>     > > >>> > > disagree should definitely respond in the thread.
>     > > >>> > >
>     > > >>> > > Cyrus
>     > > >>> > >
>     > > >>> > > On Thu, Jun 6, 2019 at 11:16 AM Konstantine Karantasis
<
>     > > >>> > > konstantine@confluent.io> wrote:
>     > > >>> > >
>     > > >>> > > > Thanks Cyrus,
>     > > >>> > > >
>     > > >>> > > > this is a nice and straightforward addition.
>     > > >>> > > >
>     > > >>> > > > I'm +1 too, but I'd like to return with a question
here
> as well
>     > > >>> > regarding
>     > > >>> > > > whether the unassigned tasks will be taken
into account.
>     > > >>> > > > Especially after KIP-415 we might start seeing
this
> status for
>     > > >>> specific
>     > > >>> > > > periods of time. Therefore, I think it's a
meaningful
> addition.
>     > > >>> > > > Then there's the `destroyed` status which might
be a lot
> more
>     > > >>> > transient but
>     > > >>> > > > we could also include for the sake of completion.
>     > > >>> > > > Check org.apache.kafka.connect.runtime.AbstractStatus
> for the
>     > > list
>     > > >>> of
>     > > >>> > all
>     > > >>> > > > possible statuses.
>     > > >>> > > >
>     > > >>> > > > Konstantine
>     > > >>> > > >
>     > > >>> > > > On Wed, Jun 5, 2019 at 4:32 PM Randall Hauch
<
> rhauch@gmail.com
>     > >
>     > > >>> wrote:
>     > > >>> > > >
>     > > >>> > > > > Thanks, Cyrus.
>     > > >>> > > > >
>     > > >>> > > > > +1 (binding)
>     > > >>> > > > >
>     > > >>> > > > > Randall Hauch
>     > > >>> > > > >
>     > > >>> > > > > On Wed, Jun 5, 2019 at 10:36 AM Andrew
Schofield <
>     > > >>> > > > > andrew_schofield@live.com>
>     > > >>> > > > > wrote:
>     > > >>> > > > >
>     > > >>> > > > > > +1 (non-binding)
>     > > >>> > > > > >
>     > > >>> > > > > > Andrew Schofield
>     > > >>> > > > > >
>     > > >>> > > > > > On 05/06/2019, 14:04, "Ryanne Dolan"
<
>     > ryannedolan@gmail.com
>     > > >
>     > > >>> > wrote:
>     > > >>> > > > > >
>     > > >>> > > > > >     +1 (non-binding)
>     > > >>> > > > > >
>     > > >>> > > > > >     Thanks
>     > > >>> > > > > >     Ryanne
>     > > >>> > > > > >
>     > > >>> > > > > >     On Tue, Jun 4, 2019, 11:29 PM
Cyrus Vafadari <
>     > > >>> > cyrus@confluent.io>
>     > > >>> > > > > > wrote:
>     > > >>> > > > > >
>     > > >>> > > > > >     > Hi all,
>     > > >>> > > > > >     >
>     > > >>> > > > > >     > Like like to start voting
in the following KIP:
>     > > >>> > > > > >     >
>     > > >>> > > > > >     >
>     > > >>> > > > > >
>     > > >>> > > > >
>     > > >>> > > >
>     > > >>> >
>     > > >>>
>     > >
>     >
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-475%253A%2BNew%2BMetric%2Bto%2BMeasure%2BNumber%2Bof%2BTasks%2Bon%2Ba%2BConnector&amp;data=02%7C01%7C%7C5613cc7be8084f15b82a08d6f7f7a226%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636969040283157621&amp;sdata=DOweIjXCIxA%2Bo8V2uKmR7wPzR4Nceoph4%2B4BKotfTuI%3D&amp;reserved=0
>     > > >>> > > > > >     >
>     > > >>> > > > > >     > Discussion thread:
>     > > >>> > > > > >     >
>     > > >>> > > > > >     >
>     > > >>> > > > > >
>     > > >>> > > > >
>     > > >>> > > >
>     > > >>> >
>     > > >>>
>     > >
>     >
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.apache.org%2Fthread.html%2Fbf7c92224aa798336c14d7e96ec8f2e3406c61879ec381a50652acfe%40%253Cdev.kafka.apache.org%253E&amp;data=02%7C01%7C%7C5613cc7be8084f15b82a08d6f7f7a226%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636969040283157621&amp;sdata=l9Te5p3evVOIVnSASAgThB5F1YEpo%2B1pMwkC7Nauyz4%3D&amp;reserved=0
>     > > >>> > > > > >     >
>     > > >>> > > > > >     > Thanks!
>     > > >>> > > > > >     >
>     > > >>> > > > > >     > Cyrus
>     > > >>> > > > > >     >
>     > > >>> > > > > >
>     > > >>> > > > > >
>     > > >>> > > > > >
>     > > >>> > > > >
>     > > >>> > > >
>     > > >>> >
>     > > >>> >
>     > > >>> >
>     > > >>> > --
>     > > >>> > Gwen Shapira
>     > > >>> > Product Manager | Confluent
>     > > >>> > 650.450.2760 | @gwenshap
>     > > >>> > Follow us: Twitter | blog
>     > > >>> >
>     > > >>>
>     > > >>>
>     > > >>> --
>     > > >>> -- Guozhang
>     > > >>>
>     > > >>
>     > >
>     >
>     >
>     > --
>     > -- Guozhang
>     >
>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message