kafka-dev mailing list archives

From Konstantine Karantasis <konstant...@confluent.io>
Subject Re: [DISCUSS] KIP-411: Add option to make Kafka Connect task client ID values unique
Date Sat, 05 Jan 2019 01:54:53 GMT
Hi Paul.

I second Ewen and I intended to give similar feedback:

1) Can we avoid a config altogether?
2) If we prefer to add a config anyway, can we use a set of allowed values
instead of a boolean, even if initially these values are only two? As the
discussion on Jira highlights, there is a potential for more naming
conventions in the future, even if now the extra functionality doesn't seem
essential. It's not optimal to have to deprecate a config instead of just
extending its set of values.
3) I agree, the config name sounds too general. How about
"client.ids.naming.policy" or "client.ids.naming" if you want two more
options?
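
To illustrate point 2, here is a minimal sketch of a config that takes a
set of allowed values rather than a boolean. The enum name `NamingPolicy`,
its two values, and the `parse` helper are hypothetical and only meant to
show the shape; the config key is the name suggested above, and none of
this is existing Connect code.

```java
import java.util.Arrays;
import java.util.Locale;

class ClientIdNamingSketch {
    // Hypothetical set of allowed values: two for now, extensible later
    // without deprecating the config itself.
    enum NamingPolicy { DEFAULT, TASK }

    // Config key as suggested in this thread (not an existing Connect config).
    static final String CONFIG_NAME = "client.ids.naming.policy";

    // Parse a raw config string, rejecting anything outside the allowed set.
    static NamingPolicy parse(String value) {
        try {
            return NamingPolicy.valueOf(value.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                "Invalid value '" + value + "' for " + CONFIG_NAME
                    + "; allowed values: " + Arrays.toString(NamingPolicy.values()));
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("task")); // prints TASK
    }
}
```

Adding a third convention later would then mean adding one enum constant,
with existing `default` and `task` settings still valid.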

Konstantine

On Fri, Jan 4, 2019 at 7:38 AM Ewen Cheslack-Postava <ewen@confluent.io>
wrote:

> Hi Paul,
>
> Thanks for the KIP. A few comments.
>
> To me, the biggest question here is whether we can fix this behavior without adding
> a config. In particular, today, we don't even set the client.id for the
> producer and consumer at all, right? The *only* way it is set is if you
> include an override in the worker config, but in that case you need to be
> explicitly opting in with a `producer.` or `consumer.` prefix, i.e. the
> settings are `producer.client.id` and `consumer.client.id`. Otherwise, I
> think we're getting the default behavior where we generate unique,
> per-process IDs, i.e. via this logic
>
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L662-L664
>
> If that's the case, would it maybe be possible to compatibly change the
> default to use task IDs in the client ID, but only if we don't see an
> existing override from the worker config? This would only change the
> behavior when someone is using the default, but since the default would
> just use what is effectively a random ID that is useless for monitoring
> metrics, presumably this wouldn't affect any existing users. I think that
> would avoid having to introduce the config, give better out of the box
> behavior, and still be a safe, compatible change to make.
>
>
> Other than that, just two minor comments. On the config naming, not sure
> about a better name, but I think the config name could be a bit clearer if
> we need to have it. Maybe something including "task" like
> "task.based.client.ids" or something like that (or change the type to be an
> enum and make it something like task.client.ids=[default|task] and leave it
> open for extension in the future if needed).
>
> Finally, you have this:
>
> *"Allow overriding client.id on a per-connector basis"*
> >
> > This is a much more complex change, and would require individual
> > connectors to be updated to support the change. In contrast, the proposed
> > approach would immediately allow detailed consumer/producer monitoring for
> > all existing connectors.
> >
>
> I don't think this is quite accurate. I think the reason to reject is that
> for your particular requirement for metrics, it simply doesn't give enough
> granularity (there's only one value per entire connector), but it doesn't
> require any changes to connectors. The framework allocates all of these and
> there are already framework-defined config values that all connectors share
> (some for only sinks or sources), so the framework can handle all of this
> without changes to connectors. Further, with connector-specific overrides,
> you could get task-specific values if interpolation were supported in the
> config value (as we now do with managed secrets). For example, it could
> support something like client.id=connector-${taskId} and the task ID would
> be substituted automatically into the string.
>
> I don't necessarily like that solution (seems complicated and not a great
> user experience), but it could work.
>
> -Ewen
>
>
>
>
> On Thu, Dec 20, 2018 at 5:05 PM Paul Davidson <pdavidson@salesforce.com>
> wrote:
>
> > Hi everyone,
> >
> > I would like to start a discussion around the following KIP:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-411%3A+Add+option+to+make+Kafka+Connect+task+client+ID+values+unique
> >
> > This proposes a small change to allow Kafka Connect the option to
> > auto-generate unique client IDs for each task. This enables granular
> > monitoring of the producer / consumer client in each task.
> >
> > Feedback is appreciated, thanks in advance!
> >
> > Paul
> >
>
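
The two ideas from Ewen's message quoted above (use a task-based client ID
only when no override exists, and optionally interpolate the task ID into
an override) can be sketched roughly as follows. This is only an
illustration under stated assumptions: the class, the method names, and the
default format `connector-<name>-<taskId>` are hypothetical, and for
simplicity the override is read from a single properties map rather than
from separate worker and connector configs.

```java
import java.util.HashMap;
import java.util.Map;

class TaskClientIdSketch {
    // Hypothetical default format; the thread does not specify one.
    static String defaultClientId(String connector, int taskId) {
        return "connector-" + connector + "-" + taskId;
    }

    // prefix is "producer." or "consumer.", matching the override keys
    // producer.client.id and consumer.client.id mentioned in the thread.
    static String resolve(Map<String, String> props, String prefix,
                          String connector, int taskId) {
        String override = props.get(prefix + "client.id");
        if (override == null) {
            // No explicit override: fall back to a task-based ID.
            return defaultClientId(connector, taskId);
        }
        // Explicit override wins; substitute ${taskId} literally if present.
        return override.replace("${taskId}", Integer.toString(taskId));
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        // prints connector-jdbc-source-0
        System.out.println(resolve(props, "producer.", "jdbc-source", 0));

        props.put("producer.client.id", "connector-${taskId}");
        // prints connector-3
        System.out.println(resolve(props, "producer.", "jdbc-source", 3));
    }
}
```

An override without the `${taskId}` placeholder is returned unchanged, so
existing explicit settings would keep their current behavior in this sketch.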
