kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Egerton <chr...@confluent.io>
Subject Re: [DISCUSS] KIP-454: Expansion of the ConnectClusterState interface
Date Wed, 08 May 2019 00:30:32 GMT
Hi all,

Now that  KAFKA-8304 (https://issues.apache.org/jira/browse/KAFKA-8304),
which was a blocker, has been addressed, I've published a PR for these
changes: https://github.com/apache/kafka/pull/6584

Thanks to everyone who's voted so far! If anyone else is interested, the
voting thread can be found here:
https://www.mail-archive.com/dev@kafka.apache.org/msg97458.html. Current
status: +1 binding, +2 non-binding.

Cheers,

Chris

On Tue, Apr 30, 2019 at 12:40 PM Chris Egerton <chrise@confluent.io> wrote:

> Hi Konstantine,
>
> I've updated the KIP to add default method implementations to the list of
> rejected alternatives. Technically this makes the changes in the KIP
> backwards incompatible, but I think I agree that for the majority of cases
> where it would even be an issue a compile-time error is likely to be more
> beneficial than one at run time.
>
> Thanks for your thoughts and thanks for the LGTM!
>
> Cheers,
>
> Chris
>
> On Mon, Apr 29, 2019 at 12:29 PM Konstantine Karantasis <
> konstantine@confluent.io> wrote:
>
>> Hi Chris,
>>
>> Thanks for considering my suggestion regarding default implementations for
>> the new methods.
>> However, given that these implementations don't seem to have sane defaults
>> and throw UnsupportedOperationException, I think we'll be better without
>> defaults.
>> Seems that a compile time error is preferable here, for those who want to
>> upgrade their implementations.
>>
>> Otherwise, the KIP LGTM.
>>
>> Konstantine
>>
>> On Thu, Apr 25, 2019 at 10:29 PM Magesh Nandakumar <mageshn@confluent.io>
>> wrote:
>>
>> > Thanks a lot, Chris. The KIP looks good to me.
>> >
>> > On Thu, Apr 25, 2019 at 7:35 PM Chris Egerton <chrise@confluent.io>
>> wrote:
>> >
>> > > Hi Magesh,
>> > >
>> > > Sounds good; I've updated the KIP to make ConnectClusterDetails an
>> > > interface. If we want to leave the door open to expand it in the
>> future
>> > it
>> > > definitely makes sense to treat it similarly to how we're treating the
>> > > ConnectClusterState interface now.
>> > >
>> > > Thanks,
>> > >
>> > > Chris
>> > >
>> > > On Thu, Apr 25, 2019 at 4:18 PM Magesh Nandakumar <
>> mageshn@confluent.io>
>> > > wrote:
>> > >
>> > > > HI Chrise,
>> > > >
>> > > > Overall it looks good to me. Just one last comment - I was
>> wondering if
>> > > > ConnectClusterDetail should be an interface just like
>> > > ConnectClusterState.
>> > > >
>> > > > Thanks,
>> > > > Magesh
>> > > >
>> > > > On Thu, Apr 25, 2019 at 3:54 PM Chris Egerton <chrise@confluent.io>
>> > > wrote:
>> > > >
>> > > > > Hi Magesh,
>> > > > >
>> > > > > Expanding the type we use to convey cluster metadata from just
a
>> > Kafka
>> > > > > cluster ID string to its own class seems like a good idea for
the
>> > sake
>> > > of
>> > > > > forwards compatibility, but I'm still not sure what the gains
of
>> > > > including
>> > > > > the cluster group ID would be--it's a simple map lookup away
in
>> the
>> > > REST
>> > > > > extension's configure(...) method. Including information on
>> whether
>> > the
>> > > > > cluster is distributed or standalone definitely seems convenient;
>> as
>> > > far
>> > > > as
>> > > > > I can tell there's no easy way to do that from within a REST
>> > extension
>> > > at
>> > > > > the moment, and relying on something like the presence of a
>> group.id
>> > > > > property to identify a distributed cluster could result in false
>> > > > positives.
>> > > > > However, is there a use case for it? If not, we can note that
as a
>> > > > possible
>> > > > > addition to the ConnectClusterDetails class for later but leave
it
>> > out
>> > > > for
>> > > > > now.
>> > > > >
>> > > > > I've updated the KIP to include the new ConnectClusterDetails
>> class
>> > but
>> > > > > left out the cluster type information for now; let me know what
>> you
>> > > > think.
>> > > > >
>> > > > > Thanks again for your thoughts!
>> > > > >
>> > > > > Cheers,
>> > > > >
>> > > > > Chris
>> > > > >
>> > > > > On Wed, Apr 24, 2019 at 4:49 PM Magesh Nandakumar <
>> > > mageshn@confluent.io>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi Chris,
>> > > > > >
>> > > > > > Instead of calling it ConnectClusterId, perhaps call it
>> > > > > > ConnectClusterDetails which can include things like groupid,
>> > > underlying
>> > > > > > kafkaclusterId, standalone or distributed, etc. This will
help
>> > expose
>> > > > any
>> > > > > > cluster related information in the future.
>> > > > > > Let me know if that would work?
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Magesh
>> > > > > >
>> > > > > > On Wed, Apr 24, 2019 at 4:26 PM Chris Egerton <
>> chrise@confluent.io
>> > >
>> > > > > wrote:
>> > > > > >
>> > > > > > > Hi Magesh,
>> > > > > > >
>> > > > > > > 1. After ruminating for a little while on the inclusion
of a
>> > method
>> > > > to
>> > > > > > > retrieve task configurations I've tentatively decided
to
>> remove
>> > it
>> > > > from
>> > > > > > the
>> > > > > > > proposal and place it in the rejected alternatives
section. If
>> > > anyone
>> > > > > > > presents a reasonable use case for it I'll be happy
to discuss
>> > > > further
>> > > > > > but
>> > > > > > > right now I think this is the way to go. Thanks for
your
>> > > suggestion!
>> > > > > > >
>> > > > > > > 2. The idea of a Connect cluster ID method is certainly
>> > > fascinating,
>> > > > > but
>> > > > > > > there are a few questions it raises. First off, what
would the
>> > > > > group.id
>> > > > > > be
>> > > > > > > for a standalone cluster? Second, why return a formatted
>> string
>> > > there
>> > > > > > > instead of a new class such as a ConnectClusterId that
>> provides
>> > the
>> > > > two
>> > > > > > in
>> > > > > > > separate methods? And lastly, since REST extensions
are
>> > configured
>> > > > with
>> > > > > > all
>> > > > > > > of the properties available to the worker, wouldn't
it be
>> > possible
>> > > to
>> > > > > > just
>> > > > > > > get the group ID of the Connect cluster from there?
The reason
>> > I'd
>> > > > like
>> > > > > > to
>> > > > > > > see the Kafka cluster ID made available to REST extensions
is
>> > that
>> > > > > > > retrieving it isn't as simple as reading a configuration
from
>> a
>> > > > > > properties
>> > > > > > > map and instead involves creating an admin client from
those
>> > > > properties
>> > > > > > and
>> > > > > > > using it to perform a `describe cluster` call, which
comes
>> with
>> > its
>> > > > own
>> > > > > > > pitfalls as far as error handling, interruptions, and
timeouts
>> > go.
>> > > > > Since
>> > > > > > > this information is available to the herder already,
it seems
>> > like
>> > > a
>> > > > > good
>> > > > > > > tradeoff to expose that information to REST extensions
so that
>> > > > > developers
>> > > > > > > don't have to duplicate that logic themselves. I'm
unsure that
>> > the
>> > > > same
>> > > > > > > arguments would apply to exposing a group.id to REST
>> extensions
>> > > > > through
>> > > > > > > the
>> > > > > > > ConnectClusterInterface. What do you think?
>> > > > > > >
>> > > > > > > Thanks again for your thoughts!
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > >
>> > > > > > > Chris
>> > > > > > >
>> > > > > > > On Tue, Apr 23, 2019 at 4:18 PM Magesh Nandakumar <
>> > > > > mageshn@confluent.io>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Chris,
>> > > > > > > >
>> > > > > > > > I certainly would love to hear others thoughts
on #1 but
>> IMO,
>> > it
>> > > > > might
>> > > > > > > not
>> > > > > > > > be as useful as ConnectorConfigs and as you mentioned,
we
>> could
>> > > > > always
>> > > > > > > add
>> > > > > > > > it when the need arises.
>> > > > > > > > Thanks for clarifying the details on my concern
#2 regarding
>> > the
>> > > > > > > > kafkaClusterId. While not a perfect fit in the
interface,
>> I'm
>> > not
>> > > > > > > > completely opposed to having it in the interface.
The other
>> > > > option, I
>> > > > > > can
>> > > > > > > > think is to expose a connectClusterId() returning
group.id
>> +
>> > > > > > > > kafkaClusterId
>> > > > > > > > (with some delimiter) rather than returning the
>> kafkaClusterId.
>> > > If
>> > > > we
>> > > > > > > > choose to go this route, we can even make this
a first-class
>> > > > citizen
>> > > > > of
>> > > > > > > the
>> > > > > > > > Herder interface. Let me know what you think.
>> > > > > > > >
>> > > > > > > > Thanks
>> > > > > > > > Magesh
>> > > > > > > >
>> > > > > > > > On Thu, Apr 18, 2019 at 2:45 PM Chris Egerton
<
>> > > chrise@confluent.io
>> > > > >
>> > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Hi Magesh,
>> > > > > > > > >
>> > > > > > > > > Thanks for your comments. I'll address them
in the order
>> you
>> > > > > provided
>> > > > > > > > them:
>> > > > > > > > >
>> > > > > > > > > 1 - Reason for exposing task configurations
to REST
>> > extensions:
>> > > > > > > > > Yes, the motivation is a little thin for
exposing task
>> > configs
>> > > to
>> > > > > > REST
>> > > > > > > > > extensions. I can think of a few uses for
this
>> functionality,
>> > > > such
>> > > > > as
>> > > > > > > > > attempting to infer problematic configurations
by
>> examining
>> > > > failed
>> > > > > > > tasks
>> > > > > > > > > and comparing their configurations to the
configurations
>> of
>> > > > running
>> > > > > > > > tasks,
>> > > > > > > > > but like you've indicated it's dubious that
the best place
>> > for
>> > > > > > anything
>> > > > > > > > > like that belongs in a REST extension.
>> > > > > > > > > I'd be interested to hear others' thoughts,
but right now
>> I'm
>> > > not
>> > > > > too
>> > > > > > > > > opposed to erring on the side of caution
and leaving it
>> out.
>> > > > Worst
>> > > > > > > case,
>> > > > > > > > it
>> > > > > > > > > takes another KIP to add this later on down
the road, but
>> > > that's
>> > > > a
>> > > > > > > small
>> > > > > > > > > price to pay to avoid adding support for
a feature that
>> > nobody
>> > > > > needs.
>> > > > > > > > >
>> > > > > > > > > 2. Usefulness of exposing Kafka cluster ID
to REST
>> > extensions:
>> > > > > > > > > As the KIP states, "the Kafka cluster ID
may be useful for
>> > the
>> > > > > > purpose
>> > > > > > > of
>> > > > > > > > > uniquely identifying a Connect cluster from
within a REST
>> > > > > extension,
>> > > > > > > > since
>> > > > > > > > > users may be running multiple Kafka clusters
and the
>> > group.id
>> > > > for
>> > > > > a
>> > > > > > > > > distributed Connect cluster may not be sufficient
to
>> > identify a
>> > > > > > > cluster."
>> > > > > > > > > Even though there may be producer or consumer
overrides
>> for
>> > > > > > > > > bootstrap.servers present in the configuration
for the
>> > worker,
>> > > > > these
>> > > > > > > will
>> > > > > > > > > not affect which Kafka cluster is used as
a backing store
>> for
>> > > > > > connector
>> > > > > > > > > configurations, offsets, and statuses, so
the Kafka
>> cluster
>> > ID
>> > > > for
>> > > > > > the
>> > > > > > > > > worker in conjunction with the Connect group
ID should be
>> > > > > sufficient
>> > > > > > to
>> > > > > > > > > uniquely identify a Connect cluster.
>> > > > > > > > > We can and should document that the Connect
cluster with
>> > > > overridden
>> > > > > > > > > producer.bootstrap.servers or consumer.bootstrap.servers
>> may
>> > be
>> > > > > > writing
>> > > > > > > > > to/reading from a different Kafka cluster.
However, REST
>> > > > extensions
>> > > > > > are
>> > > > > > > > > already passed the configs for their worker
through their
>> > > > > > > configure(...)
>> > > > > > > > > method, so they'd be able to detect any such
overrides and
>> > act
>> > > > > > > > accordingly.
>> > > > > > > > >
>> > > > > > > > > Thanks again for your thoughts!
>> > > > > > > > >
>> > > > > > > > > Cheers,
>> > > > > > > > >
>> > > > > > > > > Chris
>> > > > > > > > >
>> > > > > > > > > On Thu, Apr 18, 2019 at 11:08 AM Magesh Nandakumar
<
>> > > > > > > mageshn@confluent.io
>> > > > > > > > >
>> > > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi Chris,
>> > > > > > > > > >
>> > > > > > > > > > Thanks for the KIP. Overall, it looks
good and
>> > > straightforward
>> > > > to
>> > > > > > me.
>> > > > > > > > > >
>> > > > > > > > > > I had a few questions on the new methods
>> > > > > > > > > >
>> > > > > > > > > > 1. I'm not sure if an extension would
ever require the
>> task
>> > > > > > configs.
>> > > > > > > An
>> > > > > > > > > > extension generally should only require
the health and
>> > > current
>> > > > > > state
>> > > > > > > of
>> > > > > > > > > the
>> > > > > > > > > > connector which includes the connector
config. I was
>> > > wondering
>> > > > if
>> > > > > > > there
>> > > > > > > > > is
>> > > > > > > > > > a specific reason it would need task
configs.
>> > > > > > > > > > 2. Also, I'm not convinced that kafkaClusterId()
>> belongs to
>> > > the
>> > > > > > > > > > ConnectClusterState
>> > > > > > > > > > interface. The interface is generally
to provide
>> > information
>> > > > > about
>> > > > > > > the
>> > > > > > > > > > Connect cluster and its information.
 Also, the
>> > > kafkaClusterId
>> > > > > > could
>> > > > > > > > > > potentially change based on whether
there is a
>> "producer."
>> > or
>> > > > > > > > "consumer."
>> > > > > > > > > > prefix, right?
>> > > > > > > > > >
>> > > > > > > > > > Having said that, I would prefer to
have
>> connectorConfigs
>> > > > which I
>> > > > > > > think
>> > > > > > > > > is
>> > > > > > > > > > a great idea and addition to the interface.
Let me know
>> > what
>> > > > you
>> > > > > > > think.
>> > > > > > > > > >
>> > > > > > > > > > Thanks,
>> > > > > > > > > > Magesh
>> > > > > > > > > >
>> > > > > > > > > > On Sat, Apr 13, 2019 at 9:00 PM Chris
Egerton <
>> > > > > chrise@confluent.io
>> > > > > > >
>> > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi all,
>> > > > > > > > > > >
>> > > > > > > > > > > I've posted "KIP-454: Expansion
of the
>> > ConnectClusterState
>> > > > > > > > interface",
>> > > > > > > > > > > which proposes that we add provide
more information
>> about
>> > > the
>> > > > > > > Connect
>> > > > > > > > > > > cluster to REST extensions.
>> > > > > > > > > > >
>> > > > > > > > > > > The KIP can be found at
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-454%3A+Expansion+of+the+ConnectClusterState+interface
>> > > > > > > > > > >
>> > > > > > > > > > > I'm eager to hear people's thoughts
on this and will
>> > > > appreciate
>> > > > > > any
>> > > > > > > > > > > feedback.
>> > > > > > > > > > >
>> > > > > > > > > > > Cheers,
>> > > > > > > > > > >
>> > > > > > > > > > > Chris
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message