kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Guozhang Wang <wangg...@gmail.com>
Subject Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription
Date Thu, 19 Jul 2018 18:44:13 GMT
Personally I'd prefer to keep the deprecation-related changes as small as
possible unless they are really necessary, and hence I'd prefer to just add

List<String> topicList()  /* or Set<String> topicSet() */

in addition to topicPattern to Source, in addition to `topicNameExtractor`
to Sink, and leaving the current APIs as-is.

Guozhang

On Thu, Jul 19, 2018 at 10:36 AM, Matthias J. Sax <matthias@confluent.io>
wrote:

> Thanks for updating the KIP.
>
> The current `Source` interface has a method `String topics()` atm. Thus,
> we cannot just add `Set<String> Source#topics()` because this would
> replace the existing method and would be an incompatible change.
>
> I think, we should deprecate `String topics()` and add a method with
> different name:
>
> `Set<String> Source#topicNames()`
>
> The method name `topicNames` is more appropriate anyway, as we return a
> set of String (ie, names) but no `Topic` objects. This raises one more
> thought: we might want to rename `Processor#stores()` to
> `Processor#storeNames()` as well ass `Sink#topic()` to
> `Sink#topicName()`, too. This would keep the naming in the API consistent.
>
>
> WDYT? Hope that other can chime in as well.
>
>
> -Matthias
>
> On 7/18/18 7:49 PM, Nishanth Pradeep wrote:
> > I have revised the KIP
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-321%3A+Update+
> TopologyDescription+to+better+represent+Source+and+Sink+Nodes>.
> > Here is a summary of the changes.
> >
> >    1. Changed return type from String to Set<String> for Source#topics().
> >
> >    Set<String> Source#topics()
> >
> >    2. Added method in TopologyDescription#Source to return the Pattern
> used
> >    by the Source node.
> >
> >    Pattern Source#topicPattern()
> >
> >    3. Changed return type of Sink#topicNameExtractor from Class<? extends
> >    TopicNameExtractor> to just TopicNameExtractor.
> >
> >    TopicNameExtractor Sink#topicNameExtractor()
> >
> > Best,
> > Nishanth Pradeep
> >
> > On Sun, Jul 15, 2018 at 11:24 PM Guozhang Wang <wangguoz@gmail.com>
> wrote:
> >
> >> Hi Nishanth,
> >>
> >> Since even combining these two the scope is still relatively small I'd
> >> suggest just do it in one KIP if you're willing to work on them. If you
> do
> >> not then pleas feel free to create the JIRA for the second step so that
> >> others can pick it up.
> >>
> >>
> >> Guozhang
> >>
> >> On Sun, Jul 15, 2018 at 6:14 PM, Matthias J. Sax <matthias@confluent.io
> >
> >> wrote:
> >>
> >>> There is no general protocol. We can include the changes in the current
> >>> KIP or do a second KIP.
> >>>
> >>> If you don't want to include the change in this KIP, please create a
> new
> >>> JIRA to track the other changes. You can assign the JIRA to yourself
> and
> >>> start a second KIP if you want. You can also "link" both JIRAs as
> >>> related to each other.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>> On 7/15/18 12:50 PM, Nishanth Pradeep wrote:
> >>>> Thank you for the comments! I agree with these changes.
> >>>>
> >>>> So is the general protocol to update the same KIP, or is to scrap the
> >>>> current KIP and create a new one since it's beyond the scope of the
> >>>> original KIP? I am happy to do either.
> >>>>
> >>>> On Wed, Jul 4, 2018 at 1:48 PM Matthias J. Sax <matthias@confluent.io
> >
> >>>> wrote:
> >>>>
> >>>>> Sounds good to me.
> >>>>>
> >>>>> -Matthias
> >>>>>
> >>>>> On 7/4/18 10:53 AM, Guozhang Wang wrote:
> >>>>>> After looked through the current TopologyDescription I think
I'd
> want
> >>> to
> >>>>>> combine the suggestions from John and Matthias on the API proposal.
> >> The
> >>>>>> motivations is that we have two relatively different functionalities
> >>>>>> provided from the APIs today:
> >>>>>>
> >>>>>> 1. Each interface's public functions, like
> >>>>>> SourceNode#topics(), GlobalStore#source(), which returns non-String
> >>> typed
> >>>>>> data. The hope was to let users programmatically leverage on
those
> >> APIs
> >>>>> for
> >>>>>> runtime checking.
> >>>>>> 2. Each interface's impl class also have an implicit toString()
> >>>>> overridden
> >>>>>> to print the necessary information. This was designed for debugging
> >>>>>> purposes only during development cycles.
> >>>>>>
> >>>>>> What we've observed so far, though, is that users leverage 2)
much
> >> more
> >>>>>> than 1) in practice, since it is more convienent to parse strings
> >> than
> >>>>>> recursively calling the APIs to get non-string fields. On the
other
> >>> hand,
> >>>>>> the discussion controversy is more around 1), not 2). As for
2)
> >> people
> >>>>> seem
> >>>>>> to be on the right page anyways: print the topic lists if it
is not
> >>>>>> dynamic, or print extractor string format otherwise. For 1)
above we
> >>>>> should
> >>>>>> probably have all three `Set<String> topics()`, `Pattern
> >>> topicPattern()`
> >>>>>> and `TopicNameExtractor topicExtractor()`; while for 2) I feel
> >>>>> comfortable
> >>>>>> relying on the TopicNameExtractor#toString() in `Source#toString()`
> >>> impl
> >>>>>> since even if users do not override this function, the default
value
> >>>>>> `className@hashcode` still looks fine to me.
> >>>>>>
> >>>>>>
> >>>>>> Guozhang
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Jul 3, 2018 at 11:22 PM, Matthias J. Sax <
> >>> matthias@confluent.io>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> I just double checked the discussion thread of KIP-120 that
> >> introduced
> >>>>>>> `TopologyDescription`. Back than the argument was, that
using the
> >>>>>>> simplest option might be sufficient because the description
is
> >> mostly
> >>>>>>> used for debugging.
> >>>>>>>
> >>>>>>> Not sure if this argument holds. It seem that people built
first
> >> more
> >>>>>>> sophisticated tools using TopologyDescription.
> >>>>>>>
> >>>>>>> Final note: if we really want to add `topicPattern()` we
might want
> >> to
> >>>>>>> deprecate `topic()` and replace with `Set<String>
topics()`,
> >> because a
> >>>>>>> source node can take a multiple topics, too.
> >>>>>>>
> >>>>>>> Just adding this for completeness of context to the discussion.
> >>>>>>>
> >>>>>>>
> >>>>>>> -Matthias
> >>>>>>>
> >>>>>>> On 7/3/18 11:09 PM, Matthias J. Sax wrote:
> >>>>>>>> John,
> >>>>>>>>
> >>>>>>>> I am a little bit on the fence. In retrospective, it
might have
> >> been
> >>>>>>>> better to add `topic()` and `topicPattern()` to source
node and
> >>> return
> >>>>> a
> >>>>>>>> proper `Pattern` object instead of the pattern as a
String.
> >>>>>>>>
> >>>>>>>> All other "payload" is just names and thus String naturally.
From
> >> my
> >>>>>>>> point of view `TopologyDescription` should represent
the
> `Topology`
> >>> in
> >>>>> a
> >>>>>>>> "machine readable" form plus a default "human readable"
from via
> >>>>>>>> `toString()` -- this does not imply that all return
types should
> be
> >>>>>>> String.
> >>>>>>>>
> >>>>>>>> Let me know what you think. If you agree, we could even
add
> >>>>>>>> `Source#topicPattern()` in another KIP.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -Matthias
> >>>>>>>>
> >>>>>>>> On 6/26/18 3:45 PM, John Roesler wrote:
> >>>>>>>>> Sorry for the late comment,
> >>>>>>>>>
> >>>>>>>>> Looking at the other pieces of TopologyDescription,
I noticed
> that
> >>>>>>> pretty
> >>>>>>>>> much all of the "payload" of these description nodes
are strings.
> >>>>>>> Should we
> >>>>>>>>> consider returning a string from `topicNameExtractor()`
instead?
> >>>>>>>>>
> >>>>>>>>> In fact, if we did that, we could consider calling
`toString()`
> on
> >>> the
> >>>>>>>>> extractor instead of returning the class name. This
would allow
> >>>>> authors
> >>>>>>> of
> >>>>>>>>> the extractors to provide more information about
the extractor
> >> than
> >>>>> just
> >>>>>>>>> its name. This might be especially useful in the
case of
> anonymous
> >>>>>>>>> implementations.
> >>>>>>>>>
> >>>>>>>>> Thanks for the KIP,
> >>>>>>>>> -John
> >>>>>>>>>
> >>>>>>>>> On Mon, Jun 25, 2018 at 11:52 PM Ted Yu <yuzhihong@gmail.com>
> >>> wrote:
> >>>>>>>>>
> >>>>>>>>>> My previous response was talking about the new
method in
> >>>>>>>>>> InternalTopologyBuilder.
> >>>>>>>>>> The exception just means there is no uniform
extractor for all
> >> the
> >>>>>>> sinks.
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Jun 25, 2018 at 8:02 PM, Matthias J.
Sax <
> >>>>>>> matthias@confluent.io>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Ted,
> >>>>>>>>>>>
> >>>>>>>>>>> Why? Each sink can have a different TopicNameExtractor.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> -Matthias
> >>>>>>>>>>>
> >>>>>>>>>>> On 6/25/18 5:19 PM, Ted Yu wrote:
> >>>>>>>>>>>> If there are different TopicNameExtractor
classes from
> multiple
> >>>>> sink
> >>>>>>>>>>> nodes,
> >>>>>>>>>>>> the new method should throw exception
alerting user of such
> >>>>> scenario.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mon, Jun 25, 2018 at 2:23 PM, Bill
Bejeck <
> >> bbejeck@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for the KIP!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Overall I'm +1 on the KIP.   I have
one question.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The KIP states that the method "topicNameExtractor()"
is
> added
> >>> to
> >>>>>>> the
> >>>>>>>>>>>>> InternalTopologyBuilder.java.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> It could be that I'm missing something,
but wow does this
> work
> >>> if
> >>>>> a
> >>>>>>>>>> user
> >>>>>>>>>>>>> has provided different TopicNameExtractor
instances to
> >> multiple
> >>>>> sink
> >>>>>>>>>>> nodes?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Bill
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Jun 25, 2018 at 1:25 PM
Guozhang Wang <
> >>> wangguoz@gmail.com
> >>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Yup I agree, generally speaking
the `toString()` output is
> >> not
> >>>>>>>>>>>>> recommended
> >>>>>>>>>>>>>> to be relied on programmatically
in user's code, but we've
> >>>>> observed
> >>>>>>>>>>>>>> convenience-beats-any-other-reasons
again and again in
> >>>>> development
> >>>>>>>>>>>>>> unfortunately. I think we should
still not claiming it is
> >> part
> >>> of
> >>>>>>> the
> >>>>>>>>>>>>>> public APIs that would not be
changed anyhow in the future,
> >> but
> >>>>>>> just
> >>>>>>>>>>>>>> mentioning it in the wiki for
people to be aware is fine.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Guozhang
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Sun, Jun 24, 2018 at 5:01
PM, Matthias J. Sax <
> >>>>>>>>>>> matthias@confluent.io>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks for the KIP!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I am don't have any further
comments.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> For Guozhang's comment:
if we mention anything about
> >>>>> `toString()`,
> >>>>>>>>>> we
> >>>>>>>>>>>>>>> should make explicit that
`toString()` output is still not
> >>>>> public
> >>>>>>>>>>>>>>> contract and users should
not rely on the output.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Furhtermore, for the actual
uses output, I would replace
> >>>>> "topic:"
> >>>>>>> by
> >>>>>>>>>>>>>>> "extractor class:" to make
the difference obvious.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I am just hoping that people
actually to not rely on
> >>>>> `toString()`
> >>>>>>>>>> what
> >>>>>>>>>>>>>>> defeats the purpose to the
`TopologyDescription` class that
> >>> was
> >>>>>>>>>>>>>>> introduced to avoid the
dependency... (Just a side comment,
> >>> not
> >>>>>>>>>> really
> >>>>>>>>>>>>>>> related to this KIP proposal
itself).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If there are no further
comments in the next days, feel
> free
> >>> to
> >>>>>>>>>> start
> >>>>>>>>>>>>>>> the VOTE and open a PR.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 6/22/18 6:04 PM, Guozhang
Wang wrote:
> >>>>>>>>>>>>>>>> Thanks for writing the
KIP!
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I'm +1 on the proposed
changes over all. One minor
> >>> suggestion:
> >>>>> we
> >>>>>>>>>>>>>> should
> >>>>>>>>>>>>>>>> also mention that the
`Sink#toString` will also be
> updated,
> >>> in
> >>>>> a
> >>>>>>>>>> way
> >>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>> if `topic()` returns
null, use the other call, etc. This
> is
> >>>>>>> because
> >>>>>>>>>>>>>>>> although we do not explicitly
state the following logic as
> >>>>> public
> >>>>>>>>>>>>>>> protocols:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ```
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> "Sink: " + name + "
(topic: " + topic() + ")\n      <-- "
> +
> >>>>>>>>>>>>>>>> nodeNames(predecessors);
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ```
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> There are already some
users that rely on
> >>>>>>>>>>>>>> `topology.describe().toString(
> >>>>>>>>>>>>>>> )`
> >>>>>>>>>>>>>>>> to have runtime checks
on the returned string values, so
> >>>>> changing
> >>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>> means that their app
will break and hence need to make
> code
> >>>>>>>>>> changes.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Guozhang
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Wed, Jun 20, 2018
at 7:20 PM, Nishanth Pradeep <
> >>>>>>>>>>>>>> nishanthp21@gmail.com
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hello Everyone,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I have created a
new KIP to discuss extending
> >>>>>>> TopologyDescription.
> >>>>>>>>>>>>> You
> >>>>>>>>>>>>>>> can
> >>>>>>>>>>>>>>>>> find it here:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>>>>>>>>>>>> 321%3A+Add+method+to+get+TopicNameExtractor+in+
> >>>>>>> TopologyDescription
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Please provide any
feedback that you might have.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>> Nishanth Pradeep
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> -- Guozhang
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>
>


-- 
-- Guozhang

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message