kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Guozhang Wang <wangg...@gmail.com>
Subject Re: [DISCUSS] KIP-439: Deprecate Interface WindowStoreIterator
Date Tue, 07 May 2019 23:19:54 GMT
Thanks John.

I think I'm convinced about not collapsing the metrics measuring for
various function calls. Regarding the possible solution I'm a bit inclined
to just distinguish on the metric names directly over on tags: since our
current metrics naming are a tad messed up anyways, fixing them in one shot
as a breaking change sounds reasonable to me.


Guozhang


On Mon, May 6, 2019 at 9:14 AM John Roesler <john@confluent.io> wrote:

> Thanks all (or range? ;) ) for the discussion. Good points all around.
>
> Although I see the allure of naming the metrics the same as the things
> they're measuring, it seems not to be perfect. Seconding Matthias's latter
> thought, I think it's likely you'd want to measure the method calls
> independently, since the different range variants would have wildly
> different characteristics, which could then lead you to want to orient the
> storage differently to support particular use cases.
>
> Pointing out some structural characteristics (I know you all know this
> stuff, I'm just constructing a table for analysis):
> * Java supports method-name overloading. *Different* methods can have the
> same names, distinguished by arg lists; it doesn't change the fact that
> they are different methods.
> * Metrics does not support metric-name overloading, but metric names do
> have some structure we could exploit, if you consider the tags.
>
> It seems to me that there's actually more domain mismatch if we just have
> one metric named "range", since (e.g., in the SessionStore proposal above)
> the Java API has *four* methods named "range".
>
> Two potential solutions I see:
> * hierarchical metric names: "range-single-key-all-time",
> "range-key-range-all-time", "range-single-key-time-range",
> "range-key-range-time-range", maybe with better names... I'm not the best
> at this stuff. Hopefully, you see the point, though... they all start with
> "range", which provides the association to the method, and all have a
> suffix which identifies the overload being measured.
> * tagged metric names: "range" {"variant": "single-key-all-time"}, "range"
> {"variant": "key-range-all-time"}, "range" {"variant":
> "single-key-time-range"}, "range" {"variant": "key-range-time-range"} . Or
> you could even split the tags up semantically, but my instinct says that
> that would just make it harder to digest the metrics later on.
>
> Just some ideas.
> -John
>
> On Fri, Apr 26, 2019 at 3:51 AM Matthias J. Sax <matthias@confluent.io>
> wrote:
>
> > Thanks for the input Guozhang. I was not aware of those dependencies.
> >
> > It might be good to align this KIP with the metrics cleanup. Not sure
> > atm, if we should use different metric names for different overloads,
> > even if those have the same method name?
> >
> > If we rename all method to `range()` and use the same metric name for
> > all, one could argue that this is still fine, because the metric
> > collects how often a range query is executed (regardless of the range
> > itself).
> >
> > On the other hand, this would basically be a "roll up". It could still
> > be valuable to distinguish between single-key-time-range,
> > key-range-time-range, and all-range queries. Users could still aggregate
> > those later if they are not interested in the details, while it's not
> > possible for user to split a pre-aggregated metric into it's component.
> >
> >
> > Input from others might be helpful here, too.
> >
> >
> > -Matthias
> >
> > On 4/11/19 6:00 PM, Guozhang Wang wrote:
> > > While working at KIP-444 (https://github.com/apache/kafka/pull/6498) I
> > > realized there are a bunch of issues on metric names v.s. function
> names,
> > > e.g. some function named `fetchAll` are actually measure with `fetch`,
> > etc.
> > > So in that KIP I proposed to make the function name aligned with
> metrics
> > > name. So suppose we rename the functions from `fetch` to `range` I'd
> > > suggest we make this change as part of KIP-444 as well. Note that it
> > means
> > > different functions with the same name `range` will be measured under a
> > > single metric then.
> > >
> > > But still for function named `all` it will be measured under a separate
> > > metric named `all`, so I'm just clarifying with you if that's the
> > intention.
> > >
> > >
> > > Guozhang
> > >
> > > On Thu, Apr 11, 2019 at 2:04 PM Matthias J. Sax <matthias@confluent.io
> >
> > > wrote:
> > >
> > >> I did not see a reason to rename `all()` to `range()`. `all()` does
> not
> > >> take any parameters to limit a range and is a good name IMHO. But I am
> > >> not married to keep `all()` and if we think we should rename it, too,
> I
> > >> am fine with it.
> > >>
> > >> Not sure what connection you make to metrics though. Can you
> elaborate?
> > >>
> > >>
> > >> Would be interested to hear others opinions on this, too.
> > >>
> > >>
> > >> -Matthias
> > >>
> > >> On 4/11/19 8:38 AM, Guozhang Wang wrote:
> > >>> I like the renaming, since it also aligns with our metrics cleanup
> > >>> (KIP-444) which touches upon the store level metrics as well.
> > >>>
> > >>> One question: you seems still suggesting to keep "all" with the
> current
> > >>> name (and also using a separate metric for it), what's the difference
> > >>> between this one and other "range" functions?
> > >>>
> > >>>
> > >>> Guozhang
> > >>>
> > >>> On Thu, Apr 11, 2019 at 2:26 AM Matthias J. Sax <
> matthias@confluent.io
> > >
> > >>> wrote:
> > >>>
> > >>>> Thanks for the input.
> > >>>>
> > >>>>>> Just to clarify the naming conflicts is between the newly
added
> > >> function
> > >>>>>> and the old functions that we want to deprecate / remove
right?
> The
> > >>>>
> > >>>> Yes, the conflict is just fort the existing `fetch()` methods for
> > which
> > >>>> we want to change the return type.
> > >>>>
> > >>>> IMHO, we should not make a breaking change in a minor release.
Thus,
> > we
> > >>>> could either only deprecate those fetch methods that return
> > >>>> `WindowStoreIterator` and do a "clean cut" in 3.0.
> > >>>>
> > >>>> Or we, follow the renaming path. No get a clean renaming, we need
to
> > >>>> consider all methods that are called "fetch":
> > >>>>
> > >>>> ReadOnlyWindowStore:
> > >>>>
> > >>>>> V fetch(K, long)
> > >>>>> WindowStoreIterator<V> fetch(K, Instant, Instant)
> > >>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, Instant,
Instant)
> > >>>>
> > >>>> WindowStore:
> > >>>>
> > >>>>> WindowStoreIterator<V> fetch(K, long, long)
> > >>>>> WindowStoreIterator<V> fetch(K, Instant, Instant)>
> > >>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, long,
long)
> > >>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, Instant,
Instant)
> > >>>>
> > >>>> There is also fetchAll(long, long) and fetchAll(Instant, Instant)
> that
> > >>>> fetch over all keys.
> > >>>>
> > >>>> Maybe we could rename `V fetch(K, long)` to `V get(K, long)` and
> > rename
> > >>>> all `fetch()/fetchAll()` to `range()`? There is actually no reason
> to
> > >>>> have range()/rangeAll().
> > >>>>
> > >>>> If we do this, we might consider to rename methods for SessionStore,
> > >>>> too. There is
> > >>>>
> > >>>>> ReadOnlySessionStore#fetch(K)
> > >>>>> ReadOnlySessionStore#fetch(K, K)
> > >>>>> SessionStore#findSessions(K, long, long)
> > >>>>> SessionStore#findSessions(K, K, long, long)
> > >>>>> SessionStore#fetchSession(K, long, long);
> > >>>>
> > >>>> Consistent renaming might be:
> > >>>>
> > >>>> ReadOnlySessionStore#range(K)
> > >>>> ReadOnlySessionStore#range(K, K)
> > >>>> SessionStore#range(K, long, long)
> > >>>> SessionStore#range(K, K, long, long)
> > >>>> SessionStore#get(K, long, long);
> > >>>>
> > >>>>
> > >>>> Not sure if extensive renaming might have too big of an impact.
> > However,
> > >>>> `range()` and `get()` might actually be better names than `fetch()`,
> > >>>> thus, it might also provide some additional value.
> > >>>>
> > >>>> Thoughts?
> > >>>>
> > >>>>
> > >>>> -Matthias
> > >>>>
> > >>>>
> > >>>> On 3/27/19 10:19 AM, Guozhang Wang wrote:
> > >>>>> Hello Matthias,
> > >>>>>
> > >>>>> Just to clarify the naming conflicts is between the newly added
> > >> function
> > >>>>> and the old functions that we want to deprecate / remove right?
The
> > >>>>> existing ones have different signatures with parameters so
that
> they
> > >>>> should
> > >>>>> not have conflicts.
> > >>>>>
> > >>>>> I was thinking about just make the change directly without
> > deprecating
> > >>>>> existing ones which would require users of 2.3 to make code
changes
> > --
> > >>>> this
> > >>>>> code change looks reasonably straight-forward to me and not
much
> > worth
> > >>>>> deferring to later when the deprecated ones are removed.
> > >>>>>
> > >>>>> On the other hand, just deprecating "WindowIterator" without
add
> new
> > >>>>> functions seems not very useful for users either since it is
only
> > used
> > >> as
> > >>>>> an indicator but users cannot make code changes during this
phase
> > >>>> anyways,
> > >>>>> so it is still a `one-cut` deal when we eventually remove the
> > >> deprecated
> > >>>>> ones and add the new one.
> > >>>>>
> > >>>>> Hence I'm slightly inclining to trade compatibility and replace
it
> > with
> > >>>> new
> > >>>>> functions in one release, but if people have a good idea of
the
> > >> renaming
> > >>>>> approach (I do not have a good one on top of my head :) I can
also
> be
> > >>>>> convinced that way.
> > >>>>>
> > >>>>> Guozhang
> > >>>>>
> > >>>>>
> > >>>>> On Mon, Mar 11, 2019 at 10:41 AM Matthias J. Sax <
> > >> matthias@confluent.io>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> I am open to change the return type to
> > >>>>>>
> > >>>>>> KeyValueIterator<Windowed<K>, V>
> > >>>>>>
> > >>>>>> However, this requires to rename
> > >>>>>>
> > >>>>>> #fetch(K key, long startTimestamp, long endTimestamp)
> > >>>>>> #fetch(K key, Instant startTimestamp, Instant endTimestamp)
> > >>>>>>
> > >>>>>> to avoid naming conflicts.
> > >>>>>>
> > >>>>>> What new name would you suggest? The existing methods are
called
> > >>>>>> `fetch()`, `fetchAll()`, `all()`, `put()`.
> > >>>>>>
> > >>>>>> While I think it would be good to get fully aligned return
types,
> I
> > am
> > >>>>>> not sure how we can get aligned method names (without renaming
all
> > of
> > >>>>>> them...)? If we think it's worth to rename all to get this
cleaned
> > >> up, I
> > >>>>>> am no opposed.
> > >>>>>>
> > >>>>>>
> > >>>>>> Thoughts?
> > >>>>>>
> > >>>>>>
> > >>>>>> -Matthias
> > >>>>>>
> > >>>>>>
> > >>>>>> On 3/11/19 10:27 AM, Guozhang Wang wrote:
> > >>>>>>> I was thinking about changing the return type even,
to
> > >>>>>>> `KeyValueIterator<Windowed<K>, V>` since
it is confusing to users
> > >> about
> > >>>>>> the
> > >>>>>>> key typed `Long` (Streams javadoc today did not explain
it
> clearly
> > >>>>>> either),
> > >>>>>>> note it is not backward compatible at all.
> > >>>>>>>
> > >>>>>>> Personally I'd prefer to just deprecate the API and
new new ones
> > that
> > >>>>>>> return `KeyValueIterator<Windowed<K>, V>`
directly, but if most
> > >> people
> > >>>>>> felt
> > >>>>>>> it is too intrusive for compatibility I can be convinced
with
> > >>>>>>> `KeyValueIterator<Long, V>` as well.
> > >>>>>>>
> > >>>>>>> Guozhang
> > >>>>>>>
> > >>>>>>> On Mon, Mar 11, 2019 at 10:17 AM Sophie Blee-Goldman
<
> > >>>>>> sophie@confluent.io>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> I remember thinking this while working on window
stores, am
> > >> definitely
> > >>>>>> for
> > >>>>>>>> it.
> > >>>>>>>>
> > >>>>>>>> On Mon, Mar 11, 2019 at 9:20 AM John Roesler <john@confluent.io
> >
> > >>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Sounds great to me. Thanks, Matthias!
> > >>>>>>>>> -John
> > >>>>>>>>>
> > >>>>>>>>> On Sun, Mar 10, 2019 at 11:58 PM Matthias J.
Sax <
> > >>>>>> matthias@confluent.io>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi,
> > >>>>>>>>>>
> > >>>>>>>>>> I would like to propose KIP-439 to deprecate
interface
> > >>>>>>>>>> `WindowStoreIterator`.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-439%3A+Deprecate+Interface+WindowStoreIterator
> > >>>>>>>>>>
> > >>>>>>>>>> Looking forward to your feedback.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> -Matthias
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >
> >
> >
>


-- 
-- Guozhang

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message