kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nikolay Izhikov <nizhi...@apache.org>
Subject Re: [DISCUSS] KIP-358: Migrate Streams API to Duration instead of long ms times
Date Thu, 23 Aug 2018 13:58:46 GMT
Hello, Mathias.

Thanks for your feedback.

> Thus, it might make sense to keep old and just add new ones? 

As far as I understand, we will keep old methods anyway to prevent public API backward compatibility.
I agree with you, methods that used internally shouldn't be deprecated.

> End users can use the "nicer" new ones, while we can still use the existing ones internally?

> Not sure if it would be possible to keep the old ones without exposing them as public
API?

I think, when we decide to remove methods with `long` from public API, we can do the following:

1. Create an interface like `WindowsInternal`.
2. Change Windows to an interface.
3. Create package-private implementation `WindowsImpl`.

```
	package org.apache.kafka.streams.kstream.internals;
	public interface WindowsInternal {
		public long start();
		public long end();
		//etc...
	}

	package org.apache.kafka.streams.kstream;
	public interface Windows<W extends Window> {
		public Instant start();
		public Instant end();
		//...
	}

	class WindowsImpl<W extends Window> implements Windows<W>, WindowsInternal {

	}
```

So, in public API we will expose only `Windows` interface and internally we can use `WindowsInternal`
But, of course, this will be huge changes in public API.

> Let me know what you think about this.

I think in this KIP we shouldn't deprecate methods, that are used internally.
I changed it, now my proposal is just add new methods.

Please, let me know if anything more need to be done.

В Ср, 22/08/2018 в 17:29 -0700, Matthias J. Sax пишет:
> Thanks a lot for the KIP.
> 
> From my understanding, the idea of the KIP is to improve the public API
> at DSL level. However, not all public methods listed are part of DSL
> level API, but part of runtime API. Those methods are called during
> processing and are on the hot code path. I am not sure, if we want to
> update those methods. We should carefully think about this, and consider
> to keep Long/long type to keep runtime overhead small. Note, that the
> methods I mention are not required to specify a program using the DSL
> and thus is questionable if the DSL API would be improved if we change
> the methods.
> 
> It's unfortunate, that some part of the public API stretch the DSL
> builder part as well as the runtime part...
> 
> This affects the following methods (please double check if I missed any):
> 
>  - Windows#windowsFor()
>  - Window#start()
>  - Window#end()
>  - JoinWindows#windowFor()
>  - SessionWindows#inactivitiyGap()
>  - TimeWindows#windowFor()
>  - UnlimitedWindows#windowFor()
>  - ProcessorContext#schedule()
>  - ReadOnlyWindowStore#fetch() (2x) and #fetchAll()
>  - SessionStore#findSessions() (2x)
> 
> maybe
>  - TimeWindowedDeserializer#getWindowSize() (it's unused atm, but I
> could imagine that it might be use on the hot code path in the furture)
> 
> So methods have "dual" use and might be called externally and internally:
> 
>  - Window#start()
>  - Window#end()
>  - ReadOnlyWindowStore#fetch() (2x) and #fetchAll()
>  - SessionStore#findSessions() (2x)
> 
> Thus, it might make sense to keep old and just add new ones? End users
> can use the "nicer" new ones, while we can still use the existing ones
> internally? Not sure if it would be possible to keep the old ones
> without exposing them as public API?
> 
> Let me know what you think about this.
> 
> 
> -Matthias
> 
> 
> 
> On 8/21/18 11:41 PM, Nikolay Izhikov wrote:
> > Dear, commiters.
> > 
> > Please, pay attention to this KIP and share your opinion.
> > 
> > В Вт, 21/08/2018 в 11:14 -0500, John Roesler пишет:
> > > I'll solicit more reviews. Let's get at least one committer to chime in
> > > before we start a vote (since we need their approval anyway).
> > > -John
> > > 
> > > On Mon, Aug 20, 2018 at 12:39 PM Nikolay Izhikov <nizhikov@apache.org>
> > > wrote:
> > > 
> > > > Hello, Ted.
> > > > 
> > > > Thanks for the comment.
> > > > 
> > > > I've edit KIP and change proposal to `windowSize`.
> > > > 
> > > > Guys, any other comments?
> > > > 
> > > > 
> > > > В Вс, 19/08/2018 в 14:57 -0700, Ted Yu пишет:
> > > > > bq. // or just Duration windowSize();
> > > > > 
> > > > > +1 to the above choice.
> > > > > The duration is obvious from the return type. For getter methods,
we
> > > > 
> > > > don't
> > > > > use get as prefix (as least for new code).
> > > > > 
> > > > > Cheers
> > > > > 
> > > > > On Sun, Aug 19, 2018 at 8:03 AM Nikolay Izhikov <nizhikov@apache.org>
> > > > 
> > > > wrote:
> > > > > 
> > > > > > Hello, John.
> > > > > > 
> > > > > > Thank you very much for your feedback!
> > > > > > I've addressed all your comments.
> > > > > > Please, see my answers and let my know is anything in KIP [1]
needs to
> > > > 
> > > > be
> > > > > > improved.
> > > > > > 
> > > > > > > The correct choice is actually "Instant", not> "LocalDateTime"
> > > > > > 
> > > > > > I've changed the methods proposed in KIP [1] to use Instant.
> > > > > > 
> > > > > > > I noticed some recent APIs are> missing (see KIP-328)
> > > > > > > those APIs were just added and have never been released...
you can
> > > > 
> > > > just
> > > > > > 
> > > > > > replace them.
> > > > > > 
> > > > > > I've added new methods to KIP [1].
> > > > > > Not released methods marked for remove.
> > > > > > 
> > > > > > > any existing method that's already deprecated, don't bother
> > > > > > 
> > > > > > transitioning it to Duration.
> > > > > > 
> > > > > > Fixed.
> > > > > > 
> > > > > > > IllegalArgumentException... we should plan to mention this
in the
> > > > > > 
> > > > > > javadoc for those methods.
> > > > > > 
> > > > > > Got it.
> > > > > > 
> > > > > > > In Stores, windowSize and segmentInterval should also be
durations.
> > > > > > 
> > > > > > Fixed.
> > > > > > 
> > > > > > > StreamsMetrics, recordLatency ... this one is better left
alone.
> > > > > > 
> > > > > > OK. I removed this method from KIP [1].
> > > > > > 
> > > > > > Two more questions question about implementation:
> > > > > > 
> > > > > > 1. We have serveral methods without parameters.
> > > > > > In java we can't have two methods with parameters with the same
name.
> > > > > > It wouldn't compile.
> > > > > > So we have to rename new methods. Please, see suggested names
and share
> > > > > > your thoughts:
> > > > > > 
> > > > > > Windows {
> > > > > >     long size() -> Duration windowSize();
> > > > > > }
> > > > > > 
> > > > > > Window {
> > > > > >     long start() -> Instant startTime();
> > > > > >     long end() -> Instant endTime();
> > > > > > }
> > > > > > 
> > > > > > SessionWindows {
> > > > > >     long inactivityGap() -> Duration inactivityGapDuration();
> > > > > > }
> > > > > > 
> > > > > > TimeWindowedDeserializer {
> > > > > >     Long getWindowSize() -> Duration getWindowSizeDuration();
// or
> > > > 
> > > > just
> > > > > > Duration windowSize();
> > > > > > }
> > > > > > 
> > > > > > SessionBytesStoreSupplier {
> > > > > >     long retentionPeriod() -> Duration retentionPeriodDuration();
> > > > > > }
> > > > > > 
> > > > > > WindowBytesStoreSupplier {
> > > > > >     long windowSize() -> Duration windowSizeDuration();
> > > > > >     long retentionPeriod() -> Duration retentionPeriodDuration();
> > > > > > }
> > > > > > 
> > > > > > 2. Do we want to use Duration and Instant inside API implementations?
> > > > > > 
> > > > > > IGNITE-7277: "Durations potentially worsen memory pressure and
gc
> > > > > > performance, so internally, we will still use longMs as the
> > > > 
> > > > representation."
> > > > > > IGNITE-7222: Duration used to store retention.
> > > > > > 
> > > > > > [1]
> > > > > > 
> > > > 
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times
> > > > > > [2]
> > > > > > 
> > > > 
> > > > https://github.com/apache/kafka/commit/b3771ba22acad7870e38ff7f58820c5b50946787#diff-47289575d3e3e2449f27b3a7b6788e1aR64
> > > > > > 
> > > > > > В Пт, 17/08/2018 в 14:46 -0500, John Roesler пишет:
> > > > > > > Hi Nikolay,
> > > > > > > 
> > > > > > > Thanks for this very nice KIP!
> > > > > > > 
> > > > > > > To answer your questions:
> > > > > > > 1. Correct, we should not delete existing methods that
have been
> > > > > > 
> > > > > > released,
> > > > > > > but ...
> > > > > > > 
> > > > > > > 2. Yes, we should deprecate the 'long' variants so that
we can drop
> > > > 
> > > > them
> > > > > > > later on. Personally, I like to mention which version deprecated
the
> > > > > > 
> > > > > > method
> > > > > > > so everyone can see later on how long it's been deprecated,
but this
> > > > 
> > > > may
> > > > > > 
> > > > > > be
> > > > > > > controversial, so let's let other weigh in.
> > > > > > > 
> > > > > > > 3. I think you're asking whether it's appropriate to drop
the "Ms"
> > > > > > 
> > > > > > suffix,
> > > > > > > and I think yes. So "long inactivityGapMs" would become
"Duration
> > > > > > > inactivityGap".
> > > > > > > In the places where the parameter's name is just "duration",
I think
> > > > 
> > > > we
> > > > > > 
> > > > > > can
> > > > > > > pick something more descriptive (I realize it was already
> > > > 
> > > > "durationMs";
> > > > > > > this is just a good time to improve it).
> > > > > > > Also, you're correct that we shouldn't use a Duration to
represent a
> > > > > > 
> > > > > > moment
> > > > > > > in time, like "startTime". The correct choice is actually
"Instant",
> > > > 
> > > > not
> > > > > > > "LocalDateTime", though.
> > > > > > > 
> > > > > > 
> > > > > > 
> > > > 
> > > > https://stackoverflow.com/questions/32437550/whats-the-difference-between-instant-and-localdatetime
> > > > > > > explains why.
> > > > > > > 
> > > > > > > I also had a few notes on the KIP itself:
> > > > > > > 4. You might want to pull trunk again. I noticed some recent
APIs are
> > > > > > > missing (see KIP-328).
> > > > > > > 
> > > > > > > 5. Speaking of KIP-328: those APIs were just added and
have never
> > > > 
> > > > been
> > > > > > > released, so there's no need to deprecate the methods,
you can just
> > > > > > 
> > > > > > replace
> > > > > > > them.
> > > > > > > 
> > > > > > > 6. For any existing method that's already deprecated, don't
bother
> > > > > > > transitioning it to Duration. I think the examples I noticed
were
> > > > > > > deprecated in KIP-328, so you'll see what I'm talking about
when you
> > > > 
> > > > pull
> > > > > > > trunk again.
> > > > > > > 
> > > > > > > 7. Any method taking a Duration argument may throw an
> > > > > > > IllegalArgumentException (we choose to convert ArithmeticException
to
> > > > > > > IllegalArgumentException, as I mentioned in the Jira ticket).
We
> > > > 
> > > > don't
> > > > > > 
> > > > > > need
> > > > > > > a "throws" declaration, but we should plan to mention this
in the
> > > > 
> > > > javadoc
> > > > > > > for those methods.
> > > > > > > 
> > > > > > > 8. In Stores, windowSize and segmentInterval should also
be
> > > > 
> > > > durations.
> > > > > > > 
> > > > > > > 9. In StreamsMetrics, recordLatency could be just a Duration,
but I
> > > > > > > actually think this one is better left alone. IMO, it's
more effort
> > > > 
> > > > for
> > > > > > > little gain to require users to construct a Duration before
they
> > > > 
> > > > call the
> > > > > > > method, since they vary likely call System.currentTimeNanos
before
> > > > 
> > > > and
> > > > > > > after the code in question.
> > > > > > > 
> > > > > > > These are quite a few notes, but they're all minor. Overall
the KIP
> > > > 
> > > > looks
> > > > > > > really good to me. Thanks for picking this up!
> > > > > > > -John
> > > > > > > 
> > > > > > > On Thu, Aug 16, 2018 at 9:55 AM Nikolay Izhikov <nizhikov@apache.org
> > > > > > 
> > > > > > wrote:
> > > > > > > 
> > > > > > > > Hello, Kafka developers.
> > > > > > > > 
> > > > > > > > I would like to start a discussion of KIP-358 [1].
> > > > > > > > It based on a ticket KAFKA-7277 [2].
> > > > > > > > 
> > > > > > > > I crawled through Stream API and made my suggestions
for API
> > > > 
> > > > changes.
> > > > > > > > 
> > > > > > > > I have several questions about changes.
> > > > > > > > Please, share your comments:
> > > > > > > > 
> > > > > > > > 1. I propose do not remove existing API methods with
long ms
> > > > > > 
> > > > > > parameters.
> > > > > > > > Is it correct?
> > > > > > > > 
> > > > > > > > 2. Should we mark existing methods as deprecated?
> > > > > > > > 
> > > > > > > > 3. Suggested changes in ticket description are `long
durationMs` to
> > > > > > > > `Duration duration` and similar.
> > > > > > > > I suggest to change 'long startTimeMs` to `LocalDateTime
startTime`
> > > > > > 
> > > > > > also.
> > > > > > > > Should we do it?
> > > > > > > > 
> > > > > > > > Please, note, it very first KIP for me, so tell me
if I miss
> > > > 
> > > > something
> > > > > > > > obvious for experienced Kafka developers.
> > > > > > > > 
> > > > > > > > [1]
> > > > > > > > 
> > > > > > 
> > > > > > 
> > > > 
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times
> > > > > > > > [2] https://issues.apache.org/jira/browse/KAFKA-7277
> 
> 
Mime
View raw message