kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jukka Karvanen <jukka.karva...@jukinimi.com>
Subject Re: [DISCUSS] KIP-470: TopologyTestDriver test input and output usability improvements
Date Tue, 02 Jul 2019 09:03:03 GMT
Hi Matthias,

Generally I think using Instant and Duration make the test more readable
and that's why I modified KIP based on your suggestion.
Now a lot of code use time with long or Long and that make the change more
complicated.

What I tried to say about the migration is the lines without timestamp or
if long timestamp is supported can be migrated mainly with search &
recplace:

testDriver.pipeInput(recordFactory.create(WordCountLambdaExample.inputTopic,
nullKey, "Hello", 1L));

->

*inputTopic*.pipeInput(nullKey, "Hello", 1L);

If long is not supported as timestamp, the same is not so easy:

testDriver.pipeInput(recordFactory.create(WordCountLambdaExample.inputTopic,
nullKey, "Hello", 1L));

->

*inputTopic1*.pipeInput(nullKey, "Hello", Instant.ofEpochMilli(1L));

Also if you need to convert arbitrary long timestamps to proper time
Instants, it require you need to understand the logic of the test. So
mechanical search & replace is not possible.


I see there are several alternatives for long vs Instant / Duration:

1. All times as long/Long like in this version:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=119550011

(startTimestampMs, autoAdvanceMs as parameter of  createInputTopic
instead of configureTiming)

2. Auto timestamping configured with Instant and Duration, pipeInput
and TestRecord with long:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120722523


3. (CURRENT) Auto timestamping configured with Instant and Duration,
pipeInput and TestRecord with Instant, version with long deprecated:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTestDriver+test+input+and+output+usability+improvements


4. Auto timestamping configured with Instant and Duration, pipeInput
and TestRecord with Instant and long parallel (with long not
deprecated):

5. Auto timestamping configured with Instant and Duration, pipeInput
and TestRecord with Instant only

I hope to get feedback.

My own preference currently is alternative 3. or 4.


If somebody want to test, there is a implementation of this version
available in Github:

https://github.com/jukkakarvanen/kafka-streams-test-topics

which can be used directly from public Maven repository:

    <dependency>
        <groupId>com.github.jukkakarvanen</groupId>
        <artifactId>kafka-streams-test-topics</artifactId>
        <version>0.0.1-beta3</version>
        <scope>test</scope>
    </dependency>

Also is this approach in KIP-470 preferred over KIP-456, so can we close it:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-456%3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver

Jukka

.


pe 28. kesäk. 2019 klo 1.10 Matthias J. Sax (matthias@confluent.io)
kirjoitti:

> Thanks Jukka!
>
> The idea to use `Instant/Duration` was a proposal. If we think it's not
> a good one, we could still stay with `long`. Because `ProducerRecord`
> and `ConsumerRecord` are both based on `long,` it might make sense to
> keep `long`?
>
> > The result of converting millis to Instant directly generates
> >> rather non readable test code and changing from long to Instant
> correctly
> >> require understand what is the case it is testing.
>
> This might be a good indicator the using `Instant/Duration` might not be
> a good idea.
>
> Would be nice to get feedback from others.
>
> About adding new methods that we deprecate immediately: I don't think we
> should do this. IMHO, there are two kind of users, one that immediately
> rewrite their code to move off deprecated methods. Those won't use the
> new+deprecated ones anyway. Other uses migrate their code slowly and
> would just not rewrite their code at all, and thus also not use the
> new+deprecated methods.
>
> > Checking my own tests I was able to migrate the most of my code with
> > search&replace without further thinking about the logic to this new
> > approach. The result of converting millis to Instant directly generates
> > rather non readable test code and changing from long to Instant correctly
> > require understand what is the case it is testing.
>
> Not sure if I can follow here. You first say, you could easily migrate
> your code, but than you say it was not easily possible? Can you clarify
> your experience upgrading your test code?
>
>
> -Matthias
>
>
> On 6/27/19 12:21 AM, Jukka Karvanen wrote:
> > Hi,
> >
> >>> (4) Should we switch from `long` for timestamps to `Instant` and
> > `Duration` ?
> >> This version startTimestamp is Instant and autoAdvance Duration in
> > Initialization and with manual configured collection pipe methods.
> >> Now timestamp of TestRecord is still Long and similarly single record
> > pipeInput still has long as parameter.
> >> Should these also converted to to Instant type?
> >> Should there be both long and Instant parallel?
> >> I expect there are existing test codebase which would be easier to
> migrate
> > if long could be still used.
> > Now added Instant version to TestRecord and pipeInput method.
> >
> > Checking my own tests I was able to migrate the most of my code with
> > search&replace without further thinking about the logic to this new
> > approach. The result of converting millis to Instant directly generates
> > rather non readable test code and changing from long to Instant correctly
> > require understand what is the case it is testing.
> >
> > That is why version with long left still as deprecated for easier
> migration
> > for existing test.
> > Also TopologyTestDriver constructor and advanceWallClockTime  method
> > modified with same approach.
> >
> > Jukka
> >
> >
> > ma 24. kesäk. 2019 klo 16.47 Bill Bejeck (bbejeck@gmail.com) kirjoitti:
> >
> >> Hi Jukka
> >>
> >>> These topic objects are only interfacing TopologyTestDriver, not
> >> affecting
> >>> the internal functionality of it. In my plan the internal data
> structures
> >>> are using those Producer/ConsumerRecords as earlier. That way I don't
> see
> >>> how those could be affected.
> >>
> >> I mistakenly thought the KIP was proposing to completely remove
> >> Producer/ConsumerRecords in favor of TestRecord.  But taking another
> quick
> >> look I can see the plan for using the TestRecord objects.  Thanks for
> the
> >> clarification.
> >>
> >> -Bill
> >>
> >> On Sat, Jun 22, 2019 at 2:26 AM Jukka Karvanen <
> >> jukka.karvanen@jukinimi.com>
> >> wrote:
> >>
> >>> Hi All,
> >>> Hi Matthias,
> >>>
> >>>> (1) It's a little confusing that you list all method (existing,
> proposed
> >>>> to deprecate, and new one) of `TopologyTestDriver` in the KIP. Maybe
> >>>> only list the ones you propose to deprecate and the new ones you want
> to
> >>>> add?
> >>>
> >>> Ok. Unmodified methods removed.
> >>>
> >>>> (2) `TopologyTestDriver#createInputTopic`: might it be worth to add
> >>>> overload to initialize the timetamp and auto-advance feature directly?
> >>>> Otherwise, uses always need to call `configureTiming` as an extra
> call?
> >>>
> >>> Added with Instant and Duration parameters.
> >>>
> >>>> (3) `TestInputTopic#configureTiming()`: maybe rename to
> >>> `reconfigureTiming()` ?
> >>>
> >>> I removed this method when we have now initialization in constructor.
> >>> You can recreate TestInputTopic if needing to reconfigure timing.
> >>>
> >>>
> >>>> (4) Should we switch from `long` for timestamps to `Instant` and
> >>> `Duration` ?
> >>> This version startTimestamp is Instant and autoAdvance Duration in
> >>> Initialization and with manual configured collection pipe methods.
> >>> Now timestamp of TestRecord is still Long and similarly single record
> >>> pipeInput still has long as parameter.
> >>> Should these also converted to to Instant type?
> >>> Should there be both long and Instant parallel?
> >>> I expect there are existing test codebase which would be easier to
> >> migrate
> >>> if long could be still used.
> >>>
> >>> Also should advanceWallClockTime  in TopologyTestDriver changed(or
> added
> >>> alternative) for Duration parameter.
> >>>
> >>>
> >>>> (5) Why do we have redundant getters? Or set with `getX()` and one
> >>> set without `get`-prefix?
> >>>
> >>> The methods without get-prefix are for compatibility with
> >> ProducerRecord /
> >>> ConsumerRecord and I expect would make migration to TestRecord easier.
> >>> Standard getters in TestRecord enable writing test ignoring selected
> >> fields
> >>> with hamcrest like this:
> >>>
> >>> assertThat(outputTopic.readRecord(), allOf(
> >>>         hasProperty("key", equalTo(1L)),
> >>>         hasProperty("value", equalTo("Hello")),
> >>>         hasProperty("headers", equalTo(headers))));
> >>>
> >>>
> >>> That's why I have currently both methods.
> >>>
> >>> Jukka
> >>>
> >>>
> >>> pe 21. kesäk. 2019 klo 22.20 Matthias J. Sax (matthias@confluent.io)
> >>> kirjoitti:
> >>>
> >>>> Thanks for the KIP. The idea to add InputTopic and OutputTopic
> >>>> abstractions is really neat!
> >>>>
> >>>>
> >>>> Couple of minor comment:
> >>>>
> >>>> (1) It's a little confusing that you list all method (existing,
> >> proposed
> >>>> to deprecate, and new one) of `TopologyTestDriver` in the KIP. Maybe
> >>>> only list the ones you propose to deprecate and the new ones you want
> >> to
> >>>> add?
> >>>>
> >>>> (Or mark all existing methods clearly -- atm, I need to got back to
> the
> >>>> code to read the KIP and to extract what changes are proposed).
> >>>>
> >>>>
> >>>> (2) `TopologyTestDriver#createInputTopic`: might it be worth to add
> >>>> overload to initialize the timetamp and auto-advance feature directly?
> >>>> Otherwise, uses always need to call `configureTiming` as an extra
> call?
> >>>>
> >>>>
> >>>> (3) `TestInputTopic#configureTiming()`: maybe rename to
> >>>> `reconfigureTiming()` ?
> >>>>
> >>>>
> >>>> (4) Should we switch from `long` for timestamps to `Instant` and
> >>>> `Duration` ?
> >>>>
> >>>>
> >>>> (5) Why do we have redundant getters? Or set with `getX()` and one set
> >>>> without `get`-prefix?
> >>>>
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On 6/21/19 10:57 AM, Bill Bejeck wrote:
> >>>>> Jukka,
> >>>>>
> >>>>> Thanks for the KIP. I like the changes overall.
> >>>>> One thing I wanted to confirm, and this may be me being paranoid, but
> >>>> will
> >>>>> the changes for input/output topic affect how the TopologyTestDriver
> >>>> works
> >>>>> with internal topics when there are sub-topologies created?
> >>>>>
> >>>>> On Fri, Jun 21, 2019 at 12:05 PM Guozhang Wang <wangguoz@gmail.com>
> >>>> wrote:
> >>>>>
> >>>>>> 1) Got it, could you list this class along with all its functions in
> >>> the
> >>>>>> proposed public APIs as well?
> >>>>>>
> >>>>>> 2) Ack, thanks!
> >>>>>>
> >>>>>> On Thu, Jun 20, 2019 at 11:27 PM Jukka Karvanen <
> >>>>>> jukka.karvanen@jukinimi.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi  Guozhang,
> >>>>>>>
> >>>>>>> 1) This TestRecord is new class in my proposal. So it is a
> >> simplified
> >>>>>>> version of ProducerRecord and ConsumerRecord containing only the
> >>> fields
> >>>>>>> needed to test record content.
> >>>>>>>
> >>>>>>> 2)
> >>>>>>> public final <K, V> TestInputTopic<K, V> createInputTopic(final
> >>> String
> >>>>>>> topicName, final Serde<K> keySerde, final Serde<V> valueSerde);
> >>>>>>> public final <K, V> TestOutputTopic<K, V> createOutputTopic(final
> >>>> String
> >>>>>>> topicName, final Serde<K> keySerde, final Serde<V> valueSerde);
> >>>>>>> The purpose is to create separate object for each input and output
> >>>> topic
> >>>>>>> you are using. The topic name is given to createInput/OutputTopic
> >>> when
> >>>>>>> initialize topic object.
> >>>>>>>
> >>>>>>> For example:
> >>>>>>>
> >>>>>>> final TestInputTopic<Long, String> inputTopic1 =
> >>>>>>> testDriver.createInputTopic( INPUT_TOPIC, longSerde, stringSerde);
> >>>>>>> final TestInputTopic<Long, String> inputTopic2 =
> >>>>>>> testDriver.createInputTopic( INPUT_TOPIC_MAP, longSerde,
> >>> stringSerde);
> >>>>>>> final TestOutputTopic<Long, String> outputTopic1 =
> >>>>>>> testDriver.createOutputTopic(OUTPUT_TOPIC, longSerde, stringSerde);
> >>>>>>> final TestOutputTopic<String, Long> outputTopic2 =
> >>>>>>> testDriver.createOutputTopic(OUTPUT_TOPIC_MAP, stringSerde,
> >>>>>>> longSerde);
> >>>>>>> inputTopic1.pipeInput(1L, "Hello");
> >>>>>>> assertThat(outputTopic1.readKeyValue(), equalTo(new KeyValue<>(1L,
> >>>>>>> "Hello")));
> >>>>>>> assertThat(outputTopic2.readKeyValue(), equalTo(new
> >>> KeyValue<>("Hello",
> >>>>>>> 1L)));
> >>>>>>> inputTopic2.pipeInput(1L, "Hello");
> >>>>>>>
> >>>>>>>
> >>>>>>> Jukka
> >>>>>>>
> >>>>>>> to 20. kesäk. 2019 klo 23.52 Guozhang Wang (wangguoz@gmail.com)
> >>>>>> kirjoitti:
> >>>>>>>
> >>>>>>>> Hello Jukka,
> >>>>>>>>
> >>>>>>>> Thanks for writing the KIP, I have a couple of quick questions:
> >>>>>>>>
> >>>>>>>> 1) Is "TestRecord" an existing class that you propose to
> >> piggy-back
> >>>> on?
> >>>>>>>> Right now we have a scala TestRecord case class but I doubt that
> >> was
> >>>>>> your
> >>>>>>>> proposal, or are you proposing to add a new Java class?
> >>>>>>>>
> >>>>>>>> 2) Would the new API only allow a single input / output topic with
> >>>>>>>> `createInput/OutputTopic`? If not, when we call pipeInput how to
> >>>>>>> determine
> >>>>>>>> which topic this record should be pipe to?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Guozhang
> >>>>>>>>
> >>>>>>>> On Mon, Jun 17, 2019 at 1:34 PM John Roesler <john@confluent.io>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Woah, I wasn't aware of that Hamcrest test style. Awesome!
> >>>>>>>>>
> >>>>>>>>> Thanks for the updates. I look forward to hearing what others
> >>> think.
> >>>>>>>>>
> >>>>>>>>> -John
> >>>>>>>>>
> >>>>>>>>> On Mon, Jun 17, 2019 at 4:12 AM Jukka Karvanen
> >>>>>>>>> <jukka.karvanen@jukinimi.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Wiki page updated:
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTestDriver+test+input+and+output+usability+improvements
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> ClientRecord removed and replaced with TestRecord in method
> >> calls.
> >>>>>>>>>> TestRecordFactory removed (time tracking functionality to be
> >>>>>> included
> >>>>>>>> to
> >>>>>>>>>> TestInputTopic)
> >>>>>>>>>> OutputVerifier deprecated
> >>>>>>>>>> TestRecord topic removed and getters added
> >>>>>>>>>>
> >>>>>>>>>> Getters in TestRecord enable writing test ignoring selected
> >> fields
> >>>>>>> with
> >>>>>>>>>> hamcrest like this:
> >>>>>>>>>>
> >>>>>>>>>> assertThat(outputTopic.readRecord(), allOf(
> >>>>>>>>>>         hasProperty("key", equalTo(1L)),
> >>>>>>>>>>         hasProperty("value", equalTo("Hello")),
> >>>>>>>>>>         hasProperty("headers", equalTo(headers))));
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Jukka
> >>>>>>>>>>
> >>>>>>>>>> la 15. kesäk. 2019 klo 1.10 John Roesler (john@confluent.io)
> >>>>>>>> kirjoitti:
> >>>>>>>>>>
> >>>>>>>>>>> Sounds good. Thanks as always for considering my feedback!
> >>>>>>>>>>> -John
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Jun 14, 2019 at 12:12 PM Jukka Karvanen
> >>>>>>>>>>> <jukka.karvanen@jukinimi.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Ok, I will modify KIP Public Interface in a wiki based on the
> >>>>>>>>> feedback.
> >>>>>>>>>>>>
> >>>>>>>>>>>> TestRecordFactory / ConsumerRecordFactory was used by
> >>>>>>>> TestInputTopic
> >>>>>>>>> with
> >>>>>>>>>>>> the version I had with KIP456, but maybe I can merge That
> >>>>>>>>> functionality
> >>>>>>>>>>> to
> >>>>>>>>>>>> InputTopic or  TestRecordFactory   can kept non public maybe
> >>>>>>> moving
> >>>>>>>>> it to
> >>>>>>>>>>>> internals package.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I will make the proposal with a slim down interface.
> >>>>>>>>>>>> I don't want to go to so slim as you proposed with only
> >>>>>>> TestRecord
> >>>>>>>> or
> >>>>>>>>>>>> List<TestRecord>, because you then still end up doing helper
> >>>>>>>> methods
> >>>>>>>>> to
> >>>>>>>>>>>> construct List of TestRecord.
> >>>>>>>>>>>> The list of values is easier to write and clearer to read than
> >>>>>> if
> >>>>>>>> you
> >>>>>>>>>>> need
> >>>>>>>>>>>> to contruct list of TestRecords.
> >>>>>>>>>>>>
> >>>>>>>>>>>> For example:
> >>>>>>>>>>>>
> >>>>>>>>>>>> final List<String> inputValues = Arrays.asList(
> >>>>>>>>>>>>         "Apache Kafka Streams Example",
> >>>>>>>>>>>>         "Using Kafka Streams Test Utils",
> >>>>>>>>>>>>         "Reading and Writing Kafka Topic"
> >>>>>>>>>>>> );
> >>>>>>>>>>>> inputTopic.pipeValueList(inputValues);
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Let's check after the next iteration is it still worth
> >> reducing
> >>>>>>> the
> >>>>>>>>>>> methods.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Jukka
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> pe 14. kesäk. 2019 klo 18.27 John Roesler (john@confluent.io)
> >>>>>>>>> kirjoitti:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks, Jukka,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Ok, I buy this reasoning.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Just to echo what I think I read, you would just drop
> >>>>>>>> ClientRecord
> >>>>>>>>>>>>> from the proposal, and TestRecord would stand on its own,
> >>>>>> with
> >>>>>>>> the
> >>>>>>>>>>>>> same methods and properties you proposed, and the "input
> >>>>>> topic"
> >>>>>>>>> would
> >>>>>>>>>>>>> accept TestRecord, and the "output topic" would produce
> >>>>>>>> TestRecord?
> >>>>>>>>>>>>> Further, the "input and output topic" classes would
> >>>>>> internally
> >>>>>>>>> handle
> >>>>>>>>>>>>> the conversion to and from ConsumerRecord and ProducerRecord
> >>>>>> to
> >>>>>>>>> pass
> >>>>>>>>>>>>> to and from the TopologyTestDriver?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This seems good to me.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Since the object coming out of the "output topic" is much
> >>>>>> more
> >>>>>>>>>>>>> ergonomic, I suspect we won't need the OutputVerifier at all.
> >>>>>>> It
> >>>>>>>>> was
> >>>>>>>>>>>>> mostly needed because of all the extra junk in ProducerRecord
> >>>>>>> you
> >>>>>>>>> need
> >>>>>>>>>>>>> to ignore. It seems better to just deprecate it. If in the
> >>>>>>> future
> >>>>>>>>> it
> >>>>>>>>>>>>> turns out there is some actual use case for a verifier, we
> >>>>>> can
> >>>>>>>>> have a
> >>>>>>>>>>>>> very small KIP to add one. But reading your response, I
> >>>>>> suspect
> >>>>>>>>> that
> >>>>>>>>>>>>> existing test verification libraries would be sufficient on
> >>>>>>> their
> >>>>>>>>> own.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Similarly, it seems like we can shrink the total interface by
> >>>>>>>>> removing
> >>>>>>>>>>>>> the TestRecordFactory from the proposal. If TestRecord
> >>>>>> already
> >>>>>>>>> offers
> >>>>>>>>>>>>> all the constructors you'd want, then the only benefit of the
> >>>>>>>>> factory
> >>>>>>>>>>>>> is to auto-increment the timestamps, but then again, the
> >>>>>> "input
> >>>>>>>>> topic"
> >>>>>>>>>>>>> can already do that (e.g., it can do it if the record
> >>>>>> timestamp
> >>>>>>>> is
> >>>>>>>>> not
> >>>>>>>>>>>>> set).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Likewise, if the TestRecords are easy to create, then we
> >>>>>> don't
> >>>>>>>> need
> >>>>>>>>>>>>> all the redundant methods in "input topic" to pipe values, or
> >>>>>>>>>>>>> key/values, or key/value/timestamp, etc. We can do with just
> >>>>>>> two
> >>>>>>>>>>>>> methods, one for a single TestRecord and one for a collection
> >>>>>>> of
> >>>>>>>>> them.
> >>>>>>>>>>>>> This reduces API ambiguity and also dramatically decreases
> >>>>>> the
> >>>>>>>>> surface
> >>>>>>>>>>>>> area of the interface, which ultimately makes it much easier
> >>>>>> to
> >>>>>>>>> use.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> It's best to start with the smallest interface that will do
> >>>>>> the
> >>>>>>>> job
> >>>>>>>>>>>>> and expand it upon request, rather than throwing in
> >>>>>> everything
> >>>>>>>> you
> >>>>>>>>> can
> >>>>>>>>>>>>> think of up front. The extra stuff would be confusing to
> >>>>>> people
> >>>>>>>>> facing
> >>>>>>>>>>>>> two practically identical paths to accomplish the same goal,
> >>>>>>> and
> >>>>>>>>> it's
> >>>>>>>>>>>>> very difficult to slim the interface down later, because we
> >>>>>>> don't
> >>>>>>>>>>>>> really know which parts are more popular (i.e., no one
> >>>>>> submits
> >>>>>>>>>>>>> "feature requests" to _remove_ stuff they don't need, only to
> >>>>>>>> _add_
> >>>>>>>>>>>>> stuff that they need.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> But overall, I really like the structure of this design. I'm
> >>>>>>>> super
> >>>>>>>>>>>>> excited about this KIP.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> -John
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Fri, Jun 14, 2019 at 2:55 AM Jukka Karvanen
> >>>>>>>>>>>>> <jukka.karvanen@jukinimi.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I am not a fan of swapping only ProducerRecord and
> >>>>>>>>> ConsumerRecord.
> >>>>>>>>>>>>>> As a test writer point of view I do not want to care about
> >>>>>>> the
> >>>>>>>>>>> difference
> >>>>>>>>>>>>>> of those and
> >>>>>>>>>>>>>> that way I like to have object type which can be used to
> >>>>>> pipe
> >>>>>>>>>>> records in
> >>>>>>>>>>>>>> and compare outputs.
> >>>>>>>>>>>>>> That way avoid unnecessary conversions between
> >>>>>> ProducerRecord
> >>>>>>>> and
> >>>>>>>>>>>>>> ConsumerRecord.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> My initial assumption was that ProducerRecord and
> >>>>>>>>>>> ConsumerRecord.could
> >>>>>>>>>>>>>> implement the same ClientRecord
> >>>>>>>>>>>>>> and that way test writer could have used either of those,
> >>>>>> but
> >>>>>>>>> seems
> >>>>>>>>>>> that
> >>>>>>>>>>>>>> return type of timestamp method long vs Long is not
> >>>>>>> compatible.
> >>>>>>>>>>>>>> Now the main advantage of ClientRecord is no need to
> >>>>>>> duplicate
> >>>>>>>>>>>>>> OutputVerifier when it is modified from ProducerRecord to
> >>>>>>>>>>> ClientRecord.
> >>>>>>>>>>>>>> Generally is there need for OutputVerifier. Can we
> >>>>>> deprecate
> >>>>>>>> the
> >>>>>>>>>>> existing
> >>>>>>>>>>>>>> and use standard assertion libraries for new test.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If you use hamcrest, assert-j or any other assertion
> >>>>>> library
> >>>>>>>>> for the
> >>>>>>>>>>>>> rest
> >>>>>>>>>>>>>> of the test, why not use it with these also.
> >>>>>>>>>>>>>> When we have these methods to access only needed fields it
> >>>>>> is
> >>>>>>>>> easier
> >>>>>>>>>>> to
> >>>>>>>>>>>>>> write test like this:
> >>>>>>>>>>>>>> assertThat(outputTopic.readValue()).isEqualTo("Hello");
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> or
> >>>>>>>>>>>>>>
> >>>>>>> assertThat(outputTopic.readRecord()).isEqualTo(expectedRecord);
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Only value for new OutputVerifier is when needing to ignore
> >>>>>>>> some
> >>>>>>>>>>> fields
> >>>>>>>>>>>>>> ClientRecord actual = outputTopic.readRecord();
> >>>>>>>>>>>>>> assertThat(actual.value()).isEqualTo("Hello");
> >>>>>>>>>>>>>> assertThat(actual.key()).isEqualTo(expectedKey);
> >>>>>>>>>>>>>>
> >>>>>> assertThat(actual.timestamp()).isEqualTo(expectedTimestamp);
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> So if want to leave client package untouched, I would
> >>>>>> modify
> >>>>>>>> the
> >>>>>>>>>>> methods
> >>>>>>>>>>>>>> with ClientRecord now in InputTopic and OutputTopic to pass
> >>>>>>> in
> >>>>>>>>> and
> >>>>>>>>>>> out
> >>>>>>>>>>>>> this
> >>>>>>>>>>>>>> TestRecord instead.
> >>>>>>>>>>>>>> In that case there would be possibility to add methods to
> >>>>>>>>> TestRecord
> >>>>>>>>>>> to
> >>>>>>>>>>>>>> help ignore some fields in assertions like:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> assertThat(outputTopic.readRecord().getValueTimestamp()).isEqualTo(expectedRecord.get
> >>>>>>>>>>>>>> ValueTimestamp());
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> How about this alternative?
> >>>>>>>>>>>>>> If this way sounds better I will modify KIP page in wiki.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Jukka
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> to 13. kesäk. 2019 klo 18.30 John Roesler (
> >>>>>> john@confluent.io
> >>>>>>> )
> >>>>>>>>>>> kirjoitti:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hey, all, maybe we can jump-start this discussion.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think this approach would be very ergonomic for
> >>>>>> testing,
> >>>>>>>> and
> >>>>>>>>>>> would
> >>>>>>>>>>>>>>> help reduce boilerplate in tests.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The think I like most about it is that it mirrors the
> >>>>>>> mental
> >>>>>>>>> model
> >>>>>>>>>>>>>>> that people already have from Kafka Streams, in which you
> >>>>>>>>> write to
> >>>>>>>>>>> an
> >>>>>>>>>>>>>>> "input topic" and then get your results from an "output
> >>>>>>>>> topic". As
> >>>>>>>>>>> a
> >>>>>>>>>>>>>>> side benefit, the input and output topics in the proposal
> >>>>>>>> also
> >>>>>>>>>>> close
> >>>>>>>>>>>>>>> over the serdes, which makes it much less boilerplate for
> >>>>>>>> test
> >>>>>>>>>>> code.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If I can offer one suggestion for change, I'm not sure
> >>>>>> I'm
> >>>>>>>>> totally
> >>>>>>>>>>>>>>> sold on the need for a new abstraction "ClientRecord"
> >>>>>> with
> >>>>>>> an
> >>>>>>>>>>>>>>> implementation for tests "TestRecord". It seems like this
> >>>>>>>>>>>>>>> unnecessarily complicates the main (non-testing) data
> >>>>>>> model.
> >>>>>>>> It
> >>>>>>>>>>> seems
> >>>>>>>>>>>>>>> like it would be sufficient, and just as ergonomic, to
> >>>>>> have
> >>>>>>>> the
> >>>>>>>>>>> input
> >>>>>>>>>>>>>>> topic accept ProducerRecords and the output topic return
> >>>>>>>>>>>>>>> ConsumerRecords. I'm open to discussion on this point,
> >>>>>>>> though.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks for the proposal, Jukka!
> >>>>>>>>>>>>>>> -John
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Mon, May 20, 2019 at 7:59 AM Jukka Karvanen
> >>>>>>>>>>>>>>> <jukka.karvanen@jukinimi.com> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi All,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I would like to start the discussion on KIP-470:
> >>>>>>>>>>> TopologyTestDriver
> >>>>>>>>>>>>> test
> >>>>>>>>>>>>>>>> input and output usability improvements:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTestDriver+test+input+and+output+usability+improvements
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> This KIP is inspired by the Discussion in KIP-456:
> >>>>>> Helper
> >>>>>>>>>>> classes to
> >>>>>>>>>>>>> make
> >>>>>>>>>>>>>>>> it simpler to write test logic with TopologyTestDriver:
> >>>>>>>>>>>>>>>>
> >>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-456
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> %3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> The proposal in KIP-456
> >>>>>>>>>>>>>>>> <
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-456%3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> was
> >>>>>>>>>>>>>>>> to add alternate way to input and output topic, but
> >>>>>> this
> >>>>>>>> KIP
> >>>>>>>>>>> enhance
> >>>>>>>>>>>>>>> those
> >>>>>>>>>>>>>>>> classes and deprecate old functionality to make clear
> >>>>>>>>> interface
> >>>>>>>>>>> for
> >>>>>>>>>>>>> test
> >>>>>>>>>>>>>>>> writer to use.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> In current KIP-470 proposal, topic objects are created
> >>>>>>> with
> >>>>>>>>>>>>> topicName and
> >>>>>>>>>>>>>>>> related serders.
> >>>>>>>>>>>>>>>>     public final <K, V> TestInputTopic<K, V>
> >>>>>>>>>>> createInputTopic(final
> >>>>>>>>>>>>>>> String
> >>>>>>>>>>>>>>>> topicName, final Serde<K> keySerde, final Serde<V>
> >>>>>>>>> valueSerde);
> >>>>>>>>>>>>>>>>     public final <K, V> TestOutputTopic<K, V>
> >>>>>>>>>>> createOutputTopic(final
> >>>>>>>>>>>>>>> String
> >>>>>>>>>>>>>>>> topicName, final Serde<K> keySerde, final Serde<V>
> >>>>>>>>> valueSerde);
> >>>>>>>>>>>>>>>> One thing I wondered if there way to find out topic
> >>>>>>> related
> >>>>>>>>> serde
> >>>>>>>>>>>>> from
> >>>>>>>>>>>>>>>> TopologyTestDriver topology, it would simply creation
> >>>>>> of
> >>>>>>>>> these
> >>>>>>>>>>> Topic
> >>>>>>>>>>>>>>>> objects:
> >>>>>>>>>>>>>>>>     public final <K, V> TestInputTopic<K, V>
> >>>>>>>>>>> createInputTopic(final
> >>>>>>>>>>>>>>> String
> >>>>>>>>>>>>>>>> topicName);
> >>>>>>>>>>>>>>>>     public final <K, V> TestOutputTopic<K, V>
> >>>>>>>>>>> createOutputTopic(final
> >>>>>>>>>>>>>>> String
> >>>>>>>>>>>>>>>> topicName);
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> KIP-456 can be discarded if this KIP get accepted.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>>> Jukka
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> -- Guozhang
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> -- Guozhang
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message