From dev-return-106151-archive-asf-public=cust-asf.ponee.io@kafka.apache.org Tue Jul 30 13:31:18 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by mx-eu-01.ponee.io (Postfix) with SMTP id CE791180607 for ; Tue, 30 Jul 2019 15:31:17 +0200 (CEST) Received: (qmail 31326 invoked by uid 500); 30 Jul 2019 13:31:15 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 31313 invoked by uid 99); 30 Jul 2019 13:31:14 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 30 Jul 2019 13:31:13 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 059181A32FB for ; Tue, 30 Jul 2019 13:31:13 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.8 X-Spam-Level: * X-Spam-Status: No, score=1.8 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=confluent.io Received: from mx1-ec2-va.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id ODw5lyCYUdzN for ; Tue, 30 Jul 2019 13:31:06 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.128.44; helo=mail-wm1-f44.google.com; envelope-from=john@confluent.io; receiver= Received: from mail-wm1-f44.google.com (mail-wm1-f44.google.com [209.85.128.44]) by mx1-ec2-va.apache.org (ASF Mail Server at mx1-ec2-va.apache.org) with ESMTPS id 8DA1CBC7E1 for ; Tue, 30 Jul 2019 13:31:05 +0000 (UTC) Received: by mail-wm1-f44.google.com with SMTP id u25so46305315wmc.4 for ; Tue, 30 Jul 2019 06:31:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=confluent.io; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=ViO3EX4wfybkrktn4vCZMGwDwChIyeJjZ/vbR8rr+Ho=; b=oWAKkKOO8/h46jVRiyvnjN6DgWQEVDDF2wH3iu7GUEtLsQcEANll3aQWv7Kptzhstk 4mQm18AQS4mYcEwG8QCUp9i6iQaEM+olnXHgLRoZ7Mp7suovapMq0Wvftv0YWMOvwZWR zlo/UFljgyk1HUbLr5w5yiCAYB5YxX7JGMw4hl4IGuRxmeLe6ajW+HZCEkCQdWR5hNO4 9ccyB9vyKKzcgN6fLlUhEe62y9XCzjaWXk1by8LHaiuPlEpXim7dcz9UsPeBJelML7Bh pk8nCVDRIK9Pj2hG5JQfQPXeFi+kdBlsCXWEFTi8relolWOGnkaM80NC2eqQbVc8IVxv /gSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=ViO3EX4wfybkrktn4vCZMGwDwChIyeJjZ/vbR8rr+Ho=; b=NTule33lJjgjhoumJqD1vE7zpX2cxN6YlLj9Rdl2u0rK2NN3ok3UCF8J+0VAXxuwi1 rxJsT0hrHa13Dsj1phaYeqA5ysq3naVEJbGIEvatyg2iiSJgpMXmaW/f1nhv2/50rUIW HK9KyPY0awMHhiOb2cDwY9q2kTF4VElRi+ysvDdXnBGDpZKlVymvHS5ODQXULlxf9fz2 dSoKDHfQNaSuGvqgjWqLSUC43u24E0PbqZ+E0LScQu1bkE2sbm1wXjYP8rel+1EjV6VP J/Ok1xMZ8OK/s4bAP440eAmDH87YcqBQMy1vq2H+DYTnUXrmAXWqtbHswrJWSP0JNbJG bbyw== X-Gm-Message-State: APjAAAVIY28bYj2KApuGFC8b5Jti4lJ3UxImf18Xd+0RaIh3SquPZ9zf /iVjQzmDy2+4q0k0buBPFGOYqOY7uOTfwqoctE8/0FHf X-Google-Smtp-Source: APXvYqxquZj7dNVzVbHzYIZB89kxSPBSXwPD40L+o86B+mhxjk5IFXhX1ozVDqRq1YZkx7WbLZkCkEZmTADwNcckiBs= X-Received: by 2002:a7b:c148:: with SMTP id z8mr1305977wmi.142.1564493457742; Tue, 30 Jul 2019 06:30:57 -0700 (PDT) MIME-Version: 1.0 References: <31f16c45-54bc-f6f3-2a94-368669decfed@confluent.io> <76823728-49ef-f06e-3459-73aa252933b9@confluent.io> In-Reply-To: From: John Roesler Date: Tue, 30 Jul 2019 08:30:45 -0500 Message-ID: Subject: Re: [DISCUSS] KIP-470: TopologyTestDriver test input and output usability improvements To: dev@kafka.apache.org Content-Type: multipart/alternative; boundary="000000000000cd81ff058ee6049a" --000000000000cd81ff058ee6049a Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hey Jukka, Sorry for the delay. For what it's worth, I think 3, 4, and 5 are all good options. I guess my own preference is 5. It seems like the migration pain is a one-time concern vs. having more maintainable code for years thereafter. Thanks, -John On Tue, Jul 2, 2019 at 4:03 AM Jukka Karvanen wrote: > Hi Matthias, > > Generally I think using Instant and Duration make the test more readable > and that's why I modified KIP based on your suggestion. > Now a lot of code use time with long or Long and that make the change mor= e > complicated. > > What I tried to say about the migration is the lines without timestamp or > if long timestamp is supported can be migrated mainly with search & > recplace: > > > testDriver.pipeInput(recordFactory.create(WordCountLambdaExample.inputTop= ic, > nullKey, "Hello", 1L)); > > -> > > *inputTopic*.pipeInput(nullKey, "Hello", 1L); > > If long is not supported as timestamp, the same is not so easy: > > > testDriver.pipeInput(recordFactory.create(WordCountLambdaExample.inputTop= ic, > nullKey, "Hello", 1L)); > > -> > > *inputTopic1*.pipeInput(nullKey, "Hello", Instant.ofEpochMilli(1L)); > > Also if you need to convert arbitrary long timestamps to proper time > Instants, it require you need to understand the logic of the test. So > mechanical search & replace is not possible. > > > I see there are several alternatives for long vs Instant / Duration: > > 1. All times as long/Long like in this version: > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=3D119550= 011 > > (startTimestampMs, autoAdvanceMs as parameter of createInputTopic > instead of configureTiming) > > 2. Auto timestamping configured with Instant and Duration, pipeInput > and TestRecord with long: > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=3D120722= 523 > > > 3. (CURRENT) Auto timestamping configured with Instant and Duration, > pipeInput and TestRecord with Instant, version with long deprecated: > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTest= Driver+test+input+and+output+usability+improvements > > > 4. Auto timestamping configured with Instant and Duration, pipeInput > and TestRecord with Instant and long parallel (with long not > deprecated): > > 5. Auto timestamping configured with Instant and Duration, pipeInput > and TestRecord with Instant only > > I hope to get feedback. > > My own preference currently is alternative 3. or 4. > > > If somebody want to test, there is a implementation of this version > available in Github: > > https://github.com/jukkakarvanen/kafka-streams-test-topics > > which can be used directly from public Maven repository: > > > com.github.jukkakarvanen > kafka-streams-test-topics > 0.0.1-beta3 > test > > > Also is this approach in KIP-470 preferred over KIP-456, so can we close > it: > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-456%3A+Helper+class= es+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver > > Jukka > > . > > > pe 28. kes=C3=A4k. 2019 klo 1.10 Matthias J. Sax (matthias@confluent.io) > kirjoitti: > > > Thanks Jukka! > > > > The idea to use `Instant/Duration` was a proposal. If we think it's not > > a good one, we could still stay with `long`. Because `ProducerRecord` > > and `ConsumerRecord` are both based on `long,` it might make sense to > > keep `long`? > > > > > The result of converting millis to Instant directly generates > > >> rather non readable test code and changing from long to Instant > > correctly > > >> require understand what is the case it is testing. > > > > This might be a good indicator the using `Instant/Duration` might not b= e > > a good idea. > > > > Would be nice to get feedback from others. > > > > About adding new methods that we deprecate immediately: I don't think w= e > > should do this. IMHO, there are two kind of users, one that immediately > > rewrite their code to move off deprecated methods. Those won't use the > > new+deprecated ones anyway. Other uses migrate their code slowly and > > would just not rewrite their code at all, and thus also not use the > > new+deprecated methods. > > > > > Checking my own tests I was able to migrate the most of my code with > > > search&replace without further thinking about the logic to this new > > > approach. The result of converting millis to Instant directly generat= es > > > rather non readable test code and changing from long to Instant > correctly > > > require understand what is the case it is testing. > > > > Not sure if I can follow here. You first say, you could easily migrate > > your code, but than you say it was not easily possible? Can you clarify > > your experience upgrading your test code? > > > > > > -Matthias > > > > > > On 6/27/19 12:21 AM, Jukka Karvanen wrote: > > > Hi, > > > > > >>> (4) Should we switch from `long` for timestamps to `Instant` and > > > `Duration` ? > > >> This version startTimestamp is Instant and autoAdvance Duration in > > > Initialization and with manual configured collection pipe methods. > > >> Now timestamp of TestRecord is still Long and similarly single recor= d > > > pipeInput still has long as parameter. > > >> Should these also converted to to Instant type? > > >> Should there be both long and Instant parallel? > > >> I expect there are existing test codebase which would be easier to > > migrate > > > if long could be still used. > > > Now added Instant version to TestRecord and pipeInput method. > > > > > > Checking my own tests I was able to migrate the most of my code with > > > search&replace without further thinking about the logic to this new > > > approach. The result of converting millis to Instant directly generat= es > > > rather non readable test code and changing from long to Instant > correctly > > > require understand what is the case it is testing. > > > > > > That is why version with long left still as deprecated for easier > > migration > > > for existing test. > > > Also TopologyTestDriver constructor and advanceWallClockTime method > > > modified with same approach. > > > > > > Jukka > > > > > > > > > ma 24. kes=C3=A4k. 2019 klo 16.47 Bill Bejeck (bbejeck@gmail.com) > kirjoitti: > > > > > >> Hi Jukka > > >> > > >>> These topic objects are only interfacing TopologyTestDriver, not > > >> affecting > > >>> the internal functionality of it. In my plan the internal data > > structures > > >>> are using those Producer/ConsumerRecords as earlier. That way I don= 't > > see > > >>> how those could be affected. > > >> > > >> I mistakenly thought the KIP was proposing to completely remove > > >> Producer/ConsumerRecords in favor of TestRecord. But taking another > > quick > > >> look I can see the plan for using the TestRecord objects. Thanks fo= r > > the > > >> clarification. > > >> > > >> -Bill > > >> > > >> On Sat, Jun 22, 2019 at 2:26 AM Jukka Karvanen < > > >> jukka.karvanen@jukinimi.com> > > >> wrote: > > >> > > >>> Hi All, > > >>> Hi Matthias, > > >>> > > >>>> (1) It's a little confusing that you list all method (existing, > > proposed > > >>>> to deprecate, and new one) of `TopologyTestDriver` in the KIP. May= be > > >>>> only list the ones you propose to deprecate and the new ones you > want > > to > > >>>> add? > > >>> > > >>> Ok. Unmodified methods removed. > > >>> > > >>>> (2) `TopologyTestDriver#createInputTopic`: might it be worth to ad= d > > >>>> overload to initialize the timetamp and auto-advance feature > directly? > > >>>> Otherwise, uses always need to call `configureTiming` as an extra > > call? > > >>> > > >>> Added with Instant and Duration parameters. > > >>> > > >>>> (3) `TestInputTopic#configureTiming()`: maybe rename to > > >>> `reconfigureTiming()` ? > > >>> > > >>> I removed this method when we have now initialization in constructo= r. > > >>> You can recreate TestInputTopic if needing to reconfigure timing. > > >>> > > >>> > > >>>> (4) Should we switch from `long` for timestamps to `Instant` and > > >>> `Duration` ? > > >>> This version startTimestamp is Instant and autoAdvance Duration in > > >>> Initialization and with manual configured collection pipe methods. > > >>> Now timestamp of TestRecord is still Long and similarly single reco= rd > > >>> pipeInput still has long as parameter. > > >>> Should these also converted to to Instant type? > > >>> Should there be both long and Instant parallel? > > >>> I expect there are existing test codebase which would be easier to > > >> migrate > > >>> if long could be still used. > > >>> > > >>> Also should advanceWallClockTime in TopologyTestDriver changed(or > > added > > >>> alternative) for Duration parameter. > > >>> > > >>> > > >>>> (5) Why do we have redundant getters? Or set with `getX()` and one > > >>> set without `get`-prefix? > > >>> > > >>> The methods without get-prefix are for compatibility with > > >> ProducerRecord / > > >>> ConsumerRecord and I expect would make migration to TestRecord > easier. > > >>> Standard getters in TestRecord enable writing test ignoring selecte= d > > >> fields > > >>> with hamcrest like this: > > >>> > > >>> assertThat(outputTopic.readRecord(), allOf( > > >>> hasProperty("key", equalTo(1L)), > > >>> hasProperty("value", equalTo("Hello")), > > >>> hasProperty("headers", equalTo(headers)))); > > >>> > > >>> > > >>> That's why I have currently both methods. > > >>> > > >>> Jukka > > >>> > > >>> > > >>> pe 21. kes=C3=A4k. 2019 klo 22.20 Matthias J. Sax (matthias@conflue= nt.io) > > >>> kirjoitti: > > >>> > > >>>> Thanks for the KIP. The idea to add InputTopic and OutputTopic > > >>>> abstractions is really neat! > > >>>> > > >>>> > > >>>> Couple of minor comment: > > >>>> > > >>>> (1) It's a little confusing that you list all method (existing, > > >> proposed > > >>>> to deprecate, and new one) of `TopologyTestDriver` in the KIP. May= be > > >>>> only list the ones you propose to deprecate and the new ones you > want > > >> to > > >>>> add? > > >>>> > > >>>> (Or mark all existing methods clearly -- atm, I need to got back t= o > > the > > >>>> code to read the KIP and to extract what changes are proposed). > > >>>> > > >>>> > > >>>> (2) `TopologyTestDriver#createInputTopic`: might it be worth to ad= d > > >>>> overload to initialize the timetamp and auto-advance feature > directly? > > >>>> Otherwise, uses always need to call `configureTiming` as an extra > > call? > > >>>> > > >>>> > > >>>> (3) `TestInputTopic#configureTiming()`: maybe rename to > > >>>> `reconfigureTiming()` ? > > >>>> > > >>>> > > >>>> (4) Should we switch from `long` for timestamps to `Instant` and > > >>>> `Duration` ? > > >>>> > > >>>> > > >>>> (5) Why do we have redundant getters? Or set with `getX()` and one > set > > >>>> without `get`-prefix? > > >>>> > > >>>> > > >>>> > > >>>> -Matthias > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> On 6/21/19 10:57 AM, Bill Bejeck wrote: > > >>>>> Jukka, > > >>>>> > > >>>>> Thanks for the KIP. I like the changes overall. > > >>>>> One thing I wanted to confirm, and this may be me being paranoid, > but > > >>>> will > > >>>>> the changes for input/output topic affect how the > TopologyTestDriver > > >>>> works > > >>>>> with internal topics when there are sub-topologies created? > > >>>>> > > >>>>> On Fri, Jun 21, 2019 at 12:05 PM Guozhang Wang > > > >>>> wrote: > > >>>>> > > >>>>>> 1) Got it, could you list this class along with all its function= s > in > > >>> the > > >>>>>> proposed public APIs as well? > > >>>>>> > > >>>>>> 2) Ack, thanks! > > >>>>>> > > >>>>>> On Thu, Jun 20, 2019 at 11:27 PM Jukka Karvanen < > > >>>>>> jukka.karvanen@jukinimi.com> > > >>>>>> wrote: > > >>>>>> > > >>>>>>> Hi Guozhang, > > >>>>>>> > > >>>>>>> 1) This TestRecord is new class in my proposal. So it is a > > >> simplified > > >>>>>>> version of ProducerRecord and ConsumerRecord containing only th= e > > >>> fields > > >>>>>>> needed to test record content. > > >>>>>>> > > >>>>>>> 2) > > >>>>>>> public final TestInputTopic createInputTopic(final > > >>> String > > >>>>>>> topicName, final Serde keySerde, final Serde valueSerde); > > >>>>>>> public final TestOutputTopic createOutputTopic(fin= al > > >>>> String > > >>>>>>> topicName, final Serde keySerde, final Serde valueSerde); > > >>>>>>> The purpose is to create separate object for each input and > output > > >>>> topic > > >>>>>>> you are using. The topic name is given to createInput/OutputTop= ic > > >>> when > > >>>>>>> initialize topic object. > > >>>>>>> > > >>>>>>> For example: > > >>>>>>> > > >>>>>>> final TestInputTopic inputTopic1 =3D > > >>>>>>> testDriver.createInputTopic( INPUT_TOPIC, longSerde, > stringSerde); > > >>>>>>> final TestInputTopic inputTopic2 =3D > > >>>>>>> testDriver.createInputTopic( INPUT_TOPIC_MAP, longSerde, > > >>> stringSerde); > > >>>>>>> final TestOutputTopic outputTopic1 =3D > > >>>>>>> testDriver.createOutputTopic(OUTPUT_TOPIC, longSerde, > stringSerde); > > >>>>>>> final TestOutputTopic outputTopic2 =3D > > >>>>>>> testDriver.createOutputTopic(OUTPUT_TOPIC_MAP, stringSerde, > > >>>>>>> longSerde); > > >>>>>>> inputTopic1.pipeInput(1L, "Hello"); > > >>>>>>> assertThat(outputTopic1.readKeyValue(), equalTo(new > KeyValue<>(1L, > > >>>>>>> "Hello"))); > > >>>>>>> assertThat(outputTopic2.readKeyValue(), equalTo(new > > >>> KeyValue<>("Hello", > > >>>>>>> 1L))); > > >>>>>>> inputTopic2.pipeInput(1L, "Hello"); > > >>>>>>> > > >>>>>>> > > >>>>>>> Jukka > > >>>>>>> > > >>>>>>> to 20. kes=C3=A4k. 2019 klo 23.52 Guozhang Wang (wangguoz@gmail= .com) > > >>>>>> kirjoitti: > > >>>>>>> > > >>>>>>>> Hello Jukka, > > >>>>>>>> > > >>>>>>>> Thanks for writing the KIP, I have a couple of quick questions= : > > >>>>>>>> > > >>>>>>>> 1) Is "TestRecord" an existing class that you propose to > > >> piggy-back > > >>>> on? > > >>>>>>>> Right now we have a scala TestRecord case class but I doubt th= at > > >> was > > >>>>>> your > > >>>>>>>> proposal, or are you proposing to add a new Java class? > > >>>>>>>> > > >>>>>>>> 2) Would the new API only allow a single input / output topic > with > > >>>>>>>> `createInput/OutputTopic`? If not, when we call pipeInput how = to > > >>>>>>> determine > > >>>>>>>> which topic this record should be pipe to? > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> Guozhang > > >>>>>>>> > > >>>>>>>> On Mon, Jun 17, 2019 at 1:34 PM John Roesler > > > >>>>>> wrote: > > >>>>>>>> > > >>>>>>>>> Woah, I wasn't aware of that Hamcrest test style. Awesome! > > >>>>>>>>> > > >>>>>>>>> Thanks for the updates. I look forward to hearing what others > > >>> think. > > >>>>>>>>> > > >>>>>>>>> -John > > >>>>>>>>> > > >>>>>>>>> On Mon, Jun 17, 2019 at 4:12 AM Jukka Karvanen > > >>>>>>>>> wrote: > > >>>>>>>>>> > > >>>>>>>>>> Wiki page updated: > > >>>>>>>>>> > > >>>>>>>>> > > >>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >>> > > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTest= Driver+test+input+and+output+usability+improvements > > >>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> ClientRecord removed and replaced with TestRecord in method > > >> calls. > > >>>>>>>>>> TestRecordFactory removed (time tracking functionality to be > > >>>>>> included > > >>>>>>>> to > > >>>>>>>>>> TestInputTopic) > > >>>>>>>>>> OutputVerifier deprecated > > >>>>>>>>>> TestRecord topic removed and getters added > > >>>>>>>>>> > > >>>>>>>>>> Getters in TestRecord enable writing test ignoring selected > > >> fields > > >>>>>>> with > > >>>>>>>>>> hamcrest like this: > > >>>>>>>>>> > > >>>>>>>>>> assertThat(outputTopic.readRecord(), allOf( > > >>>>>>>>>> hasProperty("key", equalTo(1L)), > > >>>>>>>>>> hasProperty("value", equalTo("Hello")), > > >>>>>>>>>> hasProperty("headers", equalTo(headers)))); > > >>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> Jukka > > >>>>>>>>>> > > >>>>>>>>>> la 15. kes=C3=A4k. 2019 klo 1.10 John Roesler (john@confluen= t.io) > > >>>>>>>> kirjoitti: > > >>>>>>>>>> > > >>>>>>>>>>> Sounds good. Thanks as always for considering my feedback! > > >>>>>>>>>>> -John > > >>>>>>>>>>> > > >>>>>>>>>>> On Fri, Jun 14, 2019 at 12:12 PM Jukka Karvanen > > >>>>>>>>>>> wrote: > > >>>>>>>>>>>> > > >>>>>>>>>>>> Ok, I will modify KIP Public Interface in a wiki based on > the > > >>>>>>>>> feedback. > > >>>>>>>>>>>> > > >>>>>>>>>>>> TestRecordFactory / ConsumerRecordFactory was used by > > >>>>>>>> TestInputTopic > > >>>>>>>>> with > > >>>>>>>>>>>> the version I had with KIP456, but maybe I can merge That > > >>>>>>>>> functionality > > >>>>>>>>>>> to > > >>>>>>>>>>>> InputTopic or TestRecordFactory can kept non public may= be > > >>>>>>> moving > > >>>>>>>>> it to > > >>>>>>>>>>>> internals package. > > >>>>>>>>>>>> > > >>>>>>>>>>>> I will make the proposal with a slim down interface. > > >>>>>>>>>>>> I don't want to go to so slim as you proposed with only > > >>>>>>> TestRecord > > >>>>>>>> or > > >>>>>>>>>>>> List, because you then still end up doing help= er > > >>>>>>>> methods > > >>>>>>>>> to > > >>>>>>>>>>>> construct List of TestRecord. > > >>>>>>>>>>>> The list of values is easier to write and clearer to read > than > > >>>>>> if > > >>>>>>>> you > > >>>>>>>>>>> need > > >>>>>>>>>>>> to contruct list of TestRecords. > > >>>>>>>>>>>> > > >>>>>>>>>>>> For example: > > >>>>>>>>>>>> > > >>>>>>>>>>>> final List inputValues =3D Arrays.asList( > > >>>>>>>>>>>> "Apache Kafka Streams Example", > > >>>>>>>>>>>> "Using Kafka Streams Test Utils", > > >>>>>>>>>>>> "Reading and Writing Kafka Topic" > > >>>>>>>>>>>> ); > > >>>>>>>>>>>> inputTopic.pipeValueList(inputValues); > > >>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>>>>>> Let's check after the next iteration is it still worth > > >> reducing > > >>>>>>> the > > >>>>>>>>>>> methods. > > >>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>>>>>> Jukka > > >>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>>>>>> pe 14. kes=C3=A4k. 2019 klo 18.27 John Roesler ( > john@confluent.io) > > >>>>>>>>> kirjoitti: > > >>>>>>>>>>>> > > >>>>>>>>>>>>> Thanks, Jukka, > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> Ok, I buy this reasoning. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> Just to echo what I think I read, you would just drop > > >>>>>>>> ClientRecord > > >>>>>>>>>>>>> from the proposal, and TestRecord would stand on its own, > > >>>>>> with > > >>>>>>>> the > > >>>>>>>>>>>>> same methods and properties you proposed, and the "input > > >>>>>> topic" > > >>>>>>>>> would > > >>>>>>>>>>>>> accept TestRecord, and the "output topic" would produce > > >>>>>>>> TestRecord? > > >>>>>>>>>>>>> Further, the "input and output topic" classes would > > >>>>>> internally > > >>>>>>>>> handle > > >>>>>>>>>>>>> the conversion to and from ConsumerRecord and > ProducerRecord > > >>>>>> to > > >>>>>>>>> pass > > >>>>>>>>>>>>> to and from the TopologyTestDriver? > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> This seems good to me. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> Since the object coming out of the "output topic" is much > > >>>>>> more > > >>>>>>>>>>>>> ergonomic, I suspect we won't need the OutputVerifier at > all. > > >>>>>>> It > > >>>>>>>>> was > > >>>>>>>>>>>>> mostly needed because of all the extra junk in > ProducerRecord > > >>>>>>> you > > >>>>>>>>> need > > >>>>>>>>>>>>> to ignore. It seems better to just deprecate it. If in th= e > > >>>>>>> future > > >>>>>>>>> it > > >>>>>>>>>>>>> turns out there is some actual use case for a verifier, w= e > > >>>>>> can > > >>>>>>>>> have a > > >>>>>>>>>>>>> very small KIP to add one. But reading your response, I > > >>>>>> suspect > > >>>>>>>>> that > > >>>>>>>>>>>>> existing test verification libraries would be sufficient = on > > >>>>>>> their > > >>>>>>>>> own. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> Similarly, it seems like we can shrink the total interfac= e > by > > >>>>>>>>> removing > > >>>>>>>>>>>>> the TestRecordFactory from the proposal. If TestRecord > > >>>>>> already > > >>>>>>>>> offers > > >>>>>>>>>>>>> all the constructors you'd want, then the only benefit of > the > > >>>>>>>>> factory > > >>>>>>>>>>>>> is to auto-increment the timestamps, but then again, the > > >>>>>> "input > > >>>>>>>>> topic" > > >>>>>>>>>>>>> can already do that (e.g., it can do it if the record > > >>>>>> timestamp > > >>>>>>>> is > > >>>>>>>>> not > > >>>>>>>>>>>>> set). > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> Likewise, if the TestRecords are easy to create, then we > > >>>>>> don't > > >>>>>>>> need > > >>>>>>>>>>>>> all the redundant methods in "input topic" to pipe values= , > or > > >>>>>>>>>>>>> key/values, or key/value/timestamp, etc. We can do with > just > > >>>>>>> two > > >>>>>>>>>>>>> methods, one for a single TestRecord and one for a > collection > > >>>>>>> of > > >>>>>>>>> them. > > >>>>>>>>>>>>> This reduces API ambiguity and also dramatically decrease= s > > >>>>>> the > > >>>>>>>>> surface > > >>>>>>>>>>>>> area of the interface, which ultimately makes it much > easier > > >>>>>> to > > >>>>>>>>> use. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> It's best to start with the smallest interface that will = do > > >>>>>> the > > >>>>>>>> job > > >>>>>>>>>>>>> and expand it upon request, rather than throwing in > > >>>>>> everything > > >>>>>>>> you > > >>>>>>>>> can > > >>>>>>>>>>>>> think of up front. The extra stuff would be confusing to > > >>>>>> people > > >>>>>>>>> facing > > >>>>>>>>>>>>> two practically identical paths to accomplish the same > goal, > > >>>>>>> and > > >>>>>>>>> it's > > >>>>>>>>>>>>> very difficult to slim the interface down later, because = we > > >>>>>>> don't > > >>>>>>>>>>>>> really know which parts are more popular (i.e., no one > > >>>>>> submits > > >>>>>>>>>>>>> "feature requests" to _remove_ stuff they don't need, onl= y > to > > >>>>>>>> _add_ > > >>>>>>>>>>>>> stuff that they need. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> But overall, I really like the structure of this design. > I'm > > >>>>>>>> super > > >>>>>>>>>>>>> excited about this KIP. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> Thanks, > > >>>>>>>>>>>>> -John > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> On Fri, Jun 14, 2019 at 2:55 AM Jukka Karvanen > > >>>>>>>>>>>>> wrote: > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> Hi, > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> I am not a fan of swapping only ProducerRecord and > > >>>>>>>>> ConsumerRecord. > > >>>>>>>>>>>>>> As a test writer point of view I do not want to care abo= ut > > >>>>>>> the > > >>>>>>>>>>> difference > > >>>>>>>>>>>>>> of those and > > >>>>>>>>>>>>>> that way I like to have object type which can be used to > > >>>>>> pipe > > >>>>>>>>>>> records in > > >>>>>>>>>>>>>> and compare outputs. > > >>>>>>>>>>>>>> That way avoid unnecessary conversions between > > >>>>>> ProducerRecord > > >>>>>>>> and > > >>>>>>>>>>>>>> ConsumerRecord. > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> My initial assumption was that ProducerRecord and > > >>>>>>>>>>> ConsumerRecord.could > > >>>>>>>>>>>>>> implement the same ClientRecord > > >>>>>>>>>>>>>> and that way test writer could have used either of those= , > > >>>>>> but > > >>>>>>>>> seems > > >>>>>>>>>>> that > > >>>>>>>>>>>>>> return type of timestamp method long vs Long is not > > >>>>>>> compatible. > > >>>>>>>>>>>>>> Now the main advantage of ClientRecord is no need to > > >>>>>>> duplicate > > >>>>>>>>>>>>>> OutputVerifier when it is modified from ProducerRecord t= o > > >>>>>>>>>>> ClientRecord. > > >>>>>>>>>>>>>> Generally is there need for OutputVerifier. Can we > > >>>>>> deprecate > > >>>>>>>> the > > >>>>>>>>>>> existing > > >>>>>>>>>>>>>> and use standard assertion libraries for new test. > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> If you use hamcrest, assert-j or any other assertion > > >>>>>> library > > >>>>>>>>> for the > > >>>>>>>>>>>>> rest > > >>>>>>>>>>>>>> of the test, why not use it with these also. > > >>>>>>>>>>>>>> When we have these methods to access only needed fields = it > > >>>>>> is > > >>>>>>>>> easier > > >>>>>>>>>>> to > > >>>>>>>>>>>>>> write test like this: > > >>>>>>>>>>>>>> assertThat(outputTopic.readValue()).isEqualTo("Hello"); > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> or > > >>>>>>>>>>>>>> > > >>>>>>> assertThat(outputTopic.readRecord()).isEqualTo(expectedRecord); > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> Only value for new OutputVerifier is when needing to > ignore > > >>>>>>>> some > > >>>>>>>>>>> fields > > >>>>>>>>>>>>>> ClientRecord actual =3D outputTopic.readRecord(); > > >>>>>>>>>>>>>> assertThat(actual.value()).isEqualTo("Hello"); > > >>>>>>>>>>>>>> assertThat(actual.key()).isEqualTo(expectedKey); > > >>>>>>>>>>>>>> > > >>>>>> assertThat(actual.timestamp()).isEqualTo(expectedTimestamp); > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> So if want to leave client package untouched, I would > > >>>>>> modify > > >>>>>>>> the > > >>>>>>>>>>> methods > > >>>>>>>>>>>>>> with ClientRecord now in InputTopic and OutputTopic to > pass > > >>>>>>> in > > >>>>>>>>> and > > >>>>>>>>>>> out > > >>>>>>>>>>>>> this > > >>>>>>>>>>>>>> TestRecord instead. > > >>>>>>>>>>>>>> In that case there would be possibility to add methods t= o > > >>>>>>>>> TestRecord > > >>>>>>>>>>> to > > >>>>>>>>>>>>>> help ignore some fields in assertions like: > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>> > > >>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >>> > > >> > > > assertThat(outputTopic.readRecord().getValueTimestamp()).isEqualTo(expect= edRecord.get > > >>>>>>>>>>>>>> ValueTimestamp()); > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> How about this alternative? > > >>>>>>>>>>>>>> If this way sounds better I will modify KIP page in wiki= . > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> Jukka > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> to 13. kes=C3=A4k. 2019 klo 18.30 John Roesler ( > > >>>>>> john@confluent.io > > >>>>>>> ) > > >>>>>>>>>>> kirjoitti: > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> Hey, all, maybe we can jump-start this discussion. > > >>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> I think this approach would be very ergonomic for > > >>>>>> testing, > > >>>>>>>> and > > >>>>>>>>>>> would > > >>>>>>>>>>>>>>> help reduce boilerplate in tests. > > >>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> The think I like most about it is that it mirrors the > > >>>>>>> mental > > >>>>>>>>> model > > >>>>>>>>>>>>>>> that people already have from Kafka Streams, in which y= ou > > >>>>>>>>> write to > > >>>>>>>>>>> an > > >>>>>>>>>>>>>>> "input topic" and then get your results from an "output > > >>>>>>>>> topic". As > > >>>>>>>>>>> a > > >>>>>>>>>>>>>>> side benefit, the input and output topics in the propos= al > > >>>>>>>> also > > >>>>>>>>>>> close > > >>>>>>>>>>>>>>> over the serdes, which makes it much less boilerplate f= or > > >>>>>>>> test > > >>>>>>>>>>> code. > > >>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> If I can offer one suggestion for change, I'm not sure > > >>>>>> I'm > > >>>>>>>>> totally > > >>>>>>>>>>>>>>> sold on the need for a new abstraction "ClientRecord" > > >>>>>> with > > >>>>>>> an > > >>>>>>>>>>>>>>> implementation for tests "TestRecord". It seems like th= is > > >>>>>>>>>>>>>>> unnecessarily complicates the main (non-testing) data > > >>>>>>> model. > > >>>>>>>> It > > >>>>>>>>>>> seems > > >>>>>>>>>>>>>>> like it would be sufficient, and just as ergonomic, to > > >>>>>> have > > >>>>>>>> the > > >>>>>>>>>>> input > > >>>>>>>>>>>>>>> topic accept ProducerRecords and the output topic retur= n > > >>>>>>>>>>>>>>> ConsumerRecords. I'm open to discussion on this point, > > >>>>>>>> though. > > >>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> Thanks for the proposal, Jukka! > > >>>>>>>>>>>>>>> -John > > >>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> On Mon, May 20, 2019 at 7:59 AM Jukka Karvanen > > >>>>>>>>>>>>>>> wrote: > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> Hi All, > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> I would like to start the discussion on KIP-470: > > >>>>>>>>>>> TopologyTestDriver > > >>>>>>>>>>>>> test > > >>>>>>>>>>>>>>>> input and output usability improvements: > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> > > >>>>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>> > > >>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >>> > > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTest= Driver+test+input+and+output+usability+improvements > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> This KIP is inspired by the Discussion in KIP-456: > > >>>>>> Helper > > >>>>>>>>>>> classes to > > >>>>>>>>>>>>> make > > >>>>>>>>>>>>>>>> it simpler to write test logic with TopologyTestDriver= : > > >>>>>>>>>>>>>>>> > > >>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-456 > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>> > > >>>>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>> > > >>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >>> > > >> > > > %3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTe= stDriver > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> The proposal in KIP-456 > > >>>>>>>>>>>>>>>> < > > >>>>>>>>>>>>>>> > > >>>>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>> > > >>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >>> > > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-456%3A+Helper+class= es+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> was > > >>>>>>>>>>>>>>>> to add alternate way to input and output topic, but > > >>>>>> this > > >>>>>>>> KIP > > >>>>>>>>>>> enhance > > >>>>>>>>>>>>>>> those > > >>>>>>>>>>>>>>>> classes and deprecate old functionality to make clear > > >>>>>>>>> interface > > >>>>>>>>>>> for > > >>>>>>>>>>>>> test > > >>>>>>>>>>>>>>>> writer to use. > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> In current KIP-470 proposal, topic objects are created > > >>>>>>> with > > >>>>>>>>>>>>> topicName and > > >>>>>>>>>>>>>>>> related serders. > > >>>>>>>>>>>>>>>> public final TestInputTopic > > >>>>>>>>>>> createInputTopic(final > > >>>>>>>>>>>>>>> String > > >>>>>>>>>>>>>>>> topicName, final Serde keySerde, final Serde > > >>>>>>>>> valueSerde); > > >>>>>>>>>>>>>>>> public final TestOutputTopic > > >>>>>>>>>>> createOutputTopic(final > > >>>>>>>>>>>>>>> String > > >>>>>>>>>>>>>>>> topicName, final Serde keySerde, final Serde > > >>>>>>>>> valueSerde); > > >>>>>>>>>>>>>>>> One thing I wondered if there way to find out topic > > >>>>>>> related > > >>>>>>>>> serde > > >>>>>>>>>>>>> from > > >>>>>>>>>>>>>>>> TopologyTestDriver topology, it would simply creation > > >>>>>> of > > >>>>>>>>> these > > >>>>>>>>>>> Topic > > >>>>>>>>>>>>>>>> objects: > > >>>>>>>>>>>>>>>> public final TestInputTopic > > >>>>>>>>>>> createInputTopic(final > > >>>>>>>>>>>>>>> String > > >>>>>>>>>>>>>>>> topicName); > > >>>>>>>>>>>>>>>> public final TestOutputTopic > > >>>>>>>>>>> createOutputTopic(final > > >>>>>>>>>>>>>>> String > > >>>>>>>>>>>>>>>> topicName); > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> KIP-456 can be discarded if this KIP get accepted. > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> Best Regards, > > >>>>>>>>>>>>>>>> Jukka > > >>>>>>>>>>>>>>> > > >>>>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> -- > > >>>>>>>> -- Guozhang > > >>>>>>>> > > >>>>>>> > > >>>>>> > > >>>>>> > > >>>>>> -- > > >>>>>> -- Guozhang > > >>>>>> > > >>>>> > > >>>> > > >>>> > > >>> > > >> > > > > > > > > --000000000000cd81ff058ee6049a--