kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias J. Sax" <matth...@confluent.io>
Subject Re: [DISCUSS] KIP-470: TopologyTestDriver test input and output usability improvements
Date Fri, 21 Jun 2019 19:20:21 GMT
Thanks for the KIP. The idea to add InputTopic and OutputTopic
abstractions is really neat!


Couple of minor comment:

(1) It's a little confusing that you list all method (existing, proposed
to deprecate, and new one) of `TopologyTestDriver` in the KIP. Maybe
only list the ones you propose to deprecate and the new ones you want to
add?

(Or mark all existing methods clearly -- atm, I need to got back to the
code to read the KIP and to extract what changes are proposed).


(2) `TopologyTestDriver#createInputTopic`: might it be worth to add
overload to initialize the timetamp and auto-advance feature directly?
Otherwise, uses always need to call `configureTiming` as an extra call?


(3) `TestInputTopic#configureTiming()`: maybe rename to
`reconfigureTiming()` ?


(4) Should we switch from `long` for timestamps to `Instant` and
`Duration` ?


(5) Why do we have redundant getters? Or set with `getX()` and one set
without `get`-prefix?



-Matthias




On 6/21/19 10:57 AM, Bill Bejeck wrote:
> Jukka,
> 
> Thanks for the KIP. I like the changes overall.
> One thing I wanted to confirm, and this may be me being paranoid, but will
> the changes for input/output topic affect how the TopologyTestDriver works
> with internal topics when there are sub-topologies created?
> 
> On Fri, Jun 21, 2019 at 12:05 PM Guozhang Wang <wangguoz@gmail.com> wrote:
> 
>> 1) Got it, could you list this class along with all its functions in the
>> proposed public APIs as well?
>>
>> 2) Ack, thanks!
>>
>> On Thu, Jun 20, 2019 at 11:27 PM Jukka Karvanen <
>> jukka.karvanen@jukinimi.com>
>> wrote:
>>
>>> Hi  Guozhang,
>>>
>>> 1) This TestRecord is new class in my proposal. So it is a simplified
>>> version of ProducerRecord and ConsumerRecord containing only the fields
>>> needed to test record content.
>>>
>>> 2)
>>> public final <K, V> TestInputTopic<K, V> createInputTopic(final String
>>> topicName, final Serde<K> keySerde, final Serde<V> valueSerde);
>>> public final <K, V> TestOutputTopic<K, V> createOutputTopic(final
String
>>> topicName, final Serde<K> keySerde, final Serde<V> valueSerde);
>>> The purpose is to create separate object for each input and output topic
>>> you are using. The topic name is given to createInput/OutputTopic when
>>> initialize topic object.
>>>
>>> For example:
>>>
>>> final TestInputTopic<Long, String> inputTopic1 =
>>> testDriver.createInputTopic( INPUT_TOPIC, longSerde, stringSerde);
>>> final TestInputTopic<Long, String> inputTopic2 =
>>> testDriver.createInputTopic( INPUT_TOPIC_MAP, longSerde, stringSerde);
>>> final TestOutputTopic<Long, String> outputTopic1 =
>>> testDriver.createOutputTopic(OUTPUT_TOPIC, longSerde, stringSerde);
>>> final TestOutputTopic<String, Long> outputTopic2 =
>>> testDriver.createOutputTopic(OUTPUT_TOPIC_MAP, stringSerde,
>>> longSerde);
>>> inputTopic1.pipeInput(1L, "Hello");
>>> assertThat(outputTopic1.readKeyValue(), equalTo(new KeyValue<>(1L,
>>> "Hello")));
>>> assertThat(outputTopic2.readKeyValue(), equalTo(new KeyValue<>("Hello",
>>> 1L)));
>>> inputTopic2.pipeInput(1L, "Hello");
>>>
>>>
>>> Jukka
>>>
>>> to 20. kesäk. 2019 klo 23.52 Guozhang Wang (wangguoz@gmail.com)
>> kirjoitti:
>>>
>>>> Hello Jukka,
>>>>
>>>> Thanks for writing the KIP, I have a couple of quick questions:
>>>>
>>>> 1) Is "TestRecord" an existing class that you propose to piggy-back on?
>>>> Right now we have a scala TestRecord case class but I doubt that was
>> your
>>>> proposal, or are you proposing to add a new Java class?
>>>>
>>>> 2) Would the new API only allow a single input / output topic with
>>>> `createInput/OutputTopic`? If not, when we call pipeInput how to
>>> determine
>>>> which topic this record should be pipe to?
>>>>
>>>>
>>>> Guozhang
>>>>
>>>> On Mon, Jun 17, 2019 at 1:34 PM John Roesler <john@confluent.io>
>> wrote:
>>>>
>>>>> Woah, I wasn't aware of that Hamcrest test style. Awesome!
>>>>>
>>>>> Thanks for the updates. I look forward to hearing what others think.
>>>>>
>>>>> -John
>>>>>
>>>>> On Mon, Jun 17, 2019 at 4:12 AM Jukka Karvanen
>>>>> <jukka.karvanen@jukinimi.com> wrote:
>>>>>>
>>>>>> Wiki page updated:
>>>>>>
>>>>>
>>>>
>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTestDriver+test+input+and+output+usability+improvements
>>>>>>
>>>>>>
>>>>>> ClientRecord removed and replaced with TestRecord in method calls.
>>>>>> TestRecordFactory removed (time tracking functionality to be
>> included
>>>> to
>>>>>> TestInputTopic)
>>>>>> OutputVerifier deprecated
>>>>>> TestRecord topic removed and getters added
>>>>>>
>>>>>> Getters in TestRecord enable writing test ignoring selected fields
>>> with
>>>>>> hamcrest like this:
>>>>>>
>>>>>> assertThat(outputTopic.readRecord(), allOf(
>>>>>>         hasProperty("key", equalTo(1L)),
>>>>>>         hasProperty("value", equalTo("Hello")),
>>>>>>         hasProperty("headers", equalTo(headers))));
>>>>>>
>>>>>>
>>>>>> Jukka
>>>>>>
>>>>>> la 15. kesäk. 2019 klo 1.10 John Roesler (john@confluent.io)
>>>> kirjoitti:
>>>>>>
>>>>>>> Sounds good. Thanks as always for considering my feedback!
>>>>>>> -John
>>>>>>>
>>>>>>> On Fri, Jun 14, 2019 at 12:12 PM Jukka Karvanen
>>>>>>> <jukka.karvanen@jukinimi.com> wrote:
>>>>>>>>
>>>>>>>> Ok, I will modify KIP Public Interface in a wiki based on
the
>>>>> feedback.
>>>>>>>>
>>>>>>>> TestRecordFactory / ConsumerRecordFactory was used by
>>>> TestInputTopic
>>>>> with
>>>>>>>> the version I had with KIP456, but maybe I can merge That
>>>>> functionality
>>>>>>> to
>>>>>>>> InputTopic or  TestRecordFactory   can kept non public maybe
>>> moving
>>>>> it to
>>>>>>>> internals package.
>>>>>>>>
>>>>>>>> I will make the proposal with a slim down interface.
>>>>>>>> I don't want to go to so slim as you proposed with only
>>> TestRecord
>>>> or
>>>>>>>> List<TestRecord>, because you then still end up doing
helper
>>>> methods
>>>>> to
>>>>>>>> construct List of TestRecord.
>>>>>>>> The list of values is easier to write and clearer to read
than
>> if
>>>> you
>>>>>>> need
>>>>>>>> to contruct list of TestRecords.
>>>>>>>>
>>>>>>>> For example:
>>>>>>>>
>>>>>>>> final List<String> inputValues = Arrays.asList(
>>>>>>>>         "Apache Kafka Streams Example",
>>>>>>>>         "Using Kafka Streams Test Utils",
>>>>>>>>         "Reading and Writing Kafka Topic"
>>>>>>>> );
>>>>>>>> inputTopic.pipeValueList(inputValues);
>>>>>>>>
>>>>>>>>
>>>>>>>> Let's check after the next iteration is it still worth reducing
>>> the
>>>>>>> methods.
>>>>>>>>
>>>>>>>>
>>>>>>>> Jukka
>>>>>>>>
>>>>>>>>
>>>>>>>> pe 14. kesäk. 2019 klo 18.27 John Roesler (john@confluent.io)
>>>>> kirjoitti:
>>>>>>>>
>>>>>>>>> Thanks, Jukka,
>>>>>>>>>
>>>>>>>>> Ok, I buy this reasoning.
>>>>>>>>>
>>>>>>>>> Just to echo what I think I read, you would just drop
>>>> ClientRecord
>>>>>>>>> from the proposal, and TestRecord would stand on its
own,
>> with
>>>> the
>>>>>>>>> same methods and properties you proposed, and the "input
>> topic"
>>>>> would
>>>>>>>>> accept TestRecord, and the "output topic" would produce
>>>> TestRecord?
>>>>>>>>> Further, the "input and output topic" classes would
>> internally
>>>>> handle
>>>>>>>>> the conversion to and from ConsumerRecord and ProducerRecord
>> to
>>>>> pass
>>>>>>>>> to and from the TopologyTestDriver?
>>>>>>>>>
>>>>>>>>> This seems good to me.
>>>>>>>>>
>>>>>>>>> Since the object coming out of the "output topic" is
much
>> more
>>>>>>>>> ergonomic, I suspect we won't need the OutputVerifier
at all.
>>> It
>>>>> was
>>>>>>>>> mostly needed because of all the extra junk in ProducerRecord
>>> you
>>>>> need
>>>>>>>>> to ignore. It seems better to just deprecate it. If in
the
>>> future
>>>>> it
>>>>>>>>> turns out there is some actual use case for a verifier,
we
>> can
>>>>> have a
>>>>>>>>> very small KIP to add one. But reading your response,
I
>> suspect
>>>>> that
>>>>>>>>> existing test verification libraries would be sufficient
on
>>> their
>>>>> own.
>>>>>>>>>
>>>>>>>>> Similarly, it seems like we can shrink the total interface
by
>>>>> removing
>>>>>>>>> the TestRecordFactory from the proposal. If TestRecord
>> already
>>>>> offers
>>>>>>>>> all the constructors you'd want, then the only benefit
of the
>>>>> factory
>>>>>>>>> is to auto-increment the timestamps, but then again,
the
>> "input
>>>>> topic"
>>>>>>>>> can already do that (e.g., it can do it if the record
>> timestamp
>>>> is
>>>>> not
>>>>>>>>> set).
>>>>>>>>>
>>>>>>>>> Likewise, if the TestRecords are easy to create, then
we
>> don't
>>>> need
>>>>>>>>> all the redundant methods in "input topic" to pipe values,
or
>>>>>>>>> key/values, or key/value/timestamp, etc. We can do with
just
>>> two
>>>>>>>>> methods, one for a single TestRecord and one for a collection
>>> of
>>>>> them.
>>>>>>>>> This reduces API ambiguity and also dramatically decreases
>> the
>>>>> surface
>>>>>>>>> area of the interface, which ultimately makes it much
easier
>> to
>>>>> use.
>>>>>>>>>
>>>>>>>>> It's best to start with the smallest interface that will
do
>> the
>>>> job
>>>>>>>>> and expand it upon request, rather than throwing in
>> everything
>>>> you
>>>>> can
>>>>>>>>> think of up front. The extra stuff would be confusing
to
>> people
>>>>> facing
>>>>>>>>> two practically identical paths to accomplish the same
goal,
>>> and
>>>>> it's
>>>>>>>>> very difficult to slim the interface down later, because
we
>>> don't
>>>>>>>>> really know which parts are more popular (i.e., no one
>> submits
>>>>>>>>> "feature requests" to _remove_ stuff they don't need,
only to
>>>> _add_
>>>>>>>>> stuff that they need.
>>>>>>>>>
>>>>>>>>> But overall, I really like the structure of this design.
I'm
>>>> super
>>>>>>>>> excited about this KIP.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -John
>>>>>>>>>
>>>>>>>>> On Fri, Jun 14, 2019 at 2:55 AM Jukka Karvanen
>>>>>>>>> <jukka.karvanen@jukinimi.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am not a fan of swapping only ProducerRecord and
>>>>> ConsumerRecord.
>>>>>>>>>> As a test writer point of view I do not want to care
about
>>> the
>>>>>>> difference
>>>>>>>>>> of those and
>>>>>>>>>> that way I like to have object type which can be
used to
>> pipe
>>>>>>> records in
>>>>>>>>>> and compare outputs.
>>>>>>>>>> That way avoid unnecessary conversions between
>> ProducerRecord
>>>> and
>>>>>>>>>> ConsumerRecord.
>>>>>>>>>>
>>>>>>>>>> My initial assumption was that ProducerRecord and
>>>>>>> ConsumerRecord.could
>>>>>>>>>> implement the same ClientRecord
>>>>>>>>>> and that way test writer could have used either of
those,
>> but
>>>>> seems
>>>>>>> that
>>>>>>>>>> return type of timestamp method long vs Long is not
>>> compatible.
>>>>>>>>>> Now the main advantage of ClientRecord is no need
to
>>> duplicate
>>>>>>>>>> OutputVerifier when it is modified from ProducerRecord
to
>>>>>>> ClientRecord.
>>>>>>>>>> Generally is there need for OutputVerifier. Can we
>> deprecate
>>>> the
>>>>>>> existing
>>>>>>>>>> and use standard assertion libraries for new test.
>>>>>>>>>>
>>>>>>>>>> If you use hamcrest, assert-j or any other assertion
>> library
>>>>> for the
>>>>>>>>> rest
>>>>>>>>>> of the test, why not use it with these also.
>>>>>>>>>> When we have these methods to access only needed
fields it
>> is
>>>>> easier
>>>>>>> to
>>>>>>>>>> write test like this:
>>>>>>>>>> assertThat(outputTopic.readValue()).isEqualTo("Hello");
>>>>>>>>>>
>>>>>>>>>> or
>>>>>>>>>>
>>> assertThat(outputTopic.readRecord()).isEqualTo(expectedRecord);
>>>>>>>>>>
>>>>>>>>>> Only value for new OutputVerifier is when needing
to ignore
>>>> some
>>>>>>> fields
>>>>>>>>>> ClientRecord actual = outputTopic.readRecord();
>>>>>>>>>> assertThat(actual.value()).isEqualTo("Hello");
>>>>>>>>>> assertThat(actual.key()).isEqualTo(expectedKey);
>>>>>>>>>>
>> assertThat(actual.timestamp()).isEqualTo(expectedTimestamp);
>>>>>>>>>>
>>>>>>>>>> So if want to leave client package untouched, I would
>> modify
>>>> the
>>>>>>> methods
>>>>>>>>>> with ClientRecord now in InputTopic and OutputTopic
to pass
>>> in
>>>>> and
>>>>>>> out
>>>>>>>>> this
>>>>>>>>>> TestRecord instead.
>>>>>>>>>> In that case there would be possibility to add methods
to
>>>>> TestRecord
>>>>>>> to
>>>>>>>>>> help ignore some fields in assertions like:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>> assertThat(outputTopic.readRecord().getValueTimestamp()).isEqualTo(expectedRecord.get
>>>>>>>>>> ValueTimestamp());
>>>>>>>>>>
>>>>>>>>>> How about this alternative?
>>>>>>>>>> If this way sounds better I will modify KIP page
in wiki.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Jukka
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> to 13. kesäk. 2019 klo 18.30 John Roesler (
>> john@confluent.io
>>> )
>>>>>>> kirjoitti:
>>>>>>>>>>
>>>>>>>>>>> Hey, all, maybe we can jump-start this discussion.
>>>>>>>>>>>
>>>>>>>>>>> I think this approach would be very ergonomic
for
>> testing,
>>>> and
>>>>>>> would
>>>>>>>>>>> help reduce boilerplate in tests.
>>>>>>>>>>>
>>>>>>>>>>> The think I like most about it is that it mirrors
the
>>> mental
>>>>> model
>>>>>>>>>>> that people already have from Kafka Streams,
in which you
>>>>> write to
>>>>>>> an
>>>>>>>>>>> "input topic" and then get your results from
an "output
>>>>> topic". As
>>>>>>> a
>>>>>>>>>>> side benefit, the input and output topics in
the proposal
>>>> also
>>>>>>> close
>>>>>>>>>>> over the serdes, which makes it much less boilerplate
for
>>>> test
>>>>>>> code.
>>>>>>>>>>>
>>>>>>>>>>> If I can offer one suggestion for change, I'm
not sure
>> I'm
>>>>> totally
>>>>>>>>>>> sold on the need for a new abstraction "ClientRecord"
>> with
>>> an
>>>>>>>>>>> implementation for tests "TestRecord". It seems
like this
>>>>>>>>>>> unnecessarily complicates the main (non-testing)
data
>>> model.
>>>> It
>>>>>>> seems
>>>>>>>>>>> like it would be sufficient, and just as ergonomic,
to
>> have
>>>> the
>>>>>>> input
>>>>>>>>>>> topic accept ProducerRecords and the output topic
return
>>>>>>>>>>> ConsumerRecords. I'm open to discussion on this
point,
>>>> though.
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the proposal, Jukka!
>>>>>>>>>>> -John
>>>>>>>>>>>
>>>>>>>>>>> On Mon, May 20, 2019 at 7:59 AM Jukka Karvanen
>>>>>>>>>>> <jukka.karvanen@jukinimi.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>
>>>>>>>>>>>> I would like to start the discussion on KIP-470:
>>>>>>> TopologyTestDriver
>>>>>>>>> test
>>>>>>>>>>>> input and output usability improvements:
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-470%3A+TopologyTestDriver+test+input+and+output+usability+improvements
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This KIP is inspired by the Discussion in
KIP-456:
>> Helper
>>>>>>> classes to
>>>>>>>>> make
>>>>>>>>>>>> it simpler to write test logic with TopologyTestDriver:
>>>>>>>>>>>>
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-456
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>> %3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The proposal in KIP-456
>>>>>>>>>>>> <
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-456%3A+Helper+classes+to+make+it+simpler+to+write+test+logic+with+TopologyTestDriver
>>>>>>>>>>>>
>>>>>>>>>>>> was
>>>>>>>>>>>> to add alternate way to input and output
topic, but
>> this
>>>> KIP
>>>>>>> enhance
>>>>>>>>>>> those
>>>>>>>>>>>> classes and deprecate old functionality to
make clear
>>>>> interface
>>>>>>> for
>>>>>>>>> test
>>>>>>>>>>>> writer to use.
>>>>>>>>>>>>
>>>>>>>>>>>> In current KIP-470 proposal, topic objects
are created
>>> with
>>>>>>>>> topicName and
>>>>>>>>>>>> related serders.
>>>>>>>>>>>>     public final <K, V> TestInputTopic<K,
V>
>>>>>>> createInputTopic(final
>>>>>>>>>>> String
>>>>>>>>>>>> topicName, final Serde<K> keySerde,
final Serde<V>
>>>>> valueSerde);
>>>>>>>>>>>>     public final <K, V> TestOutputTopic<K,
V>
>>>>>>> createOutputTopic(final
>>>>>>>>>>> String
>>>>>>>>>>>> topicName, final Serde<K> keySerde,
final Serde<V>
>>>>> valueSerde);
>>>>>>>>>>>> One thing I wondered if there way to find
out topic
>>> related
>>>>> serde
>>>>>>>>> from
>>>>>>>>>>>> TopologyTestDriver topology, it would simply
creation
>> of
>>>>> these
>>>>>>> Topic
>>>>>>>>>>>> objects:
>>>>>>>>>>>>     public final <K, V> TestInputTopic<K,
V>
>>>>>>> createInputTopic(final
>>>>>>>>>>> String
>>>>>>>>>>>> topicName);
>>>>>>>>>>>>     public final <K, V> TestOutputTopic<K,
V>
>>>>>>> createOutputTopic(final
>>>>>>>>>>> String
>>>>>>>>>>>> topicName);
>>>>>>>>>>>>
>>>>>>>>>>>> KIP-456 can be discarded if this KIP get
accepted.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>> Jukka
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> -- Guozhang
>>>>
>>>
>>
>>
>> --
>> -- Guozhang
>>
> 


Mime
View raw message