kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias J. Sax" <matth...@confluent.io>
Subject Re: [DISCUSS] KIP-472: Add header to RecordContext
Date Sat, 01 Jun 2019 00:36:09 GMT
Sure thing :)

For the original issue to preserve partition-time within a task across
restarts, we don't need this KIP.

And I agree that downstream time propagation via heartbeats is nothing
critical atm but may require a big change. Hence, maybe not worth the
effort right now.

If you are fine with it, we can discard this KIP for now. Maybe we can
revisit this idea at a later point when we really need it.

Thanks a lot!


-Matthias

On 5/31/19 4:19 PM, Richard Yu wrote:
> Hi Matthias,
> 
> Thanks for responding. :)
> 
> I suppose from the scope of the change that is needed to fix the timestamp
> propagation bug is too complex (in other words, probably not worth it).
> So should this KIP be closed since it might be a little excessive?
> 
> There probably is no need for too big of a change like this one.
> 
> On Fri, May 31, 2019 at 3:17 PM Matthias J. Sax <matthias@confluent.io>
> wrote:
> 
>> Thanks for the KIP.
>>
>> However, I have some doubts that this would work.
>>
>> In the example, you mention 4 records
>>
>>> r1(2), r2(3), r3(7), and r4(9)
>>
>> and that if r3 and r4 would be filtered, the downstream task would not
>> advance partition time from 3 to 9.
>>
>> However, if r3 and r4 are filtered, no record will be sent downstream at
>> all -- hence, there is no record to which partition-time could be
>> piggy-bagged onto its header.
>>
>> When we sent r2, we don't know anything about r3 and r4 yet. And if
>> there is an r5 that is not filtered, r5 would advance partition time
>> anyway based on its own timestamp, hence no header is required.
>>
>> Also, I am not a big fan of adding headers in general, as they "leak"
>> internal implementation details.
>>
>>
>> Overall, my personal opinion is, that we should change Kafka's message
>> format and allow for "heartbeat" messages, that don't carry any data,
>> but only a timestamp. By default, those messages would not be exposed to
>> an application but would be considered "internal" similar to transaction
>> markers. However, changing the message format is a mayor change and
>> hence, I am not sure if it worth doing at all atm.
>>
>>
>> -Matthias
>>
>>
>>
>> On 5/20/19 7:20 PM, Richard Yu wrote:
>>> Hello,
>>>
>>> I wish to introduce a minor addition present in RecordContext (a public
>>> facing API). This addition works to both provide the user with
>>> more information regarding the processing state of the partition, but
>> also
>>> help resolve a bug which Kafka is currently experiencing.
>>> Here is the KIP Link:
>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-472%3A+%5BSTREAMS%5D+Add+partition+time+field+to+RecordContext
>>>
>>> Cheers,
>>> Richard Yu
>>>
>>
>>
> 


Mime
View raw message