kafka-jira mailing list archives

From "Evgeny Veretennikov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4468) Correctly calculate the window end timestamp after read from state stores
Date Thu, 06 Jul 2017 10:04:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076281#comment-16076281 ]

Evgeny Veretennikov commented on KAFKA-4468:

I looked into windowed stores a bit more and noticed that {{WindowedDeserializer}} isn't
used anywhere in the Kafka source code, including internals:

$ grep -Rl WindowedDeserializer | grep java | grep -v test

It isn't part of the public API either (it's in the {{internals}} package). {{RocksDBWindowStore}}
doesn't use {{WindowedDeserializer}} to deserialize windowed keys; it uses the static methods of
{{WindowStoreUtils}} instead, including {{timeWindowForSize()}}, which already uses {{windowSize}}
to calculate the proper {{TimeWindow}}.

So, shouldn't we just remove {{WindowedDeserializer}}?

> Correctly calculate the window end timestamp after read from state stores
> -------------------------------------------------------------------------
>                 Key: KAFKA-4468
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4468
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Guozhang Wang
>              Labels: architecture
> When storing the WindowedStore on the persistent KV store, we only use the start timestamp
> of the window as part of the combo-key, as (start-timestamp, key). The reason we do not
> add the end timestamp as well is that we can always calculate it as start timestamp
> + window_length, and hence we can save 8 bytes per key on the persistent KV store.
> However, after reading it back (via {{WindowedDeserializer}}) we do not set its end timestamp
> correctly but just read it as an {{UnlimitedWindow}}. We should fix this by calculating its
> end timestamp as mentioned above.
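
For illustration only, the idea in the description (store just the start timestamp with the key, and recompute the end as start + window length on read) could be sketched roughly as below. This is not Kafka's code: the class {{WindowKeySketch}}, its method names, and the byte layout (8-byte start timestamp followed by UTF-8 key bytes) are all assumptions made for the example, and the real store may lay out its keys differently.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names and byte layout are assumptions, not Kafka's actual format.
public class WindowKeySketch {

    // Build a combo-key as (start-timestamp, key): 8 bytes of window start, then the key bytes.
    public static byte[] toComboKey(long windowStartMs, String key) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Long.BYTES + keyBytes.length)
                .putLong(windowStartMs)
                .put(keyBytes)
                .array();
    }

    // Read the window start back from the first 8 bytes of the combo-key.
    public static long windowStart(byte[] comboKey) {
        return ByteBuffer.wrap(comboKey).getLong();
    }

    // Read the record key back from the bytes that follow the timestamp.
    public static String key(byte[] comboKey) {
        return new String(comboKey, Long.BYTES, comboKey.length - Long.BYTES,
                StandardCharsets.UTF_8);
    }

    // The end timestamp is never stored; it is recomputed from start + window size.
    public static long windowEnd(byte[] comboKey, long windowSizeMs) {
        return windowStart(comboKey) + windowSizeMs;
    }
}
```

The point of the sketch is that the reader needs to know the window size at deserialization time; without it, the end timestamp cannot be recovered from the stored key alone.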
