flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7388) ProcessFunction.onTimer() sets processing time as timestamp
Date Mon, 09 Oct 2017 15:33:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197158#comment-16197158 ]

ASF GitHub Bot commented on FLINK-7388:
---------------------------------------

Github user aljoscha commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4786#discussion_r143502884
  
    --- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/KeyedProcessOperator.java ---
    @@ -79,7 +79,6 @@ public void onEventTime(InternalTimer<K, VoidNamespace> timer) throws Exception
     
     	@Override
     	public void onProcessingTime(InternalTimer<K, VoidNamespace> timer) throws Exception {
    -		collector.setAbsoluteTimestamp(timer.getTimestamp());
    --- End diff --
    
    This should call `eraseTimestamp()` because we might still have a timestamp set from processing some previous elements. Same for the other occurrences in the code.
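
    To illustrate the point above, here is a minimal, self-contained sketch of why deleting the `setAbsoluteTimestamp(...)` call alone may not be enough. `SketchCollector` is a hypothetical stand-in written for this example; it is not Flink's collector class and only mirrors the two method names mentioned in this thread. The assumption is that the collector keeps whatever timestamp was last set until something clears it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;

public class TimestampSketch {

    /** Hypothetical stand-in for a timestamp-stamping collector; not Flink's class. */
    static final class SketchCollector {
        private OptionalLong timestamp = OptionalLong.empty();
        // Records the timestamp attached to each emitted record.
        final List<OptionalLong> emitted = new ArrayList<>();

        void setAbsoluteTimestamp(long ts) { timestamp = OptionalLong.of(ts); }

        void eraseTimestamp() { timestamp = OptionalLong.empty(); }

        void collect() { emitted.add(timestamp); }
    }

    /** Processes one element, then fires a processing-time timer on the same collector. */
    static List<OptionalLong> run(boolean eraseOnTimer) {
        SketchCollector collector = new SketchCollector();

        // Element processing: the collector is stamped with the element's event time.
        collector.setAbsoluteTimestamp(1000L);
        collector.collect();

        // Processing-time timer: without an explicit erase, the old stamp is still set.
        if (eraseOnTimer) {
            collector.eraseTimestamp();
        }
        collector.collect();

        return collector.emitted;
    }

    public static void main(String[] args) {
        System.out.println("line removed only:   " + run(false));
        System.out.println("with eraseTimestamp: " + run(true));
    }
}
```

    In this sketch, the record emitted from the timer callback carries the previous element's timestamp unless the erase call is made first, which is the stale-timestamp case the comment describes.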


> ProcessFunction.onTimer() sets processing time as timestamp
> -----------------------------------------------------------
>
>                 Key: FLINK-7388
>                 URL: https://issues.apache.org/jira/browse/FLINK-7388
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataStream API
>    Affects Versions: 1.4.0, 1.3.2
>            Reporter: Fabian Hueske
>            Assignee: Bowen Li
>             Fix For: 1.4.0
>
>
> The {{ProcessFunction.onTimer()}} method sets the current processing time as the event-time timestamp when it is called from a processing-time timer.
> I don't think this behavior is useful. Processing-time timestamps won't be aligned with watermarks and are not deterministic. The only reason would be to have _some_ value in the timestamp field. However, the behavior is very subtle and might not be noticed by users.
> IMO, it would be better to erase the timestamp. This will cause downstream operators that rely on timestamps to fail and notify the users that the logic they implemented was probably not what they intended to do.
> What do you think [~aljoscha]?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
