flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-9524) NPE from ProcTimeBoundedRangeOver.scala
Date Tue, 05 Jun 2018 10:19:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501564#comment-16501564 ]

Fabian Hueske commented on FLINK-9524:

Alright, I've found the reason for the {{null}} value. The problem is caused by clean-up timers that
expired, i.e., timers that were registered as clean-up timers but for which {{needToCleanupState(timestamp)}}
returns {{false}}.

Since we can distinguish different types of timers (regular processing & clean-up), we
should add the {{null}} check that you propose.
Please also add a comment explaining why this check is necessary.
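
A minimal sketch of the proposed check, written as a toy model in Python rather than as Flink code. All names ({{on_timer}}, {{rows_by_time}}, the return values) are hypothetical stand-ins, and the assumption that a regular timer fires 1 ms after the rows it processes is only illustrative.

```python
# Toy model of the onTimer logic discussed above; this is not Flink code.
# Every name here is a hypothetical stand-in for the identifiers in the issue.

def need_to_cleanup_state(cleanup_time_state, timestamp):
    """True only if `timestamp` equals the most recently recorded clean-up time."""
    return cleanup_time_state is not None and timestamp == cleanup_time_state


def on_timer(timestamp, cleanup_time_state, rows_by_time):
    """Return 'cleanup', 'skip', or the list of rows that would be emitted."""
    if need_to_cleanup_state(cleanup_time_state, timestamp):
        rows_by_time.clear()
        return "cleanup"

    # Assumption: regular timers fire 1 ms after the rows they process.
    rows = rows_by_time.get(timestamp - 1)

    # A clean-up timer whose clean-up time was overwritten later cannot be told
    # apart from a regular timer at this point, but no rows were stored for it.
    # Without this check, iterating over `rows` below would fail on None.
    if rows is None:
        return "skip"

    return [row for row in rows]


rows = {1000: ["a"]}
print(on_timer(1001, 9000, rows))  # regular timer: emits the stored rows
print(on_timer(5000, 9000, rows))  # outdated clean-up timer: skipped
print(on_timer(9000, 9000, rows))  # current clean-up timer: state is cleared
```

The comment above the check records why it is needed, which is what the request for a comment is asking for.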

Thank you.

> NPE from ProcTimeBoundedRangeOver.scala
> ---------------------------------------
>                 Key: FLINK-9524
>                 URL: https://issues.apache.org/jira/browse/FLINK-9524
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>    Affects Versions: 1.5.0
>            Reporter: yan zhou
>            Priority: Major
>         Attachments: npe_from_ProcTimeBoundedRangeOver.txt
> The class _ProcTimeBoundedRangeOver_ throws an NPE if _minRetentionTime_ and _maxRetentionTime_
are set to values greater than 1.
> Please see [^npe_from_ProcTimeBoundedRangeOver.txt] for the details of the exception. Below
is a short description of the cause:
>  * When the first event for a key arrives, the cleanup time is registered with _timerService_
and recorded in _cleanupTimeState_. If a second event with the same key arrives before the cleanup
time, the value in _cleanupTimeState_ is updated and a new timer is registered with _timerService_.
So now we have two registered timers for cleanup: one registered because of the first event,
the other because of the second event.
>  * However, when the _onTimer_ method fires for the first cleanup timer, the _cleanupTimeState_
value has already been updated to the second cleanup time. So the _needToCleanupState_ check
returns false, and the timer runs through the remaining code of _onTimer_ (which is intended to
update the accumulator and emit output) and causes the NPE.
> _RowTimeBoundedRangeOver_ has very similar logic to _ProcTimeBoundedRangeOver_, but it
does not throw an NPE in the same situation. To avoid the exception, it simply adds a null check
before running the logic that updates the accumulator.
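
The registration sequence in the first bullet of the description can be modelled roughly as follows. This is a toy model in Python, not Flink code: the class, the retention values, and the rule for when the cleanup time is pushed out are assumptions chosen only to reproduce the two-timer situation.

```python
# Toy model of the cleanup timer registration described in the issue.
# Names, retention values, and the update rule are assumptions, not Flink code.

MIN_RETENTION_MS = 2_000  # assumed value, stands in for minRetentionTime
MAX_RETENTION_MS = 5_000  # assumed value, stands in for maxRetentionTime


class KeyState:
    def __init__(self):
        self.cleanup_time = None     # plays the role of cleanupTimeState
        self.registered_timers = []  # timers handed to the timer service

    def register_cleanup_timer(self, now_ms):
        # Assumed rule: push the cleanup time out when the current one would
        # fire before the minimum retention period has passed.
        if self.cleanup_time is None or now_ms + MIN_RETENTION_MS > self.cleanup_time:
            self.cleanup_time = now_ms + MAX_RETENTION_MS
            # The earlier timer is not removed, so it will still fire.
            self.registered_timers.append(self.cleanup_time)


state = KeyState()
state.register_cleanup_timer(now_ms=0)      # first event for the key
state.register_cleanup_timer(now_ms=4_000)  # second event, before the first cleanup time

print(state.registered_timers)  # both cleanup timers are still pending
print(state.cleanup_time)       # only the second cleanup time is recorded
```

When the first of the two pending timers fires, its timestamp no longer matches the recorded cleanup time, which is the situation the second bullet describes.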

This message was sent by Atlassian JIRA
