flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-9524) NPE from ProcTimeBoundedRangeOver.scala
Date Tue, 05 Jun 2018 10:06:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501555#comment-16501555 ]

Fabian Hueske commented on FLINK-9524:

Hi [~yzandrew], thanks for digging into this issue!

It would be great if you would work on it.

One thing that I'd like to understand first is why the state was cleared and returns {{null}}.

I had a look at the code and did not spot a case that would cause an early state cleanup.
So before we add a {{null}}-check I'd like to understand why there's a {{null}} since this
might indicate a bigger problem and catching the {{null}} might just hide it.

Let me have a look at the operator again.

> NPE from ProcTimeBoundedRangeOver.scala
> ---------------------------------------
>                 Key: FLINK-9524
>                 URL: https://issues.apache.org/jira/browse/FLINK-9524
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>    Affects Versions: 1.5.0
>            Reporter: yan zhou
>            Priority: Major
>         Attachments: npe_from_ProcTimeBoundedRangeOver.txt
> The class _ProcTimeBoundedRangeOver_ throws an NPE if _minRetentionTime_ and _maxRetentionTime_
are set to values greater than 1.
> Please see [^npe_from_ProcTimeBoundedRangeOver.txt] for the details of the exception. Below
is a short description of the cause:
>  * When the first event for a key arrives, the cleanup time is registered with _timerService_
and recorded in _cleanupTimeState_. If a second event with the same key arrives before the cleanup
time, the value in _cleanupTimeState_ is updated and a new timer is registered with _timerService_.
Now there are two registered cleanup timers: one for the first event and one for the second
event.
>  * However, when _onTimer_ fires for the first cleanup timer, the value in _cleanupTimeState_
has already been updated to the second cleanup time. The timer therefore does not satisfy the
_needToCleanupState_ check and instead runs through the remaining code of _onTimer_ (which is
intended to update the accumulator and emit output), which causes the NPE.
> _RowTimeBoundedRangeOver_ has very similar logic to _ProcTimeBoundedRangeOver_, but it
does not throw an NPE for the same reason: to avoid the exception, it simply adds a null check
before running the logic that updates the accumulator.
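
The timer sequence described in the issue can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions, not Flink's actual operator code: the class `TimerRaceSketch`, its fields, the `nullCheck` flag, and the timestamps are hypothetical, and keyed state and the timer service are replaced by plain Java collections. Only the names `rowMapState`, `cleanupTimeState`, `onTimer`, and the idea of the `needToCleanupState` check are taken from the discussion above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Simplified, hypothetical model of the operator for a single key.
// It is NOT Flink code; state and timers are plain collections.
class TimerRaceSketch {
    final Map<Long, List<String>> rowMapState = new HashMap<>(); // rows buffered per arrival time
    Long cleanupTimeState = null;                                // latest recorded cleanup time
    final TreeSet<Long> timers = new TreeSet<>();                // registered timers, fired in order
    final List<String> emitted = new ArrayList<>();
    final long retention;     // stands in for the configured retention time (assumption)
    final boolean nullCheck;  // toggles the guard discussed in the issue

    TimerRaceSketch(long retention, boolean nullCheck) {
        this.retention = retention;
        this.nullCheck = nullCheck;
    }

    void processElement(long now, String row) {
        // Each element registers a new cleanup timer and overwrites the recorded cleanup time.
        long cleanupTime = now + retention;
        cleanupTimeState = cleanupTime;
        timers.add(cleanupTime);
        // Buffer the row and register the timer that emits it one millisecond later.
        rowMapState.computeIfAbsent(now, t -> new ArrayList<>()).add(row);
        timers.add(now + 1);
    }

    void onTimer(long timestamp) {
        // Equivalent of the needToCleanupState check: only the latest cleanup time matches.
        if (cleanupTimeState != null && timestamp == cleanupTimeState) {
            rowMapState.clear();
            cleanupTimeState = null;
            return;
        }
        // A stale cleanup timer ends up here, and no rows were buffered at timestamp - 1.
        List<String> rows = rowMapState.get(timestamp - 1);
        if (nullCheck && rows == null) {
            return; // guard: nothing to emit for this timer
        }
        for (String row : rows) { // throws NullPointerException when rows is null
            emitted.add(row);
        }
    }

    void fireAllTimers() {
        while (!timers.isEmpty()) {
            onTimer(timers.pollFirst());
        }
    }

    // Runs the two-event scenario and reports whether firing the timers threw an NPE.
    static boolean throwsNpe(boolean nullCheck) {
        TimerRaceSketch op = new TimerRaceSketch(100L, nullCheck);
        op.processElement(1000L, "a"); // cleanup timer at 1100
        op.processElement(1050L, "b"); // cleanup timer at 1150 replaces 1100 in cleanupTimeState
        try {
            op.fireAllTimers();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("without null check, NPE: " + throwsNpe(false));
        System.out.println("with null check, NPE: " + throwsNpe(true));
    }
}
```

In this model the stale timer at 1100 is the one that reaches the emit path with no buffered rows, so the run without the guard reports an NPE and the run with the guard does not. The guard only silences the symptom in the sketch; whether a null check is the right fix for the real operator is exactly the question raised in the comment above.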

This message was sent by Atlassian JIRA
