flink-dev mailing list archives

From Stefano Bortoli <stefano.bort...@huawei.com>
Subject flink-table sql overlaps time.scala
Date Tue, 30 May 2017 12:24:02 GMT
Hi all,

I am playing around with the Table API, and I have a question about the temporal OVERLAPS operator.
In particular, the test scalarFunctionsTest.testOverlaps expects the following pair of intervals
to evaluate to false:
testAllApis(
      temporalOverlaps("2011-03-10 05:02:02".toTimestamp, 0.second,
        "2011-03-10 05:02:02".toTimestamp, "2011-03-10 05:02:01".toTimestamp),
      "temporalOverlaps(toTimestamp('2011-03-10 05:02:02'), 0.second, " +
        "'2011-03-10 05:02:02'.toTimestamp, '2011-03-10 05:02:01'.toTimestamp)",
      "(TIMESTAMP '2011-03-10 05:02:02', INTERVAL '0' SECOND) OVERLAPS " +
        "(TIMESTAMP '2011-03-10 05:02:02', TIMESTAMP '2011-03-10 05:02:01')",
      "false")

Basically, the two intervals share only a single endpoint. The time.scala implementation
translates the expression to:
AND(
  >=(DATETIME_PLUS(CAST('2011-03-10 05:02:02'):TIMESTAMP(3) NOT NULL, 0),
     CAST('2011-03-10 05:02:02'):TIMESTAMP(3) NOT NULL),
  >=(CAST('2011-03-10 05:02:01'):TIMESTAMP(3) NOT NULL,
     CAST('2011-03-10 05:02:02'):TIMESTAMP(3) NOT NULL)
),

Here the result is false, because the second clause is not satisfied.
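
To make the evaluation concrete, the two comparisons in the plan above can be replayed outside Flink. This is only a sketch under assumptions: it uses java.time.LocalDateTime in place of TIMESTAMP(3), and the names FlinkPlanSketch, ge and overlapsAsPlanned are made up for illustration. It mirrors the printed expression, not the actual code in time.scala.

```scala
import java.time.LocalDateTime

// Hypothetical helper, not Flink code: replays the two >= comparisons
// from the printed plan on plain LocalDateTime values.
object FlinkPlanSketch {
  // a >= b for timestamps
  def ge(a: LocalDateTime, b: LocalDateTime): Boolean = !a.isBefore(b)

  // AND(>=(leftStart + interval, rightStart), >=(rightEnd, leftStart))
  def overlapsAsPlanned(
      leftStart: LocalDateTime,
      leftSeconds: Long,
      rightStart: LocalDateTime,
      rightEnd: LocalDateTime): Boolean = {
    // Corresponds to DATETIME_PLUS(leftStart, 0) in the plan
    val leftEnd = leftStart.plusSeconds(leftSeconds)
    ge(leftEnd, rightStart) && ge(rightEnd, leftStart)
  }

  val t01: LocalDateTime = LocalDateTime.parse("2011-03-10T05:02:01")
  val t02: LocalDateTime = LocalDateTime.parse("2011-03-10T05:02:02")

  // Values from the test: the first comparison holds (t02 >= t02),
  // the second one (t01 >= t02) does not.
  val result: Boolean = overlapsAsPlanned(t02, 0L, t02, t01)
}
```

With the test's values, the first comparison passes on equality and only the second one rejects the pair.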

However, the latest Calcite master compiles OVERLAPS as follows:
[AND(
  >=(
    CASE(<=(2011-03-10 05:02:02, DATETIME_PLUS(2011-03-10 05:02:02, 0)),
         DATETIME_PLUS(2011-03-10 05:02:02, 0),
         2011-03-10 05:02:02),
    CASE(<=(2011-03-10 05:02:02, 2011-03-10 05:02:01),
         2011-03-10 05:02:02,
         2011-03-10 05:02:01)
  ),
  >=(
    CASE(<=(2011-03-10 05:02:02, 2011-03-10 05:02:01),
         2011-03-10 05:02:01,
         2011-03-10 05:02:02),
    CASE(<=(2011-03-10 05:02:02, DATETIME_PLUS(2011-03-10 05:02:02, 0)),
         2011-03-10 05:02:02,
         DATETIME_PLUS(2011-03-10 05:02:02, 0))
  )
)]

Here the result is true.
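
The Calcite expression is easier to follow when each CASE is read as "pick the earlier (or later) of two timestamps". The sketch below replays it under that reading. It is an illustration only: java.time.LocalDateTime stands in for the SQL timestamps, and CalcitePlanSketch, earlier, later, ge and overlapsAsPlanned are hypothetical names, not Calcite API.

```scala
import java.time.LocalDateTime

// Hypothetical helper, not Calcite code: replays the CASE / >= structure
// of the printed expression on plain LocalDateTime values.
object CalcitePlanSketch {
  // a >= b for timestamps
  def ge(a: LocalDateTime, b: LocalDateTime): Boolean = !a.isBefore(b)

  // CASE(<=(a, b), b, a): the later of the two timestamps
  def later(a: LocalDateTime, b: LocalDateTime): LocalDateTime =
    if (!a.isAfter(b)) b else a

  // CASE(<=(a, b), a, b): the earlier of the two timestamps
  def earlier(a: LocalDateTime, b: LocalDateTime): LocalDateTime =
    if (!a.isAfter(b)) a else b

  // AND(>=(later(s0, e0), earlier(s1, e1)), >=(later(s1, e1), earlier(s0, e0)))
  def overlapsAsPlanned(
      s0: LocalDateTime,
      e0: LocalDateTime,
      s1: LocalDateTime,
      e1: LocalDateTime): Boolean =
    ge(later(s0, e0), earlier(s1, e1)) && ge(later(s1, e1), earlier(s0, e0))

  val t01: LocalDateTime = LocalDateTime.parse("2011-03-10T05:02:01")
  val t02: LocalDateTime = LocalDateTime.parse("2011-03-10T05:02:02")

  // Values from the test; t02.plusSeconds(0) plays the role of DATETIME_PLUS(t02, 0).
  val result: Boolean = overlapsAsPlanned(t02, t02.plusSeconds(0), t02, t01)
}
```

Read this way, the second pair (05:02:02, 05:02:01) is put in order before comparing, so it is treated as the interval from 05:02:01 to 05:02:02, and both comparisons hold.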

I believe the difference comes down to whether the endpoints count as part of the
intervals. Flink does not treat these intervals as overlapping (as the test shows), whereas
Calcite's implementation does.

Which one should be preserved?

I think the Calcite implementation is correct, and intervals that share an endpoint should be
considered overlapping. What do you think?

Best,
Stefano
