flink-issues mailing list archives

From xccui <...@git.apache.org>
Subject [GitHub] flink pull request #4625: [FLINK-6233] [table] Support time-bounded stream i...
Date Tue, 19 Sep 2017 02:36:49 GMT
Github user xccui commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4625#discussion_r139585456
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeBoundedStreamInnerJoin.scala ---
    @@ -0,0 +1,74 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.table.runtime.join
    +
    +import org.apache.flink.api.common.typeinfo.TypeInformation
    +import org.apache.flink.streaming.api.functions.co.CoProcessFunction
    +import org.apache.flink.table.runtime.types.CRow
    +import org.apache.flink.types.Row
    +
    +/**
    +  * The function to execute processing time bounded stream inner-join.
    +  */
    +class ProcTimeBoundedStreamInnerJoin(
    +    leftLowerBound: Long,
    +    leftUpperBound: Long,
    +    allowedLateness: Long,
    +    leftType: TypeInformation[Row],
    +    rightType: TypeInformation[Row],
    +    genJoinFuncName: String,
    +    genJoinFuncCode: String)
    +    extends TimeBoundedStreamInnerJoin(
    +      leftLowerBound,
    +      leftUpperBound,
    +      allowedLateness,
    +      leftType,
    +      rightType,
    +      genJoinFuncName,
    +      genJoinFuncCode,
    +      leftTimeIdx = -1,
    +      rightTimeIdx = -1,
    +      JoinTimeIndicator.PROCTIME) {
    +
    +  override def checkRowOutOfDate(timeForRow: Long, watermark: Long) = false
    +
    +  override def updateOperatorTime(ctx: CoProcessFunction[CRow, CRow, CRow]#Context): Unit = {
    +    rightOperatorTime = ctx.timerService().currentProcessingTime()
    +    leftOperatorTime = ctx.timerService().currentProcessingTime()
    +  }
    +
    +  override def getTimeForLeftStream(
    +      context: CoProcessFunction[CRow, CRow, CRow]#Context,
    +      row: CRow): Long = {
    +    context.timerService().currentProcessingTime()
    --- End diff --
    
    Yes, you are right. To keep the two timestamps identical, we should return `leftOperatorTime` here. However, this couples `updateOperatorTime` and `getTimeForLeftStream`, i.e., `updateOperatorTime` must be invoked before `getTimeForLeftStream`. Is that coupling acceptable?
    
    I also have an idea about the processing time. How about temporarily caching the machine time for the same `StreamRecord`, instead of invoking `System.currentTimeMillis()` each time?
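
To make the caching idea concrete, here is a minimal sketch. It assumes a hypothetical helper, `PerRecordClock`, that is not part of Flink or of this PR: the clock is read once per record in `refresh()`, and every later lookup for that record returns the cached value. The time source is passed in as a function so the sketch runs without any Flink dependency; in the operator it would presumably wrap `ctx.timerService().currentProcessingTime()`.

```scala
// Hypothetical helper for illustration only; not part of Flink or this PR.
// It reads the clock once per record and serves the cached value afterwards.
final class PerRecordClock(now: () => Long) {
  private var cached: Long = 0L
  private var initialized: Boolean = false

  /** Reads the underlying clock once. Call this when a new record arrives. */
  def refresh(): Long = {
    cached = now()
    initialized = true
    cached
  }

  /** Returns the time captured by the last refresh() without reading the clock. */
  def current: Long = {
    require(initialized, "refresh() must be called before current")
    cached
  }
}

object PerRecordClockDemo {
  def main(args: Array[String]): Unit = {
    // Assumed time source; the operator would use the timer service instead.
    val clock = new PerRecordClock(() => System.currentTimeMillis())

    clock.refresh()                  // role of updateOperatorTime
    val operatorTime = clock.current // role of leftOperatorTime / rightOperatorTime
    val rowTime = clock.current      // role of getTimeForLeftStream

    // Both reads see the same timestamp because the clock was read only once.
    assert(operatorTime == rowTime)
  }
}
```

The ordering constraint mentioned above does not go away: `refresh()` still has to run before `current`. The `require` check only makes a violation fail loudly instead of silently returning a stale or default time.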


---
