flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats
Date Fri, 30 Oct 2015 19:14:27 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983109#comment-14983109 ]

ASF GitHub Bot commented on FLINK-2828:
---------------------------------------

Github user twalthr commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1237#discussion_r43541584
  
    --- Diff: flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/expressions/analysis/FieldBacktracker.scala ---
    @@ -0,0 +1,75 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.api.table.expressions.analysis
    +
    +import org.apache.flink.api.table.expressions.{Naming, ResolvedFieldReference}
    +import org.apache.flink.api.table.input.{AdaptiveTableSource, TableSource}
    +import org.apache.flink.api.table.plan._
    +
    +object FieldBacktracker {
    +
    +  /**
    +   * Tracks a field back to its Root and returns its original name and AdaptiveTableSource
    +   * if possible.
    +   * This only happens if the field is forwarded unmodified. Renaming operations are reverted.
    +   *
    +   * @param op start operator
    +   * @param fieldName field name at start operator
    +   * @return original field name with corresponding AdaptiveTableSource or null
    +   */
    +  def resolveFieldNameAndTableSource(op: PlanNode, fieldName: String):
    +      (AdaptiveTableSource, String) = {
    +    op match {
    +      case s@Select(input, selection) =>
    +        var resolvedField: (AdaptiveTableSource, String) = null
    +        // only follow unmodified fields
    +        selection.foreach {
    --- End diff --
    
    No, `selection` can have multiple elements. The lines above just make sure that at least one Expression links to the `TableSource`'s field.
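
    The behaviour described in this comment can be sketched roughly as follows. This is a simplified illustration, not the code from the pull request: `Source`, `FieldRef`, `Alias`, `Plus` and `FieldBacktrackerSketch` are hypothetical stand-ins for the real plan and expression classes, and the sketch returns an `Option` where the diff returns `null`.

```scala
// Hypothetical, simplified stand-ins for the expression classes in the diff.
sealed trait Expression
case class FieldRef(name: String) extends Expression                    // forwards a field unmodified
case class Alias(child: Expression, name: String) extends Expression    // "child as name"
case class Plus(left: Expression, right: Expression) extends Expression // modifies its inputs

// Hypothetical, simplified stand-ins for the plan nodes.
sealed trait PlanNode
case class Source(id: String, fields: Set[String]) extends PlanNode
case class Select(input: PlanNode, selection: Seq[Expression]) extends PlanNode

object FieldBacktrackerSketch {

  /** Returns the source and the original field name, or None if the field was modified. */
  def resolve(op: PlanNode, fieldName: String): Option[(Source, String)] = op match {
    case src: Source =>
      if (src.fields.contains(fieldName)) Some((src, fieldName)) else None

    case Select(input, selection) =>
      // The selection may contain many expressions. Only those that forward
      // `fieldName` unmodified (directly or through a rename) are followed;
      // the first one that reaches a source is returned.
      selection.iterator.map {
        case FieldRef(`fieldName`)                  => resolve(input, fieldName)
        case Alias(FieldRef(original), `fieldName`) => resolve(input, original)
        case _                                      => None
      }.collectFirst { case Some(hit) => hit }
  }
}
```

    With a selection equivalent to `a as a4, b, b + c as sum` over a source with fields `a`, `b`, `c`, resolving `a4` in this sketch yields the source together with `a`, resolving `b` yields the source together with `b`, and resolving `sum` yields `None` because that field is computed rather than forwarded.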


> Add interfaces for Table API input formats
> ------------------------------------------
>
>                 Key: FLINK-2828
>                 URL: https://issues.apache.org/jira/browse/FLINK-2828
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>
> In order to support input formats for the Table API, interfaces are necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the plan. Although the output schema stays the same, the TableSource can react to field resolution and/or predicates internally and can return adapted DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment (common super class of all ExecutionEnvironments for DataSets and DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, false=selection):
> //  Set(("a", true), ("c", true))
> {code}
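
In the example above, the filter `b4===7` is reported to tableSource1 as `b===7`, i.e. the rename from `select("b as b4")` is reverted before the predicate reaches the source. A minimal sketch of that step, assuming a tiny hypothetical expression tree (`Expr`, `Field`, `Literal`, `EqualTo`, `Or` and `PredicateRenaming` are illustrative names, not classes from the Table API):

```scala
// Hypothetical, simplified predicate tree used only for this illustration.
sealed trait Expr
case class Field(name: String) extends Expr
case class Literal(value: Any) extends Expr
case class EqualTo(left: Expr, right: Expr) extends Expr
case class Or(left: Expr, right: Expr) extends Expr

object PredicateRenaming {

  /**
   * Rewrites every field reference in `e` back to its original name.
   * `renames` maps the new name to the original one, e.g. "b4" -> "b".
   * Fields without an entry are left unchanged.
   */
  def revertRenames(e: Expr, renames: Map[String, String]): Expr = e match {
    case Field(n)      => Field(renames.getOrElse(n, n))
    case EqualTo(l, r) => EqualTo(revertRenames(l, renames), revertRenames(r, renames))
    case Or(l, r)      => Or(revertRenames(l, renames), revertRenames(r, renames))
    case lit: Literal  => lit
  }
}
```

Applied to `EqualTo(Field("b4"), Literal(7))` with the renames `a4 -> a`, `b4 -> b`, `c4 -> c`, this sketch produces `EqualTo(Field("b"), Literal(7))`, which corresponds to the `b===7` entry in the predicate list for tableSource1.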



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
