From: "Timo Walther (JIRA)"
To: dev@flink.apache.org
Date: Wed, 7 Oct 2015 12:45:26 +0000 (UTC)
Subject: [jira] [Created] (FLINK-2828) Add interfaces for Table API input formats

Timo Walther created FLINK-2828:
-----------------------------------

Summary: Add interfaces for Table API input formats
Key: FLINK-2828
URL: https://issues.apache.org/jira/browse/FLINK-2828
Project: Flink
Issue Type: New Feature
Components: Table API
Reporter: Timo Walther
Assignee: Timo Walther

In order to support input formats for the Table API, interfaces are necessary. I propose two types of TableSources:

- AdaptiveTableSources can adapt their output to the requirements of the plan. Although the output schema stays the same, the TableSource can react to field resolution and/or predicates internally and return adapted DataSet/DataStream versions in the "translate" step.
- StaticTableSources are an easy way to provide the Table API with additional input formats without much implementation effort (e.g. for fromCsvFile()).

TableSources need to be deeply integrated into the Table API. The TableEnvironment requires a newly introduced AbstractExecutionEnvironment (a common superclass of all ExecutionEnvironments for DataSets and DataStreams).

Here is what a TableSource can see from more complicated queries:

{code}
getTableJava(tableSource1)
  .filter("a===5 || a===6")
  .select("a as a4, b as b4, c as c4")
  .filter("b4===7")
  .join(getTableJava(tableSource2))
  .where("a===a4 && c==='Test' && c4==='Test2'")

// Result predicates for tableSource1:
//   List("a===5 || a===6", "b===7", "c==='Test2'")
// Result predicates for tableSource2:
//   List("c==='Test'")
// Result resolved fields for tableSource1 (true = filtering, false = selection):
//   Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), ("c", true))
// Result resolved fields for tableSource2 (true = filtering, false = selection):
//   Set(("a", true), ("c", true))
{code}
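
To make the split between the two proposed source types more concrete, here is a minimal, dependency-free Java sketch. Every name in it (TableSourceSketch, fieldNames, createData, notifyResolvedField, notifyPredicate, RecordingSource) is a hypothetical placeholder and not taken from Flink or from this issue. The type parameter D merely stands in for a DataSet or DataStream, and predicates are kept as plain expression strings, as in the query example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: none of these names come from Flink.
final class TableSourceSketch {

    /** Common base: the output schema is fixed for every source. */
    interface TableSource {
        List<String> fieldNames();
    }

    /** Static source: only produces data. D stands in for DataSet/DataStream. */
    interface StaticTableSource<D> extends TableSource {
        D createData();
    }

    /** Adaptive source: is told about the plan before the "translate" step. */
    interface AdaptiveTableSource<D> extends TableSource {
        // filtering == true: the field is used in a predicate;
        // filtering == false: the field is used in a selection.
        void notifyResolvedField(String field, boolean filtering);

        void notifyPredicate(String predicate);

        D translate();
    }

    /** Toy adaptive source that simply records what it is told. */
    static final class RecordingSource implements AdaptiveTableSource<List<String>> {
        final Set<String> resolvedFields = new LinkedHashSet<>();
        final List<String> predicates = new ArrayList<>();

        @Override
        public List<String> fieldNames() {
            return Arrays.asList("a", "b", "c");
        }

        @Override
        public void notifyResolvedField(String field, boolean filtering) {
            // A set, so reporting the same (field, usage) pair twice has no effect.
            resolvedFields.add(field + ":" + filtering);
        }

        @Override
        public void notifyPredicate(String predicate) {
            predicates.add(predicate);
        }

        // A real source would build an adapted DataSet/DataStream here.
        // This toy version only returns a copy of the recorded predicates.
        @Override
        public List<String> translate() {
            return new ArrayList<>(predicates);
        }
    }

    public static void main(String[] args) {
        // Replays what tableSource1 would be told for the query above.
        RecordingSource source1 = new RecordingSource();
        source1.notifyResolvedField("a", true);
        source1.notifyResolvedField("a", false);
        source1.notifyPredicate("a===5 || a===6");
        source1.notifyPredicate("b===7");
        source1.notifyPredicate("c==='Test2'");
        System.out.println(source1.resolvedFields);
        System.out.println(source1.translate());
    }
}
```

In this sketch a static source needs a single method, which matches the "without much implementation effort" goal, while an adaptive source may ignore any notification it cannot exploit, since its output schema stays the same either way.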