flink-dev mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Preparing Table API & SQL for Flink 1.1.0
Date Thu, 19 May 2016 15:59:51 GMT
Hi everybody,

I'd like to start a discussion about blocking issues and outstanding
features of the Table API and SQL for the 1.1.0 release. As you probably
know, the Table API was completely reworked and ported to Apache Calcite.
Moreover, we added initial support for SQL on batch and streaming tables.

We have come quite far but there are still a couple of issues that need to
be resolved before we can release a new version of Flink. I would like to
start collecting and prioritizing issues such that we can work towards a
feature set that we would like to be included in the next release. In order
to prepare this list, I tried to execute the TPC-H query set using the
currently supported SQL feature set. Only one (Q18) out of the 22 queries
could be executed. The others failed due to unsupported features or bugs.

In the following, I list issues ordered by priority that I think need to be
resolved for the release.

    - FLINK-3728: Detect unsupported operators and improve error messages.
While we can effectively prevent unsupported operations in the Table API,
this is not easily possible with SQL queries. At the moment, unsupported
operations are either not detected and translated into invalid plans, or
they throw hard-to-understand exceptions.
    - FLINK-3859: Add support for DECIMAL. Without this feature, it is not
possible to use floating point literals in SQL queries.
    - FLINK-3152 / FLINK-3580: Add support for date types and date
functions.
    - FLINK-3586: Prevent AVG(LONG) overflow by using BigInteger as
intermediate data type.
    - FLINK-2971: Add support for outer joins (a PR for this issue exists:
#1981)
    - FLINK-3936: Add MIN / MAX aggregation functions for BOOLEAN types
    - FLINK-3916: Add support for generic types which are handled by
    - FLINK-3723: This is a proposal to split the Table API select()
method into select() for projection and aggregate() for aggregations. At
the moment, both are handled by select() (such as in SQL) and internally
separated by the Table API. We should decide for Flink 1.1.0 whether to
implement the proposal or not.
    - FLINK-3871 / FLINK-3873: Add TableSource and TableSink for Avro
encoded Kafka sources
    - FLINK-3872 / FLINK-3874: Add TableSource and TableSink for JSON
encoded Kafka sources
    - More TableSource / TableSinks
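
To illustrate the FLINK-3586 item above, here is a minimal sketch of why a
long running sum overflows and how a BigInteger intermediate avoids it. This
is plain Java, not Flink's actual aggregate code; the class and method names
are made up for illustration.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not taken from the Flink code base.
class AvgLongSketch {

    // Naive average: the running sum is a long and silently wraps around
    // once it exceeds Long.MAX_VALUE.
    static long naiveAvg(long[] values) {
        long sum = 0L;
        for (long v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Average with a BigInteger running sum: the sum cannot overflow, and
    // the final average always fits into a long again.
    static long bigIntegerAvg(long[] values) {
        BigInteger sum = BigInteger.ZERO;
        for (long v : values) {
            sum = sum.add(BigInteger.valueOf(v));
        }
        return sum.divide(BigInteger.valueOf(values.length)).longValueExact();
    }

    public static void main(String[] args) {
        long[] values = {Long.MAX_VALUE, Long.MAX_VALUE};
        System.out.println(naiveAvg(values));      // prints -1
        System.out.println(bigIntegerAvg(values)); // prints 9223372036854775807
    }
}
```

Both variants truncate toward zero, so they only differ when the running sum
leaves the long range.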

Please review this list, add issues that you think should go in as well,
and discuss the priorities of the features.
Also, if you would like to get involved with improving the Table API / SQL,
send a mail to the mailing list or comment on a JIRA issue.

I think it would be good if somebody would coordinate these efforts. I
would be happy to do it. However, I will leave in one month for a
two-month parental leave and I don't know how much I can contribute in
that time. So if somebody would like to step up and help coordinate,
please let me and the others know.

Cheers, Fabian
