flink-issues mailing list archives

From "Timo Walther (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-10618) Introduce catalog for Flink tables
Date Wed, 31 Oct 2018 15:42:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670261#comment-16670261 ]

Timo Walther commented on FLINK-10618:

Yes, a design doc would be very helpful. Its design and implementation should go along with
a DDL (see FLINK-10232) that describes all the entities a catalog can store, e.g. connectors,
views, and functions, but also libraries or types. [~suez1224] might tell you more about the latter.

In order to reduce code duplication, we should also make the table environment itself implement
a catalog interface, or at least contain an in-memory catalog implementation. A table environment
can then be considered the {{default}} catalog.
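To illustrate the idea of the table environment acting as the {{default}} catalog, here is a minimal Java sketch. All names ({{Catalog}}, {{CatalogTable}}, {{InMemoryCatalog}}, {{TableEnvironment}}) are hypothetical and chosen for illustration only; they are not Flink's actual API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical catalog interface; not the real Flink API.
interface Catalog {
    void registerTable(String name, CatalogTable table);
    Optional<CatalogTable> getTable(String name);
}

// Minimal placeholder for a table's metadata.
class CatalogTable {
    final String schema;
    CatalogTable(String schema) { this.schema = schema; }
}

// In-memory catalog implementation that the table environment can contain.
class InMemoryCatalog implements Catalog {
    private final Map<String, CatalogTable> tables = new HashMap<>();

    public void registerTable(String name, CatalogTable table) {
        tables.put(name, table);
    }

    public Optional<CatalogTable> getTable(String name) {
        return Optional.ofNullable(tables.get(name));
    }
}

// The table environment implements the catalog interface by delegating
// to its built-in in-memory catalog, making it the "default" catalog.
class TableEnvironment implements Catalog {
    private final Catalog defaultCatalog = new InMemoryCatalog();

    public void registerTable(String name, CatalogTable table) {
        defaultCatalog.registerTable(name, table);
    }

    public Optional<CatalogTable> getTable(String name) {
        return defaultCatalog.getTable(name);
    }
}
```

With this shape, tables registered on the fly land in the default catalog, and the same interface can later be backed by other implementations.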

Since we are planning to move away from Scala, reworking the table environment as a catalog
would also be a perfect time to clean up the interfaces and migrate them to Java. But this
is still up for discussion and might require a separate issue.

> Introduce catalog for Flink tables
> ----------------------------------
>                 Key: FLINK-10618
>                 URL: https://issues.apache.org/jira/browse/FLINK-10618
>             Project: Flink
>          Issue Type: New Feature
>          Components: SQL Client
>    Affects Versions: 1.6.1
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>            Priority: Major
> Besides meta objects such as tables that may come from an {{ExternalCatalog}}, Flink
also deals with tables/views/functions that are created on the fly (in memory) or specified
in a configuration file. Those objects don't belong to any {{ExternalCatalog}}, yet Flink
either stores them in memory, which is non-persistent, or recreates them from a file, which
is a big pain for the user. Those objects are known only to Flink, but Flink manages them
poorly.
> Since they are typical objects in a database catalog, it's natural to have a catalog
that manages them. The interface will be similar to {{ExternalCatalog}}, which contains
meta objects that are not managed by Flink. There are several possible implementations of
the Flink internal catalog interface: in-memory, file-based, an external registry (such as the
Confluent Schema Registry or the Hive metastore), a relational database, etc.
> The initial functionality, as well as the catalog hierarchy, could be very simple. The
basic functionality of the catalog will mostly be creating, altering, and dropping tables, views,
functions, etc. Obviously, this can evolve over time.
> We plan to provide memory, file, and Hive metastore implementations, which will be
plugged in at the SQL Client layer.
> Please provide your feedback.
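The quoted description above proposes a catalog whose basic surface is create/alter/drop of meta objects, with a non-persistent in-memory backing as one of several implementations. A minimal Java sketch of that surface follows; the names ({{FlinkCatalog}}, {{MemoryCatalog}}) are illustrative assumptions, not the interface that FLINK-10618 eventually defined.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical catalog surface matching the description: create, alter, drop.
interface FlinkCatalog {
    void createTable(String name, String schema);
    void alterTable(String name, String newSchema);
    void dropTable(String name);
    String getTable(String name); // returns null if absent
}

// One possible backing: a non-persistent in-memory store, as described above.
class MemoryCatalog implements FlinkCatalog {
    private final Map<String, String> tables = new HashMap<>();

    public void createTable(String name, String schema) {
        if (tables.containsKey(name)) {
            throw new IllegalStateException("Table already exists: " + name);
        }
        tables.put(name, schema);
    }

    public void alterTable(String name, String newSchema) {
        if (!tables.containsKey(name)) {
            throw new IllegalStateException("No such table: " + name);
        }
        tables.put(name, newSchema);
    }

    public void dropTable(String name) {
        if (tables.remove(name) == null) {
            throw new IllegalStateException("No such table: " + name);
        }
    }

    public String getTable(String name) {
        return tables.get(name);
    }
}
```

File, Hive metastore, or relational-database implementations would implement the same interface but persist the objects, which is what removes the "recreate from a file" pain mentioned in the description.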

This message was sent by Atlassian JIRA
