flink-user mailing list archives

From Gyula Fóra <gyula.f...@gmail.com>
Subject Re: CREATE TABLE with Schema derived from format
Date Wed, 04 Mar 2020 13:34:39 GMT
Hi!

Initially we were looking at 2), but 1) would be the best solution. I think
both would be very valuable.

My only concern related to using the Schema Registry as a Catalog is the
interaction with other Catalogs in the system. Maybe you are using a Hive
catalog to track a bunch of tables, and now you would have to switch to the
Schema Registry.
Maybe in this case it would be good to be able to import tables from one
catalog to another.

Gyula


On Wed, Mar 4, 2020 at 2:24 PM Jark Wu <imjark@gmail.com> wrote:

> Yes. From my perspective, deriving schema from schema registry is the most
> important use case of FLINK-16420.
>
> Some initial ideas about this:
> 1) introduce a SchemaRegisteryCatalog to allow users to run queries on
> existing topics without manual table definition. See FLINK-12256
> 2) provide a connector property for the schema registry URL to derive the
> schema from it, so the CREATE TABLE statement can leave out the schema part, e.g.
>
> CREATE TABLE user_behavior WITH ("connector"="kafka",
> "topic"="user_behavior", "schema.registery.url"="localhost:8081")
>
> Which approach are you looking for?
>
> Best,
> Jark
>
> On Wed, 4 Mar 2020 at 19:09, Gyula Fóra <gyula.fora@gmail.com> wrote:
>
>> Hi Jark,
>>
>> Thank you for the clarification, this is exactly what I was looking for,
>> especially the second part regarding schema registry integration.
>>
>> This question came up as we were investigating what the schema registry
>> integration should look like :)
>>
>> Cheers,
>> Gyula
>>
>> On Wed, Mar 4, 2020 at 12:06 PM Jark Wu <imjark@gmail.com> wrote:
>>
>>> Hi Gyula,
>>>
>>> That's a good point and is on the roadmap.
>>>
>>> In 1.10, the JSON and CSV formats can derive the format schema from the
>>> table schema. So you don't need to specify the format schema in properties
>>> anymore if you are using 1.10.
>>>
>>> Conversely, we are planning to derive the table schema from the format
>>> schema if it is specified, e.g. "format.fields", "format.avro-file-path".
>>> Furthermore, the table schema could be inferred from a schema
>>> registry, or even by reading some data and inferring it.
>>> I created FLINK-16420 [1] to track this effort, but I'm not sure we have
>>> enough time to support it before 1.11.
>>>
>>> Best,
>>> Jark
>>>
>>> [1]: https://issues.apache.org/jira/browse/FLINK-16420
>>>
>>>
>>> On Wed, 4 Mar 2020 at 18:21, Gyula Fóra <gyula.fora@gmail.com> wrote:
>>>
>>>> Hi All!
>>>>
>>>> I am wondering if it would be possible to change the CREATE TABLE
>>>> statement so that it would also work without specifying any columns.
>>>>
>>>> The format generally defines the available columns so maybe we could
>>>> simply use them as is if we want.
>>>>
>>>> This would be very helpful when exploring different data sources.
>>>>
>>>> Let me know what you think!
>>>> Gyula
>>>>
>>>
