flink-user mailing list archives

From Guodong Wang <wangg...@gmail.com>
Subject How to create schema for flexible json data in Flink SQL
Date Thu, 28 May 2020 10:31:55 GMT
Hi!

I want to use Flink SQL to process some JSON events, but I am finding it
quite challenging to define a schema for the Flink SQL table.

My data source emits JSON like this:
{
  "top_level_key1": "some value",
  "nested_object": {
    "nested_key1": "abc",
    "nested_key2": 123,
    "nested_key3": ["element1", "element2", "element3"]
  }
}

The main challenges in defining a schema for this data source are:
1. The keys in nested_object are flexible: there might be 3 unique keys, or
there might be more. If I enumerate all the keys in the schema, my code
becomes fragile. How should I handle an event that contains additional keys
in nested_object?
2. I know the Table API supports the Map type, but I am not sure whether I
can use a generic object as the value type of the map. The values in
nested_object have different types: some are ints, some are strings, and
some are arrays.
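
To make point 2 concrete, here is a small sketch in plain Python (not Flink
code) of the mixed value types, together with one workaround I could imagine:
re-encoding every value in nested_object as a JSON string so that the field
would at least fit a MAP<STRING, STRING> column. The helper name
stringify_nested is hypothetical, and whether this is a good approach in
Flink SQL is only my assumption, not something I have verified.

```python
import json

# Sample event copied from the message above.
RAW_EVENT = """
{
  "top_level_key1": "some value",
  "nested_object": {
    "nested_key1": "abc",
    "nested_key2": 123,
    "nested_key3": ["element1", "element2", "element3"]
  }
}
"""


def stringify_nested(event: dict, field: str = "nested_object") -> dict:
    """Return a copy of `event` in which every value under `field`
    is re-encoded as a JSON string.

    Hypothetical helper: it only illustrates how mixed-type values could be
    squeezed into a uniform string-to-string map.
    """
    nested = event.get(field) or {}
    as_strings = {key: json.dumps(value) for key, value in nested.items()}
    return {**event, field: as_strings}


event = json.loads(RAW_EVENT)

# The values under nested_object do not share a single type.
value_types = {
    key: type(value).__name__ for key, value in event["nested_object"].items()
}

# After re-encoding, every value is a string, whatever keys happen to exist.
normalized = stringify_nested(event)
```

With this approach no key has to be enumerated up front, but every consumer
has to decode the string values again (json.loads here), so the type
information is only deferred, not modeled in the schema.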

So, how can I expose this kind of JSON data as a table in Flink SQL without
enumerating all the nested keys?

Thanks.

Guodong
