flink-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: Explanation on limitations of the Flink Table API
Date Thu, 21 Apr 2016 13:06:33 GMT
We're also trying to work around the current limitations of the Table API:
we read DataSets with purpose-built input formats that return a POJO
Row containing the list of values (but we read all values as
String...).
We would also need a way to abstract the composition of Flink
operators and UDFs, so that a transformation can be composed from a
graphical UI or from a script. During the Stratosphere project there were
Meteor and Sopremo, which allowed that [1], but they were then dropped in
favour of a Pig integration, and I don't know whether that was ever
completed. Some days ago I discovered the Piglet project [2], which allows
using Pig with Spark and Flink, but I don't know how well it works (the
Flink integration is also very recent and not documented anywhere).

Best,
Flavio

[1] http://stratosphere.eu/assets/papers/Sopremo_Meteor%20BigData.pdf
[2] https://github.com/ksattler/piglet
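
To make the workaround above concrete (read every value as a String first,
then take the field names from the header and infer the types from the first
lines), here is a minimal sketch in plain Python. It does not use Flink or
the Table API at all; the names read_table, infer_type and sample_size are
hypothetical and exist only for this illustration.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest of int, float, str that parses every sample value."""
    for caster in (int, float):
        try:
            for value in values:
                caster(value)
            return caster
        except ValueError:
            continue
    # Fall back to str when neither numeric type fits all samples.
    return str


def read_table(text, sample_size=10):
    """Read CSV text whose first line holds the field names.

    Every value is first read as a string; column types are then inferred
    from the first `sample_size` data rows and applied to all rows.
    """
    reader = csv.reader(io.StringIO(text))
    names = next(reader)            # header line gives the attribute names
    rows = list(reader)             # all values are still strings here
    sample = rows[:sample_size]
    types = [infer_type([row[i] for row in sample]) for i in range(len(names))]
    typed_rows = [tuple(t(v) for t, v in zip(types, row)) for row in rows]
    return names, types, typed_rows


if __name__ == "__main__":
    data = "name,age,score\nalice,30,1.5\nbob,25,2\n"
    names, types, rows = read_table(data)
    print(names)
    print([t.__name__ for t in types])
    print(rows)
```

Because the types are inferred from a sample only, a later row that does not
match them raises a ValueError; a real table source would have to decide
whether to fail, fall back to String, or widen the type.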

On Thu, Apr 21, 2016 at 2:41 PM, Fabian Hueske <fhueske@gmail.com> wrote:

> Hi Simone,
>
> in Flink 1.0.x, the Table API does not support reading external data,
> i.e., it is not possible to read a CSV file directly from the Table API.
> Tables can only be created from DataSet or DataStream which means that the
> data is already converted into "Flink types".
>
> However, the Table API is currently under heavy development as part of the
> efforts to add SQL support.
> This work is taking place on the master branch and I am currently working
> on interfaces to scan external data sets or ingest external data streams.
> The interface will be quite generic such that it should be possible to
> define a table source that reads the first lines of a file to infer
> attribute names and types.
> You can have a look at the current state of the API design here [1].
>
> Feedback is welcome and can be very easily included in this phase of the
> development ;-)
>
> Cheers, Fabian
>
> [1]
> https://docs.google.com/document/d/1sITIShmJMGegzAjGqFuwiN_iw1urwykKsLiacokxSw0
> <https://docs.google.com/document/d/1sITIShmJMGegzAjGqFuwiN_iw1urwykKsLiacokxSw0/edit#>
>
> 2016-04-21 14:26 GMT+02:00 Simone Robutti <simone.robutti@radicalbit.io>:
>
>> Hello,
>>
>> I would like to know if it's possible to create a Flink Table from an
>> arbitrary CSV (or any other form of tabular data) without doing type-safe
>> parsing with explicitly typed classes/POJOs.
>>
>> To my knowledge this is not possible but I would like to know if I'm
>> missing something. My requirement is to be able to read a CSV file and
>> manipulate it reading the field names from the file and inferring data
>> types.
>>
>> Thanks,
>>
>> Simone
>>
>
>
