flink-user mailing list archives

From Chesnay Schepler <ches...@apache.org>
Subject [DISCUSS] Removal of twitter-inputformat
Date Wed, 07 Jun 2017 09:15:01 GMT
Hello,

I'm proposing to remove the Twitter-InputFormat in FLINK-6710 
<https://issues.apache.org/jira/browse/FLINK-6710>; the open PR is at 
<https://github.com/apache/flink/pull/3984>.
The PR currently has a +1 from Robert, but Timo raised concerns that 
the format is useful for prototyping and advised me to start a 
discussion on the ML.

This format is a DelimitedInputFormat that reads JSON objects and turns 
them into a custom tweet class.
I believe this format doesn't provide much value to Flink; there's 
nothing interesting about it as an InputFormat,
as it is purely an exercise in /manually/ converting a JSON object into 
a POJO.
This is apparent since you could just as well use 
ExecutionEnvironment#readTextFile(...) and throw the parsing logic
into a subsequent MapFunction.
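
To illustrate, here is a minimal sketch of the parsing step that would 
sit inside such a MapFunction. It uses only the JDK so that it runs 
standalone. SimpleTweet, its two fields, and the regex-based extraction 
are assumptions made up for this example; they are not the tweet class 
or the parsing code from the existing format, and a real job would use 
a proper JSON library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class TweetLineParsing {

    /** Hypothetical stand-in for the custom tweet class (fields are illustrative). */
    static final class SimpleTweet {
        final long id;
        final String text;

        SimpleTweet(long id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    // Toy field extraction, NOT a real JSON parser: it takes the first
    // "id" and "text" keys found on the line and does not unescape strings.
    private static final Pattern ID =
            Pattern.compile("\"id\"\\s*:\\s*(\\d+)");
    private static final Pattern TEXT =
            Pattern.compile("\"text\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** The body a MapFunction<String, SimpleTweet> would delegate to. */
    static SimpleTweet parseLine(String line) {
        Matcher id = ID.matcher(line);
        Matcher text = TEXT.matcher(line);
        if (!id.find() || !text.find()) {
            throw new IllegalArgumentException("not a tweet line: " + line);
        }
        return new SimpleTweet(Long.parseLong(id.group(1)), text.group(1));
    }

    /** Local stand-in for "read lines, then map": one JSON object per line. */
    static List<SimpleTweet> parseAll(List<String> lines) {
        return lines.stream()
                .map(TweetLineParsing::parseLine)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "{\"id\": 1, \"text\": \"hello\"}",
                "{\"id\": 2, \"text\": \"flink\"}");
        for (SimpleTweet t : parseAll(lines)) {
            System.out.println(t.id + " -> " + t.text);
        }
    }
}
```

In a Flink job the same parseLine call would go into the MapFunction 
applied to the DataSet<String> returned by readTextFile(...), in place 
of the stream in parseAll.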

In the PR I suggested replacing it with a JsonInputFormat, but this 
was a misguided attempt at getting Timo to agree
to the removal. A JsonInputFormat has the same problem outlined above, 
as it could effectively be implemented with a one-liner map function.

So the question now is whether we want to keep it, remove it, or replace 
it with something more general.

Regards,
Chesnay
