flink-user mailing list archives

From "zhijiang" <wangzhijiang...@aliyun.com>
Subject Re: [DISCUSS] Adding a mid-term roadmap to the Flink website
Date Fri, 15 Feb 2019 02:58:02 GMT
Thanks Stephan for this proposal and I totally agree with it. 

It is very necessary to summarize the overall features and directions the community is pursuing or
planning to pursue. Although I check the mailing list almost every day, it is still difficult
to keep track of everything. In addition, I think this overall roadmap picture can also help expose the
relationships among different items, and even avoid similar or duplicated ideas and work.

Just one small suggestion: if we could add an existing link (JIRA/discussion/FLIP/Google
doc) for each listed item, it would be easy to keep track of the items one is interested in and
follow their progress.

From:Jeff Zhang <zjffdu@gmail.com>
Send Time: Thursday, February 14, 2019, 18:03
To:Stephan Ewen <sewen@apache.org>
Cc:dev <dev@flink.apache.org>; user <user@flink.apache.org>; jincheng sun <sunjincheng121@gmail.com>;
Shuyi Chen <suez1224@gmail.com>; Rong Rong <walterddr@gmail.com>
Subject:Re: [DISCUSS] Adding a mid-term roadmap to the Flink website

Hi Stephan,

Thanks for this proposal. It is a good idea to track the roadmap. One
suggestion is that it might be better to put it on a wiki page first,
because it is easier to update the roadmap on the wiki than on the Flink web
site. I guess we may need to update the roadmap very often at the
beginning, as there are so many discussions and proposals in the community
recently. We can move it to the Flink web site later, when we feel it has been
nailed down.

On Thu, Feb 14, 2019 at 5:44 PM Stephan Ewen <sewen@apache.org> wrote:

> Thanks Jincheng and Rong Rong!
> I am not deciding a roadmap and making a call on what features should be
> developed or not. I was only collecting broader issues that are already
> happening or have an active FLIP/design discussion plus committer support.
> Do we have that for the suggested issues as well? If yes, we can add them
> (can you point me to the issue/mail-thread), if not, let's try and move the
> discussion forward and add them to the roadmap overview then.
> Best,
> Stephan
> On Wed, Feb 13, 2019 at 6:47 PM Rong Rong <walterddr@gmail.com> wrote:
>> Thanks Stephan for the great proposal.
>> This would be beneficial not only for new users but also for contributors
>> who want to keep track of all upcoming features.
>> I think that better window operator support could also be grouped separately
>> into its own category, as it affects both the future DataStream API and batch
>> stream unification.
>> Can we also include:
>> - OVER aggregate for DataStream API separately as @jincheng suggested.
>> - Improving sliding window operator [1]
>> One more suggestion: can we also include a more extensible
>> security module [2,3] that @shuyi and I are currently working on?
>> This will significantly improve the usability for Flink in corporate
>> environments where proprietary or 3rd-party security integration is needed.
>> Thanks,
>> Rong
>> [1]
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
>> [2]
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
>> [3]
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html
>> On Wed, Feb 13, 2019 at 3:39 AM jincheng sun <sunjincheng121@gmail.com>
>> wrote:
>>> I am very excited; thank you for launching such a great discussion,
>>> Stephan!
>>> Just one small suggestion: in the Batch Streaming Unification
>>> section, should we add an item:
>>> - Same window operators on bounded/unbounded Table API and DataStream
>>> API
>>> (currently the OVER window only exists in SQL/Table API; the DataStream API
>>> does not yet support it)
>>> Best,
>>> Jincheng
>>> On Wed, Feb 13, 2019 at 7:21 PM Stephan Ewen <sewen@apache.org> wrote:
>>>> Hi all!
>>>> Recently, several contributors, committers, and users asked about making
>>>> the direction in which the project is currently going more visible.
>>>> Users and developers can track the direction by following the
>>>> discussion threads and JIRA, but due to the mass of discussions and open
>>>> issues, it is very hard to get a good overall picture.
>>>> Especially for new users and contributors, it is very hard to get a
>>>> quick overview of the project direction.
>>>> To fix this, I suggest adding a brief roadmap summary to the homepage.
>>>> It is a bit of a commitment to keep that roadmap up to date, but I think
>>>> the benefit for users justifies that.
>>>> The Apache Beam project has added such a roadmap [1]
>>>> <https://beam.apache.org/roadmap/>, which was received very well by
>>>> the community. I would suggest following a similar structure here.
>>>> If the community is in favor of this, I would volunteer to write a
>>>> first version of such a roadmap. The points I would include are below.
>>>> Best,
>>>> Stephan
>>>> [1] https://beam.apache.org/roadmap/
>>>> ========================================================
>>>> Disclaimer: Apache Flink is not governed or steered by any one single
>>>> entity, but by its community and Project Management Committee (PMC). This
>>>> is not an authoritative roadmap in the sense of a plan with a specific
>>>> timeline. Instead, we share our vision for the future and major initiatives
>>>> that are receiving attention, and give users and contributors an
>>>> understanding of what they can look forward to.
>>>> *Future Role of Table API and DataStream API*
>>>>   - Table API becomes first class citizen
>>>>   - Table API becomes primary API for analytics use cases
>>>>       * Declarative, automatic optimizations
>>>>       * No manual control over state and timers
>>>>   - DataStream API becomes primary API for applications and data
>>>> pipeline use cases
>>>>       * Physical, user controls data types, no magic or optimizer
>>>>       * Explicit control over state and time
>>>> *Batch Streaming Unification*
>>>>   - Table API unification (environments) (FLIP-32)
>>>>   - New unified source interface (FLIP-27)
>>>>   - Runtime operator unification & code reuse between DataStream / Table
>>>>   - Extending Table API to make it convenient API for all analytical
>>>> use cases (easier mix in of UDFs)
>>>>   - Same join operators on bounded/unbounded Table API and DataStream
>>>> API
>>>> *Faster Batch (Bounded Streams)*
>>>>   - Much of this comes via Blink contribution/merging
>>>>   - Fine-grained Fault Tolerance on bounded data (Table API)
>>>>   - Batch Scheduling on bounded data (Table API)
>>>>   - External Shuffle Services Support on bounded streams
>>>>   - Caching of intermediate results on bounded data (Table API)
>>>>   - Extending DataStream API to explicitly model bounded streams (API
>>>> breaking)
>>>>   - Add fine fault tolerance, scheduling, caching also to DataStream API
>>>> *Streaming State Evolution*
>>>>   - Let all built-in serializers support stable evolution
>>>>   - First class support for other evolvable formats (Protobuf, Thrift)
>>>>   - Savepoint input/output format to modify / adjust savepoints
>>>> *Simpler Event Time Handling*
>>>>   - Event Time Alignment in Sources
>>>>   - Simpler out-of-the-box support in sources
>>>> *Checkpointing*
>>>>   - Consistency of Side Effects: suspend / end with savepoint (FLIP-34)
>>>>   - Failed checkpoints explicitly aborted on TaskManagers (not only on
>>>> coordinator)
>>>> *Automatic scaling (adjusting parallelism)*
>>>>   - Reactive scaling
>>>>   - Active scaling policies
>>>> *Kubernetes Integration*
>>>>   - Active Kubernetes Integration (Flink actively manages containers)
>>>> *SQL Ecosystem*
>>>>   - Extended Metadata Stores / Catalog / Schema Registries support
>>>>   - DDL support
>>>>   - Integration with Hive Ecosystem
>>>> *Simpler Handling of Dependencies*
>>>>   - Scala in the APIs, but not in the core (hide in separate class
>>>> loader)
>>>>   - Hadoop-free by default

Best Regards

Jeff Zhang
