flink-dev mailing list archives

From Ufuk Celebi <...@apache.org>
Subject Re: [DISCUSS] Policy on keeping layer alternatives in sync
Date Fri, 26 Sep 2014 09:04:59 GMT
Hey Fabian,

thanks for bringing this up.

I would vote for a hard policy regarding the Scala and Java APIs, as these are our main
user-facing APIs.

If there were a fundamental problem or a language feature that could not be supported in
or ported to the other API, I would be OK with it being available in only one. But small
additions to the APIs, like outer joins, that can be kept in sync should be kept in sync.

If someone does not want to add the corresponding feature to the other APIs, I would go for
a pull request with a request for someone else to port the missing part.

I think it is very important for users to be able to assume that all APIs have the same "power".
Otherwise we might end up in a situation (and I think we already had it with the broadcast
variables for a time) where users have to pick the API that matches their use case rather
than the one they prefer.



On 26 Sep 2014, at 10:43, Fabian Hueske <fhueske@apache.org> wrote:

> Hi,
> as you all know, Flink has a layered architecture with multiple
> alternatives for certain levels.
> Examples are:
> - Programming APIs: Java, Scala, (and Python in progress)
> - Processing Backends: distributed runtime (former Nephele), Java
> Collections, (and potentially Tez in the future)
> The challenge with multiple alternatives that serve the same purpose is
> that these should be in sync.
> A feature that is added to the Java API should also be added to the Scala
> API (and other APIs in the future). The same applies to new runtime
> strategies and operators, such as outer joins.
> I think we need a policy on how to keep the features of different layer
> alternatives in sync.
> With the recent update of the Scala API, a ScalaAPICompletenessTest was
> added that checks whether the Scala API offers the same methods as the Java
> API. Adding a feature to the Java API breaks the build and requires either
> adapting the Scala API as well or excluding the added methods from the
> APICompletenessTest.
> While this test is a great tool to make sure that the APIs are synced,
> this basically requires that APIs are always synced, i.e., a modification
> of the Java API must go with an equivalent change of the Scala API.
> If we make this a tight policy and force compatibility at all times,
> contributors must know about several different technologies (Scala Compiler
> Macros, Python, the implementation details of multiple runtime backends,
> ...). This sounds like a huge entrance barrier to me.
> To make it clear, I am definitely in favor of keeping APIs and backends in
> sync.
> However, I propose to enforce this only for releases, i.e., allow
> out-of-sync APIs on the master branch and fix the APIs for releases.
> With this additional requirement, we also need to think twice about which
> features to add as multiple components of the system will be affected.
> What do you guys think?
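
For illustration, the kind of completeness check described in the quoted message (compare the methods of one API against another, with an exclusion list for methods that are deliberately not mirrored) could be sketched with plain Java reflection as below. This is not the actual ScalaAPICompletenessTest. The class ApiCompletenessSketch, the interfaces ReferenceApi and MirroredApi, and the method names in them are hypothetical stand-ins, and the comparison is by method name only.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

public class ApiCompletenessSketch {

    // Hypothetical stand-in for the reference API (the Java API in the thread).
    public interface ReferenceApi {
        void map();
        void join();
        void outerJoin();
        void internalHook();
    }

    // Hypothetical stand-in for the API that has to keep up (the Scala API in the thread).
    public interface MirroredApi {
        void map();
        void join();
    }

    // Returns the names of public methods that exist on the reference type,
    // are absent from the mirrored type, and are not explicitly excluded.
    // Assumption: matching by name only; parameter types are ignored.
    public static Set<String> missingMethods(
            Class<?> reference, Class<?> mirrored, Set<String> excluded) {
        Set<String> mirroredNames = new TreeSet<>();
        for (Method m : mirrored.getMethods()) {
            mirroredNames.add(m.getName());
        }
        Set<String> missing = new TreeSet<>();
        for (Method m : reference.getMethods()) {
            String name = m.getName();
            if (!mirroredNames.contains(name) && !excluded.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Methods that are intentionally not mirrored go on the exclusion list.
        Set<String> excluded = Set.of("internalHook");
        Set<String> missing =
                missingMethods(ReferenceApi.class, MirroredApi.class, excluded);
        // A build-time test would fail here if the set is non-empty.
        System.out.println("Missing: " + missing);
    }
}
```

In this sketch, adding a method to ReferenceApi without mirroring or excluding it makes the returned set non-empty. That corresponds to the situation discussed in the thread, where a change to one API breaks the build until the other API is adapted or the method is excluded.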
