beam-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Lukasz Cwik <lc...@google.com>
Subject Re: [Forked] BEAM-4046 (was [PROPOSAL] Introduce beam-sdks-java gradle project)
Date Wed, 03 Apr 2019 15:54:13 GMT
As a minor point, we do have some cross language dependencies, for example:
* the portability related proto projects are intended to be consumed by Go,
Java and Python
* the docker container gradle projects contain other applications (e.g. go
boot code) that are placed inside the docker container that contain the
language specific SDK harness. There will likely be additional applications
that are separate from the SDK harness like a docker container health
checker that are placed in there as well

On Tue, Apr 2, 2019 at 3:21 PM Michael Luckey <adude3141@gmail.com> wrote:

> Hi,
>
> agree with Kenn, that this issue at least renders the default
> implementation difficult to use.
>
> Although in the example given, i.e. having  sdks/java/core and
> sdks/py/core, I am unsure, whether it will impose a problem.
>
> As far as I understand until now, the issue triggers on dependency
> declaration. These are - in general - expressed with 3 dimensional maven
> coordinates GroupID, artifactID and version. Of course - semantic of
> version is clear - there are only 2 dimension left to distinguish
> artefacts. As we use a single group id (org.apache.beam) there is only one
> dimension left.
>
> Now this does not impose a problem on plain library dependencies. Of
> course they are uniquely defined. But we are using also lots of project
> dependencies. This project dependencies are translated from project path to
> those maven coordinates. Unfortunately here the project name - which
> happens to be the folder name - is used as artefact id. So if folder names
> match, we might get collisions during dependency resolution.
>
> Clearly, it is not possible to deploy artefacts with those same ids to any
> maven rep expecting sensible results. So we do either not deploy an
> artefact from one of these projects - which would kind of strange as we do
> have a project dependency here - or do rewrite the artefact id of (at
> least) one of the colliding projects. ( we currently do that implicitly
> with the project name we create by flattening our structure)
>
> Now back to the given example, as I do not expect any java project to have
> a project dependency on python, there might be a chance, that this will
> just work.
>
> But of course, this does not really help, as we reasonably might expect
> some /runner/direct/core or sdks/java/io/someio/core which would collide in
> the same way.
>
> As a possible workaround here, we could
> - either require unique folder names
> - or rewrite only colliding project names (as we currently do for all
> projects)
> - or ... (do not know yet)
>
> I suggest I ll invest some time playing around improving that already
> prepared PR to support discussion. So that we have proper grounding to
> decide whether a more hierarchical project structure will be worth that
> hassle.
>
> Looking at the gradle issue - which is already 2 yrs old and iirc was
> reported already at least one year earlier - I do not expect a fix here
> soon.
>
> On Tue, Apr 2, 2019 at 7:19 PM Lukasz Cwik <lcwik@google.com> wrote:
>
>> I didn't know that https://github.com/gradle/gradle/issues/847 existed
>> but the description of the issues people are having are similar to what was
>> discovered during the gradle migration.
>>
>> On Tue, Apr 2, 2019 at 8:02 AM Jean-Baptiste Onofré <jb@nanthrax.net>
>> wrote:
>>
>>> Hi Michael,
>>>
>>> no problem for the thread, that's the goal of the mailing list ;)
>>>
>>> And yes, you got my idea about a "meta" module: easy way of building the
>>> "whole" Java SDK.
>>>
>>> The purpose is not to create a uber jar, but more to simplify the build
>>> for Java SDK developers.
>>>
>>> Do you want me to complete your PR with what I did ?
>>>
>>> Regards
>>> JB
>>>
>>> On 02/04/2019 16:49, Michael Luckey wrote:
>>> > Going to fork the BEAM-4046 discussion. And, JB, I apologise for
>>> > hijacking your thread.
>>> >
>>> > As for the original question, I understood a request for a meta project
>>> > which will enable easier handling of java projects. E.g. instead of
>>> > requiring the user to call
>>> >
>>> >     ./gradlew module1:build module2:build ... moduleN.build
>>> >
>>> > a meta project with build task defined something about
>>> >
>>> > build.dependsOn module1:build
>>> > build.dependsOn module2:build
>>> > ...
>>> > build.dependsOn moduleN:build
>>> >
>>> > And other task as found usable.
>>> >
>>> > Not a project which in itself creates some uberjar, which I also
>>> believe
>>> > would rather difficult to implement.
>>> >
>>> > On Tue, Apr 2, 2019 at 5:13 AM Kenneth Knowles <kenn@apache.org
>>> > <mailto:kenn@apache.org>> wrote:
>>> >
>>> >     Oh, yikes. It seems
>>> >     like https://github.com/gradle/gradle/issues/847 indicates that
>>> the
>>> >     feature to use the default names in Gradle is practically
>>> >     nonfunctional. If that bug is as severe as it looks, I have to
>>> >     retract my position. Like we could never have sdks/java/core and
>>> >     sdks/py/core, right?
>>> >
>>> >     Kenn
>>> >
>>> >     On Mon, Apr 1, 2019 at 6:27 PM Michael Luckey <adude3141@gmail.com
>>> >     <mailto:adude3141@gmail.com>> wrote:
>>> >
>>> >         FWIW, hacked something as showcase for BEAM-4046 [1]
>>> >
>>> >         This is miserably broken, but a
>>> >
>>> >         ./gradlew projects
>>> >
>>> >         or
>>> >
>>> >         ./gradlew -p sdks/java build
>>> >
>>> >         should work. Anything else is likely to cause issues. If u hit
>>> >         stack overflow exception, it's likely caused
>>> >         by https://github.com/gradle/gradle/issues/847
>>> >
>>> >         To continue here, lots of cleanup has to be done. We might also
>>> >         need to rename folders etc, do better reflect semantic
>>> intentions.
>>> >
>>> >         [1] https://github.com/apache/beam/pull/8194
>>> >
>>> >         On Mon, Apr 1, 2019 at 11:56 PM Kenneth Knowles <
>>> kenn@apache.org
>>> >         <mailto:kenn@apache.org>> wrote:
>>> >
>>> >
>>> >
>>> >             On Mon, Apr 1, 2019 at 2:20 PM Lukasz Cwik <
>>> lcwik@google.com
>>> >             <mailto:lcwik@google.com>> wrote:
>>> >
>>> >
>>> >
>>> >                 On Mon, Apr 1, 2019 at 2:00 PM Kenneth Knowles
>>> >                 <kenn@apache.org <mailto:kenn@apache.org>> wrote:
>>> >
>>> >
>>> >                     As to building an aggregated "Java" project, I
>>> think
>>> >                     the blocker will be supporting conflicting deps.
>>> For
>>> >                     IOs like ElasticSearch and runners like Flink the
>>> >                     conflict is essential and deliberate, to support
>>> >                     multiple versions of other services. And that is
>>> not
>>> >                     even talking about transitive dep conflicts. I
>>> think
>>> >                     Python and Go don't have this issue simply because
>>> >                     they haven't tackled those problems.
>>> >
>>> >                     Are you talking about just a shortcut for building
>>> >                     (super easy to just add since we are using Gradle)
>>> >                     or a new artifact that you want to distribute?
>>> >
>>> >                     On Mon, Apr 1, 2019 at 10:01 AM Lukasz Cwik
>>> >                     <lcwik@google.com <mailto:lcwik@google.com>>
>>> wrote:
>>> >
>>> >                         During the gradle migration, we used to have
>>> >                         something like:
>>> >
>>> >                         include(":sdks:java:core")
>>> >                         include(":sdks:java:extensions:sql")
>>> >                         include(":sdks:python")
>>> >
>>> >                     Just to be super clear, this is Gradle default and
>>> >                     is equivalent to just leaving it blank.
>>> >
>>> >
>>> >                         but we discovered the Maven module names that
>>> >                         were used during publishing were "core" / "sql"
>>> >                         / ... (effectively the directory name) instead
>>> >                         of "beam-sdks-java-core".
>>> >
>>> >
>>> >                     Isn't this managed by the publication
>>> >                     plugin?
>>> https://docs.gradle.org/current/userguide/publishing_maven.html#sec:identity_values_in_the_generated_pom
>>>  "overriding
>>> >                     the default identity values is easy: simply specify
>>> >                     the groupId, artifactId or version attributes when
>>> >                     configuring the MavenPublication."
>>> >
>>> >
>>> >                 During the gradle migration this wasn't that easy. The
>>> >                 new maven publish plugin improved a lot since then.
>>> >
>>> >
>>> >                         Using the default at the time also broke the
>>> >                         artifact names for intra project dependencies
>>> >                         that we generate[1]. Finally, we also ran into
>>> >                         an issue because we had more then one Gradle
>>> >                         project with the same directory name even
>>> though
>>> >                         they were under a different parent folder (I
>>> >                         think it was "core") and that was leading to
>>> >                         some strange build time behavior.
>>> >
>>> >
>>> >                     Weird. But I think the Jira should still stand as a
>>> >                     move towards simplifying our build and making it
>>> >                     more discoverable for new contributors.
>>> >
>>> >
>>> >                 Agree on the JIRA makes sense, just calling out that
>>> >                 there were other issues that this naming had caused in
>>> >                 the past which should be checked before we call this
>>> done.
>>> >
>>> >
>>> >             Totally agree. It will be quite a large task with a lot of
>>> >             boilerplate that might not be separable from technical
>>> >             blockers that come up as you go through the boilerplate.
>>> >
>>> >             Kenn
>>> >
>>> >
>>> >
>>> >                     Kenn
>>> >
>>> >
>>> >                         We didn't migrate to a flat project structure
>>> >                         where each project is a folder underneath the
>>> >                         root project because of the existing Maven
>>> build
>>> >                         rules that were being maintained in parallel
>>> and
>>> >                         I'm not sure if people would want to have a
>>> flat
>>> >                         project structure either.
>>> >
>>> >                         1:
>>> https://github.com/apache/beam/blob/a85ea07b719385ec185e4fc5e4cdcc67b3598599/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L1055
>>> >
>>> >                         On Mon, Apr 1, 2019 at 9:49 AM Michael Luckey
>>> >                         <adude3141@gmail.com
>>> >                         <mailto:adude3141@gmail.com>> wrote:
>>> >
>>> >                             Hi,
>>> >
>>> >                             although I did not yet manage to get deeper
>>> >                             involved into actual development, I think
>>> >                             this ability would be a useful addition.
>>> >
>>> >                             But I would also like to point out, that
>>> >                             this is kind of implicit, as soon we
>>> >                             get
>>> https://issues.apache.org/jira/browse/BEAM-4046
>>> >                             included.
>>> >
>>> >                             For instance, we would change the current
>>> >                             setup from
>>> >
>>> >                             include "beam-sdks-java-core"
>>> >                             project(":beam-sdks-java-core").dir =
>>> file("sdks/java/core")
>>> >
>>> >                             to something like
>>> >
>>> >                             include(":sdks:java:core")
>>> >                             include(":sdks:java:extensions:sql")
>>> >                             include(":sdks:python")
>>> >
>>> >
>>> >                             With this in place a plain
>>> >
>>> >                             $ ./gradlew -p sdks/java build
>>> >
>>> >
>>> >                             would exactly do what you want. And, of
>>> >                             course, this will also work for
>>> >                             'sdks/java/io', 'runners/' etc. Hope, you
>>> >                             get the point.
>>> >
>>> >                             Currently, we deviate from gradle default
>>> >                             convention and therefore have to implement
>>> >                             some quirks to restore default behaviour.
>>> >                             And I somehow dislike the structure
>>> >                             introduced by parent/child folders, which
>>> >                             will be destroyed by our current project
>>> >                             definitions.
>>> >
>>> >                             But, to be honest, although I have some
>>> >                             clear understanding on how to proceed here
>>> -
>>> >                             especially regarding the requirement to
>>> keep
>>> >                             the change backwards compatible - we might
>>> >                             decide not to switch. Because deeper
>>> >                             investigation might reveal issues, which I
>>> >                             am currently not aware of.
>>> >
>>> >                             Best,
>>> >
>>> >                             michel
>>> >
>>> >                             On Mon, Apr 1, 2019 at 5:52 PM
>>> Jean-Baptiste
>>> >                             Onofré <jb@nanthrax.net
>>> >                             <mailto:jb@nanthrax.net>> wrote:
>>> >
>>> >                                 Hi guys,
>>> >
>>> >                                 I would like to introduce a Gradle
>>> >                                 "meta" project for the build:
>>> >                                 beam-sdks-java.
>>> >
>>> >                                 The idea is to simply build all Java
>>> SDK
>>> >                                 related resources (core, IO, ...).
>>> >
>>> >                                 The purpose is also to be aligned with
>>> >                                 the other SDKs which provide
>>> >                                 beam-sdks-go and beam-sdks-python.
>>> >
>>> >                                 Thoughts ?
>>> >
>>> >                                 Regards
>>> >                                 JB
>>> >                                 --
>>> >                                 Jean-Baptiste Onofré
>>> >                                 jbonofre@apache.org
>>> >                                 <mailto:jbonofre@apache.org>
>>> >                                 http://blog.nanthrax.net
>>> >                                 Talend - http://www.talend.com
>>> >
>>>
>>> --
>>> Jean-Baptiste Onofré
>>> jbonofre@apache.org
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
>>

Mime
View raw message