flink-dev mailing list archives

From Timo Walther <twal...@apache.org>
Subject Re: [DISCUSS] Project build time and possible restructuring
Date Mon, 20 Mar 2017 12:48:11 GMT
I agree with Aljoscha that we might consider moving from Travis to
Jenkins. Is there any disadvantage in using Jenkins?

I think we should structure the project according to release management 
(e.g. more frequent releases of libraries) or other criteria (e.g. core 
and non-core) instead of build time. What would happen if the build of
another submodule became too long? Would we split/restructure
again and again? If Jenkins solves all our problems we should use it.

Regards,
Timo



Am 20/03/17 um 12:21 schrieb Aljoscha Krettek:
> I prefer Jenkins to Travis by far. Working on Beam, where we have good Jenkins integration, has opened my eyes to what is possible with good CI integration.
>
> For example, look at this recent Beam PR: https://github.com/apache/beam/pull/2263. The Jenkins-Github integration will tell you exactly which tests failed and if you click on the links you can look at the log output/std out of the tests in question.
>
> This is the overview page of one of the Jenkins Jobs that we have in Beam: https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/. This is an example of a stable build: https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/. Notice how it gives you fine grained information about the Maven run. This is an unstable run: https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/. There you can see which tests failed and you can easily drill down.
>
> Best,
> Aljoscha
>
>> On 20 Mar 2017, at 11:46, Robert Metzger <rmetzger@apache.org> wrote:
>>
>> Thank you for looking into the build times.
>>
>> I didn't know that the build time situation is so bad. Even with yarn, mesos, connectors and libraries removed, we are still running into the build timeout :(
>>
>> Aljoscha told me that the Beam community is using Jenkins for running the tests, and they are planning to completely move away from Travis. I wonder whether we should do the same, as having our own Jenkins servers would allow us to run tests for more than 50 minutes.
>>
>> I agree with Stephan that we should keep the yarn and mesos tests in the core for stability / testing quality purposes.
>>
>>
>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <sewen@apache.org> wrote:
>> @Greg
>>
>> I am personally in favor of splitting "connectors" and "contrib" out as
>> well. I know that @rmetzger has some reservations about the connectors, but
>> we may be able to convince him.
>>
>> For the cluster tests (yarn / mesos) - in the past there were many cases
>> where these tests caught cases that other tests did not, because they are
>> the only tests that actually use the "flink-dist.jar" and thus discover
>> many dependency and configuration issues. For that reason, my feeling would
>> be that they are valuable in the core repository.
>>
>> I would actually suggest to do only the library split initially, to see
>> what the challenges are in setting up the multi-repo build and release
>> tooling. Once we gathered experience there, we can probably easily see what
>> else we can split out.
>>
>> Stephan
>>
>>
>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <code@greghogan.com> wrote:
>>
>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>> With 51 builds queued up for the weekend (some of which may fail or have
>>> been force pushed) we are at the limit of the number of contributions we
>>> can process. Fixing this requires 1) splitting the project, 2)
>>> investigating speedups for long-running tests, and 3) staying cognizant of
>>> test performance when accepting new code.
>>>
>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>> modules are generic (“libraries”) so that no one module is alone and
>>> independent.
>>>
>>> Flink has three “libraries”: cep, ml, and gelly.
>>>
>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>> connectors for three Kafka versions).
>>>
>>> Both flink-storm and flink-python have a modest number of tests
>>> and could live with the miscellaneous modules in “contrib”.
>>>
>>> The YARN tests are long-running and problematic (I am unable to
>>> successfully run these locally). A “cluster” module could host flink-mesos,
>>> flink-yarn, and flink-yarn-tests.
>>>
>>> That gets us close to running all tests in a single Travis build.
>>>    https://travis-ci.org/greghogan/flink/builds/212122590
>>>
>>> I also tested (https://github.com/greghogan/flink/commits/core_build) with a Maven
>>> parallelism of 2 and 4, the latter giving a 6.4% drop in build time.
>>>    https://travis-ci.org/greghogan/flink/builds/212137659
>>>    https://travis-ci.org/greghogan/flink/builds/212154470
>>>
>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>
>>> I also wanted to get an idea of how disruptive it would be to developers
>>> to divide the project into multiple git repos. I wrote a simple python
>>> script and configured it with the module partitions listed above. The usage
>>> string from the top of the file lists commits with files from multiple
>>> partitions as well as the modified files.
>>>    https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>
>>> Accounting for the merging of the batch and streaming connector modules,
>>> and assuming that the project structure has not changed much over the past
>>> 15 months, for the following date ranges the listed number of commits would
>>> have been split across repositories.
>>>
>>> since "2017-01-01"
>>> 36 of 571 commits were mixed
>>>
>>> since "2016-07-01"
>>> 155 of 1607 commits were mixed
>>>
>>> since "2016-01-01"
>>> 272 of 2561 commits were mixed
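
Read as shares, the counts above come to roughly 6.3%, 9.6%, and 10.6% of commits. A minimal sketch of how such a count could be made is below. It is not the script from the gist: the prefix-to-partition map, the partition names, and the helper names are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the partition map below is an assumption,
# not the configuration used by the script linked in the message.
PARTITIONS = {
    "flink-libraries/": "libraries",
    "flink-connectors/": "connectors",
    "flink-contrib/": "contrib",
}
DEFAULT_PARTITION = "core"  # any path not matched by a prefix above


def partition_of(path):
    """Return the partition a file path belongs to, by directory prefix."""
    for prefix, name in PARTITIONS.items():
        if path.startswith(prefix):
            return name
    return DEFAULT_PARTITION


def is_mixed(files):
    """A commit is 'mixed' if its files fall into more than one partition."""
    return len({partition_of(f) for f in files}) > 1


def mixed_share(mixed, total):
    """Percentage of mixed commits, rounded to one decimal place."""
    return round(100.0 * mixed / total, 1)


if __name__ == "__main__":
    # Counts quoted in the message above.
    for since, mixed, total in [("2017-01-01", 36, 571),
                                ("2016-07-01", 155, 1607),
                                ("2016-01-01", 272, 2561)]:
        print(since, mixed_share(mixed, total), "%")
```

In practice the file lists would come from the repository history (for example from `git log --name-only`), and a commit touching both a library and the core would be the kind that has to be split across repositories.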
>>>
>>> Greg
>>>
>>>
>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <sewen@apache.org> wrote:
>>>>
>>>> @Robert - I think once we know that a separate git repo works well, and
>>>> that it actually solves problems, I see no reason to not create a
>>>> connectors repository later. The infrastructure changes should be
>>> identical
>>>> for two or more repositories.
>>>>
>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <trohrmann@apache.org>
>>> wrote:
>>>>> I think it should not be at least the flink-dist but exactly the
>>> remaining
>>>>> flink-dist module. Otherwise we do redundant work.
>>>>>
>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <rmetzger@apache.org>
>>>>> wrote:
>>>>>
>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>
>>>>>> When doing a release, we need to build the flink main code first,
>>> because
>>>>>> the flink-libraries depend on that.
>>>>>> Once the "flink-libraries" are built, we need to run the main build
>>> again
>>>>>> (at least the flink-dist module), so that it is pulling the artifacts
>>>>> from
>>>>>> the flink-libraries to put them into the opt/ folder of the final
>>>>> artifact.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <trohrmann@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> I'm ok with point 3.
>>>>>>>
>>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>>> having
>>>>>>> it built as a dependency for flink-libraries? This seems wrong to me.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Till
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <rmetzger@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>> Let me know if you (or anybody else) want to help me with the
>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>>> before, I
>>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>>
>>>>>>>> I'm against creating two new repositories. I fear that this
>>>>> introduces
>>>>>>> too
>>>>>>>> much complexity and too many repositories.
>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build
>>>>>> time
>>>>>>>> significantly down.
>>>>>>>> We can also consider putting the connectors into the
>>>>> "flink-libraries"
>>>>>>> repo
>>>>>>>> if we need to further reduce the build time.
>>>>>>>>
>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if we
>>>>>> want
>>>>>>>> to keep "flink-table" in the main repo. (This would eliminate the
>>>>>>>> "flink-libraries" module from main.)
>>>>>>>>
>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>>> placed
>>>>>>> in
>>>>>>>> contrib anymore.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <code@greghogan.com>
>>>>>> wrote:
>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>
>>>>>>>>> We should compare the verification time with and without the listed
>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>
>>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>>> flink-libraries?
>>>>>>>>> Are you intending that we move flink-table out of flink-libraries
>>>>>> (and
>>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>>
>>>>>>>>> Greg
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <rmetzger@apache.org>
>>>>>>>> wrote:
>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>
>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>> My main motivation for doing this is that it seems to be the only
>>>>>>>>> feasible
>>>>>>>>>> way of scaling the community to allow more committers working on
>>>>>> the
>>>>>>>>>> libraries.
>>>>>>>>>>
>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>
>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>> 1. Ask INFRA to rename https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
>>>>>>>>>> to "flink-libraries"
>>>>>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
>>>>>>>>> "flink-libraries"
>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>>> "flink-cep",
>>>>>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I
>>>>>>> decided
>>>>>>>>>> against moving flink-contrib there, because rocksdb is in the
>>>>>> contrib
>>>>>>>>>> module, for flink-table, I'm undecided, but I kept it in the main
>>>>>>> repo
>>>>>>>>>> because it's probably going to interact more with the core code in
>>>>>> the
>>>>>>>>>> future)
>>>>>>>>>> I try to preserve the history of those modules when splitting
>>>>> them
>>>>>>> into
>>>>>>>>> the
>>>>>>>>>> new repo
>>>>>>>>>> 4. I'll close all pull requests against those modules in the main
>>>>>>> repo.
>>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>>> repository,
>>>>>>>>>> similar to the main documentation.
>>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>>> documentations
>>>>>>>>>> & link them to each other
>>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>>> repositories
>>>>>>>>>> 8. I'll update the release script to create the Flink release out
>>>>>> of
>>>>>>>> both
>>>>>>>>>> repositories. In order to put the libraries into the opt/ dir of
>>>>>> the
>>>>>>>>>> release, I'll need to change the build of "flink-dist" so that it
>>>>>>> first
>>>>>>>>>> builds flink core, then the libraries and then the core again
>>>>> with
>>>>>>> the
>>>>>>>>>> libraries as an additional dependency.
>>>>>>>>>>
>>>>>>>>>> The main question for the community is: do you agree with point
>>>>> 3 ?
>>>>>>>> Would
>>>>>>>>>> you like to include more or less?
>>>>>>>>>>
>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
>>>>>> trohrmann@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>>> In theory we could have a merging bot which solves the problem
>>>>> of
>>>>>>> the
>>>>>>>>>>> "commit window". Once the PR passes all tests and has enough
>>>>> +1s,
>>>>>>> the
>>>>>>>>> bot
>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the
>>>>>> merge
>>>>>>>>>>> process.
>>>>>>>>>>>
>>>>>>>>>>> I think the second point is actually a disadvantage because
>>>>> there
>>>>>> is
>>>>>>>> not
>>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if
>>>>>> it
>>>>>>>>> lives
>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in
>>>>> the
>>>>>>>> core
>>>>>>>>>>> will most likely go unnoticed for some time in other modules
>>>>> which
>>>>>>> are
>>>>>>>>> not
>>>>>>>>>>> developed so actively. In the worst case these things will only
>>>>> be
>>>>>>>>> noticed
>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>
>>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>>> capacities to
>>>>>>>>>>> maintain such a smooth build process that we can keep all the
>>>>>> code
>>>>>>>> in
>>>>>>>>> a
>>>>>>>>>>> single repository.
>>>>>>>>>>>
>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>>>>> some
>>>>>>>> nice
>>>>>>>>>>> features wrt incrementally building projects. This would be
>>>>>>> beneficial
>>>>>>>>> for
>>>>>>>>>>> local development but it would not solve our build time problems
>>>>>> on
>>>>>>>>> Travis.
>>>>>>>>>>> Gradle intends to introduce a task result cache which allows us to
>>>>>>> reuse
>>>>>>>>>>> results across builds. This could help when building on Travis,
>>>>>>>>> however, it
>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
>>>>>>> Gradle
>>>>>>>>>>> won't come for free (there's simply no free lunch out there) and
>>>>>> we
>>>>>>>>> might
>>>>>>>>>>> risk introducing new bugs. Therefore, I would vote to split the
>>>>>>>>> repository
>>>>>>>>>>> in order to mitigate our current problems with Travis and the
>>>>>> build
>>>>>>>>> time in
>>>>>>>>>>> general. Whether to use a different build system or not can then
>>>>>> be
>>>>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Till
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <sewen@apache.org>
>>>>>>>> wrote:
>>>>>>>>>>>> Some other thoughts on how repository split would help. I am
>>>>> not
>>>>>>> sure
>>>>>>>>> for
>>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>>
>>>>>>>>>>>> - There is less competition for a "commit window". It happens
>>>>> a
>>>>>>> lot
>>>>>>>>>>>> already that you run all tests and want to commit, but there
>>>>> was
>>>>>> a
>>>>>>>>> commit
>>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
>>>>> the
>>>>>>>>>>> meantime.
>>>>>>>>>>>>    For a "linear" commit history, this may become a bottleneck
>>>>>>>>>>> eventually
>>>>>>>>>>>> as well.
>>>>>>>>>>>>
>>>>>>>>>>>> - There is less risk of broken master. If one
>>>>> repository/modules
>>>>>>>>> breaks
>>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>>
>>>>>>>>>>>> Stephan
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
>>>>>>>> trohrmann@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>>> I'd
>>>>>>>> like
>>>>>>>>>>> to
>>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>>> project
>>>>>>>>> has
>>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one
>>>>>>>> repository
>>>>>>>>>>>> using
>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly
>>>>> support
>>>>>>>> this
>>>>>>>>>>> we
>>>>>>>>>>>>> would have to switch our build tool to something like Gradle,
>>>>>> for
>>>>>>>>>>>> example.
>>>>>>>>>>>>> Another option is introducing build profiles for different
>>>>> sets
>>>>>> of
>>>>>>>>>>>> modules
>>>>>>>>>>>>> as well as separating integration and unit tests. The third
>>>>>>>>> alternative
>>>>>>>>>>>>> would be creating sub-projects with their own repositories. I
>>>>>>>> actually
>>>>>>>>>>>>> think that these two proposals are not necessarily exclusive
>>>>> and
>>>>>> it
>>>>>>>>>>> would
>>>>>>>>>>>>> also make sense to have a separation between unit and
>>>>>> integration
>>>>>>>>> tests
>>>>>>>>>>>> if
>>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>>> the
>>>>>>>>>>>> community
>>>>>>>>>>>>> and want to keep everything under the same umbrella. I think
>>>>>> this
>>>>>>> is
>>>>>>>>>>> the
>>>>>>>>>>>>> right way to go, because otherwise some parts of the project
>>>>>> could
>>>>>>>>>>> become
>>>>>>>>>>>>> second class citizens. Given that and that we continue using
>>>>>>> Maven,
>>>>>>>> I
>>>>>>>>>>>> still
>>>>>>>>>>>>> think that creating sub-projects for the libraries, for
>>>>> example,
>>>>>>>> could
>>>>>>>>>>> be
>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and
>>>>>> make
>>>>>>>> it
>>>>>>>>>>>>> potentially easier for libraries to get actively developed.
>>>>> The
>>>>>>> main
>>>>>>>>>>>>> concern is setting up the build infrastructure to aggregate
>>>>> docs
>>>>>>>> from
>>>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>>> Flink's
>>>>>>>> ML
>>>>>>>>>>>>> library being revived again, I'd volunteer investigating first
>>>>>>>> whether
>>>>>>>>>>> it
>>>>>>>>>>>>> is doable establishing a proper incremental build for Flink.
>>>>> If
>>>>>>> that
>>>>>>>>>>>> should
>>>>>>>>>>>>> not be possible, I will look into splitting the repository,
>>>>>> first
>>>>>>>> only
>>>>>>>>>>>> for
>>>>>>>>>>>>> the libraries. I'll share my results with the community once
>>>>> I'm
>>>>>>>> done
>>>>>>>>>>>> with
>>>>>>>>>>>>> the investigation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Till
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
>>>>>>>> rmetzger@apache.org>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
>>>>> open
>>>>>>>>>>> source
>>>>>>>>>>>>>> projects. It only works for private repositories (at least
>>>>> back
>>>>>>>> then
>>>>>>>>>>>> when
>>>>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>> available
>>>>>>>>>>> with
>>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>> I've
>>>>>>>>>>>> recently
>>>>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>> time,
>>>>>>>>>>>>>> introducing build profiles for different components as
>>>>> Aljoscha
>>>>>>>>>>>> suggested
>>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for
>>>>>> the
>>>>>>>>>>>>> testing,
>>>>>>>>>>>>>> I guess we can find a different solution. There are now
>>>>>>> competitors
>>>>>>>>>>> to
>>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
>>>>>>>> source
>>>>>>>>>>>>>> project, or we set up our own infra on a server sponsored by
>>>>>> one
>>>>>>> of
>>>>>>>>>>> the
>>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>> well,
>>>>>>>> then
>>>>>>>>>>> I
>>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into
>>>>> different
>>>>>>>>>>>>>> repositories.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>>>>> opinion.
>>>>>> As
>>>>>>>>>>>> others
>>>>>>>>>>>>>> have mentioned before, we need to consider the following
>>>>>> things:
>>>>>>>>>>>>>> - How are we going to build the documentation? Ideally every
>>>>>> repo
>>>>>>>>>>>> should
>>>>>>>>>>>>>> contain its docs, so we would need to pull them together when
>>>>>>>>>>> building
>>>>>>>>>>>>> the
>>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>>> - How do we organize the dependencies? If we have a library
>>>>>> repository
>>>>>>>>>>>> depend
>>>>>>>>>>>>> on
>>>>>>>>>>>>>> snapshot Flink versions, we need to make sure that the
>>>>> snapshot
>>>>>>>>>>>>> deployment
>>>>>>>>>>>>>> always works. This also means that people working on a
>>>>> library
>>>>>>>>>>>> repository
>>>>>>>>>>>>>> will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>>>>> one
>>>>>>>>>>>>> committer
>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for
>>>>> example
>>>>>>> for
>>>>>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>> currently
>>>>>>>>>>>>>> pretty booked with many other things, so I don't
>>>>> realistically
>>>>>>> see
>>>>>>>>>>>> myself
>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking
>>>>> some
>>>>>>>> time
>>>>>>>>>>>> off.
>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case
>>>>> 5
>>>>>>>> days.
>>>>>>>>>>>> The
>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra
>>>>> stuff,
>>>>>> so
>>>>>>>>>>> many
>>>>>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our
>>>>>> release
>>>>>>>>>>>>> scripts,
>>>>>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <
>>>>>> sewen@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> If we can get incremental builds to work, that would
>>>>>> actually
>>>>>>> be
>>>>>>>>>>>> the
>>>>>>>>>>>>>>> preferred solution in my opinion.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Many companies have invested heavily in making a "single
>>>>>>>>>>> repository"
>>>>>>>>>>>>> code
>>>>>>>>>>>>>>> base work, because it has the advantage of not having to
>>>>>>>>>>>> update/publish
>>>>>>>>>>>>>>> several repositories first.
>>>>>>>>>>>>>>> However, the strong prerequisite for that is an incremental
>>>>>>> build
>>>>>>>>>>>>> system
>>>>>>>>>>>>>>> that builds only (fine grained) what it has to build. I am
>>>>> not
>>>>>>>> sure
>>>>>>>>>>>> how
>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>> could make that work
>>>>>>>>>>>>>>> with Maven and Travis...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <
>>>>>>> code@greghogan.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> An additional option for reducing time to build and test is
>>>>>>>>>>>> parallel
>>>>>>>>>>>>>>>> execution. This would help users more than on TravisCI
>>>>> since
>>>>>>>>>>> we're
>>>>>>>>>>>>>>>> generally running on multi-core machines rather than VM
>>>>>> slices.
>>>>>>>>>>>>>>>> Is the idea that each user would only check out the modules
>>>>>>> that
>>>>>>>>>>> he
>>>>>>>>>>>>> or
>>>>>>>>>>>>>>> she
>>>>>>>>>>>>>>>> is developing with? For example, if a developer is not
>>>>>> working
>>>>>>> on
>>>>>>>>>>>>>>>> flink-mesos or flink-yarn then the "flink-deploy" module
>>>>>> would
>>>>>>>>>>> not
>>>>>>>>>>>> be
>>>>>>>>>>>>>>> cloned
>>>>>>>>>>>>>>>> to their filesystem?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We can run a TravisCI nightly build on each repo to
>>>>> validate
>>>>>>>>>>>> against
>>>>>>>>>>>>>> API
>>>>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Greg
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 22, 2017 at 12:24 PM, Fabian Hueske <
>>>>>>>>>>> fhueske@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> Hi everybody,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think this should be a discussion about the benefits and
>>>>>>>>>>>>> drawbacks
>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> separating the code into distinct repositories from a
>>>>>>>>>>> development
>>>>>>>>>>>>>> point
>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> view.
>>>>>>>>>>>>>>>>> So I agree with Stephan that we should not divide the
>>>>>>> community
>>>>>>>>>>>> by
>>>>>>>>>>>>>>>> creating
>>>>>>>>>>>>>>>>> separate groups of committers.
>>>>>>>>>>>>>>>>> Also the discussion about independent releases is not
>>>>>>>>>>> strictly
>>>>>>>>>>>>>>> related
>>>>>>>>>>>>>>>>> to the decision, IMO.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I see a few pros and cons for splitting the code base into
>>>>>>>>>>>> separate
>>>>>>>>>>>>>>>>> repositories which (I think) haven't been mentioned
>>>>> before:
>>>>>>>>>>>>>>>>> pros:
>>>>>>>>>>>>>>>>> - IDE setup will be leaner. It is not necessary to compile
>>>>>> the
>>>>>>>>>>>>> whole
>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>> base to run a test after switching a branch.
>>>>>>>>>>>>>>>>> cons:
>>>>>>>>>>>>>>>>> - developing library features that require changes in
>>>>> the
>>>>>>>>>>> core
>>>>>>>>>>>> /
>>>>>>>>>>>>>> APIs
>>>>>>>>>>>>>>>>> becomes more time consuming due to back-and-forth between
>>>>>> code
>>>>>>>>>>>>> bases.
>>>>>>>>>>>>>>>>> However, I think this is not very often the case.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Aljoscha has good points as well. Many of the build issues
>>>>>>>>>>> could
>>>>>>>>>>>> be
>>>>>>>>>>>>>>>> solved
>>>>>>>>>>>>>>>>> by different build profiles and configurations.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-02-22 14:59 GMT+01:00 Gábor Hermann <
>>>>>>>>>>> mail@gaborhermann.com>
>>>>>>>>>>>>> :
>>>>>>>>>>>>>>>>>> @Stephan:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Although I tried to raise some issues about splitting
>>>>>>>>>>>> committers,
>>>>>>>>>>>>>> I'm
>>>>>>>>>>>>>>>>>> still strongly in favor of some kind of restructuring. We
>>>>>>>>>>> just
>>>>>>>>>>>>> have
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>> conscious about the disadvantages.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Not splitting the committers could leave the libraries in
>>>>>> the
>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>>> stalling status, described by Till. Of course, dedicating
>>>>>>>>>>>> current
>>>>>>>>>>>>>>>>>> committers as shepherds of the libraries could easily
>>>>>> resolve
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> issue.
>>>>>>>>>>>>>>>>>> But that requires time from current committers. It seems
>>>>>> like
>>>>>>>>>>>>>>>> trade-offs
>>>>>>>>>>>>>>>>>> between code quality, speed of development, and committer
>>>>>>>>>>>>> efforts.
>>>>>>>>>>>>>>>>>>  From what I see in the discussion about ML, there are
>>>>> many
>>>>>>>>>>>> people
>>>>>>>>>>>>>>>> willing
>>>>>>>>>>>>>>>>>> to contribute as well as production use-cases. This means
>>>>>> we
>>>>>>>>>>>>> could
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> should move forward. However, the development speed is
>>>>>>>>>>>>>> significantly
>>>>>>>>>>>>>>>>> slowed
>>>>>>>>>>>>>>>>>> down by stalling PRs. The proposal for contributors
>>>>> helping
>>>>>>>>>>> the
>>>>>>>>>>>>>>> review
>>>>>>>>>>>>>>>>>> process did not really work out so far. In my opinion,
>>>>>> either
>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>> quality
>>>>>>>>>>>>>>>>>> (by more easily accepting new committers) or some
>>>>> committer
>>>>>>>>>>>> time
>>>>>>>>>>>>>>>>>> (reviewing/merging) should be sacrificed to move forward.
>>>>>> As
>>>>>>>>>>>> Till
>>>>>>>>>>>>>> has
>>>>>>>>>>>>>>>>>> indicated, it would be shameful if we let this
>>>>> contribution
>>>>>>>>>>>>> effort
>>>>>>>>>>>>>>> die.
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> Gabor
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>
>>>
>

