couchdb-dev mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: On dependency management and CI issues associated with it
Date Wed, 13 Apr 2016 16:19:42 GMT

Keeping fauxton in a separate repo makes sense. It has a different release cycle. It's genuinely
decoupled. Getting all the Erlang into one repo is really the goal. 

With couch_epi as a core application, anyone can extend and customise couchdb by adding another
dependency.  At most, we might identify new places for epi hooks as we go. 

> On 13 Apr 2016, at 17:11, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> 
> Hello everybody!
> 
> Wow, 56 repos! Hopefully we get an award somewhere for that. I've
> listed the repositories below in some crude groups to try and give an
> idea of what we're working with. I have to agree that this is getting
> a bit on the ridiculous side. Of all of the repos that the ASF
> actually develops I'm only seeing four or so Erlang apps (b64url,
> config, couch-collate, and khash) that are likely truly re-usable
> outside of CouchDB without mucking about.
> 
> I have previously (years ago) played around with trying to go back to
> a single repository. It generally works fine; the only issue I found
> was that rebar's {vsn, git} tag for *.app.src files doesn't work in a
> single repo, and it gets a bit complicated managing that by hand.
> However, I think it would be possible to add something like rebar's
> deps files and a small custom tool that either a) breaks the build if
> versions haven't changed or b) even better, automatically sets
> application versions based on a source file (and tweaks them like git
> describe when there's been a commit since the version). This lets us
> continue to "alias" commits with a human readable version and doesn't
> require a single version across all applications. (Alternatively, we
> could wipe the version info on every project and set a single version
> that is the same for all applications that matches the CouchDB
> version, but this might get weird for upstream dependencies).
> 
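[Editor's sketch] Option (b) above could be approximated along the following lines. This is only a rough illustration under stated assumptions: the helper names (describe_version, render_app_src) are hypothetical and not part of rebar or CouchDB, the base version is assumed to come from a per-application source file, and the commit count and short SHA are assumed to be obtained from git separately.

```python
import re


# Hypothetical helpers for illustration; not part of rebar or CouchDB.
def describe_version(base_version, commits_since, short_sha):
    """Return base_version unchanged if nothing was committed since it was
    set; otherwise append a `git describe`-style suffix."""
    if not re.fullmatch(r"\d+(\.\d+)*", base_version):
        raise ValueError(f"unexpected version string: {base_version!r}")
    if commits_since < 0:
        raise ValueError("commits_since must be non-negative")
    if commits_since == 0:
        return base_version
    return f"{base_version}-{commits_since}-g{short_sha}"


def render_app_src(template, version):
    """Replace the {vsn, git} placeholder in *.app.src text with a literal
    version string."""
    return template.replace("{vsn, git}", '{vsn, "%s"}' % version)
```

Keeping the version logic in pure functions like these would let each application carry its own version while a build step fills in the *.app.src files.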
> That said, I'd agree with Bob that the new dependency format seems to
> be solving a problem we shouldn't have. I'd rather just pull
> everything into a single repo and use tooling to help maintain any
> sharp edges like the versioning issue I mentioned above.
> 
> Personally, what I'd like to see is to have all Erlang repos merged
> into the main couchdb.git repo and then have all upstream dependencies
> managed by git-subtree. I could go either way on having the node and
> spidermonkey view engines included or not. For the non Erlang parts of
> our release (fauxton and documentation) I'd keep them as separate
> repos so that their tooling doesn't need to be changed and/or adapted
> to work out of a repo subdirectory. The administration things also
> seem to make good sense to keep separate as they're not part of the
> product/release tarball/whatever.
> 
> If anyone has a strong objection to a monolithic Erlang repo I'd like
> to hear it. Otherwise I may work up a lengthier and more thorough
> proposal for dev@ to consider consolidating all of these repositories
> for sanity and profit.
> 
> Paul
> 
> Main repo:
> 
> couchdb.git
> 
> 
> Erlang repos developed by ASF:
> 
> couchdb-b64url.git
> couchdb-cassim.git
> couchdb-chttpd.git
> couchdb-config.git
> couchdb-couch-collate.git
> couchdb-couch-dbupdates.git
> couchdb-couch-epi.git
> couchdb-couch-event.git
> couchdb-couch-httpd.git
> couchdb-couch-index.git
> couchdb-couch-log-lager.git
> couchdb-couch-log.git
> couchdb-couch-mrview.git
> couchdb-couch-plugins.git
> couchdb-couch-replicator.git
> couchdb-couch-stats.git
> couchdb-couch.git
> couchdb-ddoc-cache.git
> couchdb-erlang-tests.git
> couchdb-ets-lru.git
> couchdb-fabric.git
> couchdb-global-changes.git
> couchdb-ioq.git
> couchdb-khash.git
> couchdb-mango.git
> couchdb-mem3.git
> couchdb-peruser.git
> couchdb-rexi.git
> couchdb-setup.git
> couchdb-snappy.git
> couchdb-twig.git
> 
> 
> Non-Erlang things we develop as part of a release:
> 
> couchdb-fauxton.git
> couchdb-documentation.git
> 
> 
> Mirrored repos of upstream Erlang deps:
> 
> couchdb-bear.git
> couchdb-folsom.git
> couchdb-goldrush.git
> couchdb-ibrowse.git
> couchdb-jiffy.git
> couchdb-lager.git
> couchdb-meck.git
> couchdb-mochiweb.git
> couchdb-oauth.git
> couchdb-rebar.git
> 
> 
> Query Servers:
> 
> couchdb-query-server-node.git
> couchdb-query-server-spidermonkey.git
> 
> 
> Unsure but has Erlang in it:
> 
> couchdb-examples.git
> 
> 
> Project Administrative Things Kinda:
> 
> couchdb-admin.git
> couchdb-ci.git
> couchdb-docker.git
> couchdb-www.git
> 
> 
> Client Library:
> 
> couchdb-nano.git
> 
> 
> JS CLI tool:
> 
> couchdb-nmo.git
> 
> 
> Empty:
> 
> couchdb-javascript-tests.git
> 
> Legacy:
> 
> couchdb-futon.git
> couchdb-jquery-couch.git
> 
> 
>> On Wed, Apr 13, 2016 at 3:41 AM, Garren Smith <garren@apache.org> wrote:
>> I like the idea of going back to a single repo for core db features. I
>> would like Fauxton to still be in its own repo.
>> As someone who wrote some very basic Erlang code for CouchDB recently, I
>> found the multiple repos quite tricky to manage and I couldn't see how it
>> made anything easier.
>> 
>>> On Wed, Apr 13, 2016 at 8:35 AM, Alexander Shorin <kxepal@gmail.com> wrote:
>>> 
>>> Hi Robert,
>>> 
>>> The point about flattening to a single repository is valid: in the end,
>>> our app repos are in a broken state all the time because they do not
>>> declare their dependencies. So no one can pick fabric@master and run it;
>>> they would spend quite a lot of time figuring out the right versions of
>>> the deps. The idea of solving the problem by reducing the set of
>>> repositories we have to test is good.
>>> 
>>> 
>>> Hi Ilya,
>>> 
>>> I have an alternative solution for you:
>>> 
>>> - Turn off Travis CI wherever we cannot be sure about testing without
>>> dependent PRs (everything except third-party modules, fauxton, docs,
>>> and a few more independent projects like couch-epi);
>>> - Require everyone to submit an additional PR to the apache/couchdb repo
>>> with the updated commit hashes;
>>> - Run CI testing on this apache/couchdb PR;
>>> - If you rebase/update any of your subcomponent PRs, you must update the
>>> commit hash on the apache/couchdb one.
>>> 
>>> Pros:
>>> - We won't forget to update rebar.config when new changes land;
>>> - We will always run complete integration testing with all the right
>>> deps states;
>>> - We won't have to invent any complicated integration solutions to
>>> deal with sub-repos testing;
>>> - No new steps/files/work are introduced, so there is no learning curve
>>> to worry about.
>>> 
>>> Cons:
>>> - The Travis builder needs some trickery to work out which remote
>>> (fork) the new rebar.config hashes live on in order to check them out
>>> correctly, though that is not rocket science since we have access to
>>> the git information there.
>>> 
>>> The role of Jenkins CI here is to ensure that master and the releases
>>> build correctly on the various OSes.
>>> 
>>> That sounds simpler and better to me; how does it sound to you?
>>> 
>>> --
>>> ,,,^..^,,,
>>> 
>>> 
>>> On Wed, Apr 13, 2016 at 12:37 AM, Robert Samuel Newson
>>> <rnewson@apache.org> wrote:
>>>> I'd like us to instead consider flattening to a single repository. I've
>>>> found no value and only pain from the multiple repositories approach (43
>>>> in total!).
>>>> 
>>>> The contention is that multiple repositories enforce application
>>>> boundaries (low coupling / high cohesion) but I've not felt that in
>>>> reality. We don't, and couldn't meaningfully, release any of our
>>>> components separately, and, as Ilya makes clear, many enhancements
>>>> require changes to multiple repositories, and we break this into multiple
>>>> commits, losing the ability to look at an enhancement in toto.
>>>> 
>>>> If what Ilya is proposing is the solution, I think it's the solution to
>>>> a problem we should not have.
>>>> 
>>>> B.
>>>> 
>>>>> On 12 Apr 2016, at 16:22, Ilya Khlopotov <iilyak@ca.ibm.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> Dear community,
>>>>> 
>>>>> 
>>>>> There is a problem with the contributor workflow which renders our CI
>>>>> system useless. As you might know, the couchdb project consists of
>>>>> multiple repositories. Most of the time changes cross repository
>>>>> boundaries. When this happens, a push to any of the repositories causes
>>>>> CI failures. CI fails since it uses the old versions of dependencies
>>>>> from the main repository of the project. Here is what we can do about it.
>>>>> 
>>>>> # Proposal
>>>>> 
>>>>> Let's use multiple files for dependency management.
>>>>> 
>>>>> - deps.json - serves the same purpose as the dependencies list in the
>>>>> current rebar.config.script
>>>>> - proposed.deps.json - here we specify the list of PRs we want to commit
>>>>> atomically
>>>>> - override.deps.json - a local file outside of version control which we
>>>>> consult in order to include development tools specific to a contributor
>>>>> (code reloader, debugger, tracer, profiler, binpp, ...)
>>>>> 
>>>>> Below is an example of the content of these files:
>>>>> 
>>>>> ## deps.json
>>>>> {
>>>>>   "src/b64url": [
>>>>>       "https://github.com/apache/couchdb-b64url",
>>>>>       "6895652d80f95cdf04efb14625abed868998f174"
>>>>>   ],
>>>>>   "src/cassim": [
>>>>>       "https://github.com/apache/couchdb-cassim",
>>>>>       "9bbfe82125284fa7cb3317079e8bc1dc876a07bf"
>>>>>   ],
>>>>>   "src/chttpd": [
>>>>>       "https://github.com/apache/couchdb-chttpd",
>>>>>       "54e8f6147486d9afc5245e0143d15a4dd1185654"
>>>>>   ],
>>>>>   "src/meck": [
>>>>>       "https://github.com/apache/couchdb-meck",
>>>>>       "tree/0.8.2"
>>>>>   ],
>>>>> ....
>>>>> }
>>>>> 
>>>>> ## proposed.deps.json
>>>>> {
>>>>>   "src/couch": "https://github.com/apache/couchdb-couch/pull/124",
>>>>>   "src/chttpd": "https://github.com/apache/couchdb-chttpd/pull/108",
>>>>>   "src/couch_tests": [
>>>>>       "https://github.com/apache/couchdb-erlang-tests",
>>>>>       "tree/branch"
>>>>>   ]
>>>>> }
>>>>> 
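[Editor's sketch] To make the relationship between the two files concrete, here is a minimal Python sketch of how a tool might combine them. The function names (classify, effective_deps) are hypothetical, and two readings of the examples are assumptions rather than anything the proposal states: that a "tree/..." value names a branch or tag, and that an entry in proposed.deps.json takes precedence over the entry for the same path in deps.json.

```python
import re

# Assumed shape of a GitHub pull request URL, as in the examples above.
PR_URL = re.compile(r"^https://github\.com/([^/]+)/([^/]+)/pull/(\d+)$")


def classify(entry):
    """Turn one *.deps.json value into a small tagged dict."""
    if isinstance(entry, str):
        match = PR_URL.match(entry)
        if match is None:
            raise ValueError(f"not a pull request URL: {entry}")
        owner, repo, number = match.groups()
        return {"kind": "pr",
                "repo": f"https://github.com/{owner}/{repo}",
                "number": int(number)}
    url, ref = entry
    if ref.startswith("tree/"):
        # Assumption: "tree/<name>" refers to a branch or tag called <name>.
        return {"kind": "ref", "repo": url, "ref": ref[len("tree/"):]}
    return {"kind": "commit", "repo": url, "sha": ref}


def effective_deps(deps, proposed=None, release=False):
    """Overlay proposed entries on deps unless a release checkout is wanted."""
    merged = dict(deps)
    if proposed and not release:
        merged.update(proposed)
    return {path: classify(entry) for path, entry in merged.items()}
```

With release=True the proposed entries are skipped, which mirrors the described behaviour of `git propose checkout --release`.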
>>>>> # Interface
>>>>> 
>>>>> I propose to write a simple CLI tool to work with this structure.
>>>>> Below is a list of commands which we need to support (for a minimal
>>>>> version).
>>>>> 
>>>>> ## Adding new dependency
>>>>> 
>>>>> git propose add https://github.com/apache/couchdb-foo \
>>>>>     a2d5ad2eedc960248b806f61df0a1009462bdb46
>>>>> git propose add https://github.com/apache/couchdb-bar tree/branch_name
>>>>> 
>>>>> ## Adding new PR to the change set
>>>>> 
>>>>> git propose add https://github.com/apache/couchdb-config/pull/4
>>>>> 
>>>>> ## Checking out right dependencies
>>>>> 
>>>>> git propose checkout
>>>>> 
>>>>> ## Checking out release
>>>>> 
>>>>> git propose checkout --release # this would ignore proposed.deps.json if it exists
>>>>> 
>>>>> ## Merge the change
>>>>> 
>>>>> This command would do the following:
>>>>> - Parse proposed.deps.json
>>>>> - Retrieve the merge commit sha for every PR (exit if a dependency is
>>>>> not merged yet)
>>>>> - Update dependencies in deps.json with the correct merge commit sha
>>>>> - Remove proposed.deps.json
>>>>> 
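[Editor's sketch] The merge steps listed above might be sketched as follows. Everything here is illustrative: merge_proposed is a hypothetical name, and merge_sha_for is an injected callable standing in for whatever GitHub API lookup the real tool would perform (it is assumed to return the merge commit sha for a PR URL, or None if the PR is not merged yet). Reading and deleting the JSON files is left out so that the core logic stays a pure function.

```python
def merge_proposed(deps, proposed, merge_sha_for):
    """Return a copy of deps with every proposed PR replaced by
    [repo_url, merge_commit_sha].

    Raises RuntimeError as soon as one PR turns out not to be merged,
    so the caller can exit without touching deps.json.
    """
    updated = dict(deps)
    for path, entry in proposed.items():
        if not isinstance(entry, str):
            # Assumption: [url, ref] entries have no PR to resolve and are
            # copied over unchanged.
            updated[path] = list(entry)
            continue
        repo_url, sep, _number = entry.rpartition("/pull/")
        if not sep:
            raise ValueError(f"not a pull request URL: {entry}")
        sha = merge_sha_for(entry)
        if sha is None:
            raise RuntimeError(f"{entry} is not merged yet")
        updated[path] = [repo_url, sha]
    return updated
```

A caller would write the returned mapping back to deps.json and only then remove proposed.deps.json, so a failed lookup leaves both files as they were.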
>>>>> # Workflow
>>>>> 
>>>>> export GIT_EXEC_PATH=`pwd`/bin # or use tools like `direnv`
>>>>> git checkout -b feature-ZZZ
>>>>> cd src/X && hack dependency X
>>>>> cd ../..
>>>>> cd src/Y && hack dependency Y
>>>>> issue PRs for X and Y
>>>>> cd ../..
>>>>> git propose add https://github.com/apache/couchdb-X/pull/4
>>>>> git propose add https://github.com/apache/couchdb-Y/pull/49
>>>>> git add proposed.deps.json
>>>>> git commit -m "Commit feature {something} which does {a thing} and can be tested as {procedure}"
>>>>> git push origin feature-ZZZ
>>>>> ^ this would trigger our CI
>>>>> CI would do
>>>>> git propose checkout && ./configure && make check
>>>>> 
>>>>> # Pros and Cons
>>>>> 
>>>>> ## Pros
>>>>> 
>>>>> - Changes are merged atomically
>>>>> - CI runs against expected versions of deps
>>>>> - Enables git bisect
>>>>> - Reduces the tasks that need to be done by an ASF committer (no need
>>>>> to update rebar.config.script manually)
>>>>> - Simplifies testing of PRs by reviewers
>>>>> - Simplifies rebar.config since rebar is not used for managing deps
>>>>> 
>>>>> ## Cons
>>>>> 
>>>>> - some github.com specifics (the concept of PRs and access to the GitHub
>>>>> API to get info about a PR)
>>>>> - we need to have github.com as one of the remotes
>>>>> - we trigger CI only on push to main repository
>>>>> 
>>>>> # Implementation
>>>>> 
>>>>> We write a git-propose script in python and place it in ./bin. We add
>>>>> ./bin to either GIT_EXEC_PATH or PATH. You can always call the script
>>>>> directly (as ./bin/git-propose) if you don't like amending your
>>>>> environment.
>>>>> 
>>>>> # Later improvements
>>>>> 
>>>>> - We can issue PRs from the tool itself
>>>>> - We can merge from the tool itself
>>>>> - We can implement support for multiple remotes (asf, github, private)
>>>>> - We can implement support for multiple git transports (for the first
>>>>> version all repositories in *.deps.json files would use https://)
>>>>> 
>>>>> Sincerely,
>>>>> ILYA KHLOPOTOV
>>> 

