couchdb-dev mailing list archives

From Garren Smith <>
Subject Re: [DISCUSS] Multiple Repositories for Erlang Apps and Dependencies
Date Fri, 17 Jan 2014 12:00:13 GMT
I'm claiming 2nd person added!

On 17 Jan 2014, at 1:28 PM, Noah Slater <> wrote:

> Psst. A little birdy tells me that if you ask nicely, the infra folks
> will add you to the Apache GitHub org too, so you can show off your
> Apache affiliation. I was the first person added. Because I may have
> been the first to ask. ;)
> On 17 January 2014 11:56, Noah Slater <> wrote:
>> Awesome, thanks Paul.
>> Note to all devs: if you want your contributions to CouchDB to show up
>> on your GitHub profile, you have to star each of the repositories.
>> (That's just how GitHub mechanics work for repo mirrors.)
>> You can find them all here:
>> On 17 January 2014 00:00, Paul Davis <> wrote:
>>> New repos are up:
>>> I'm gonna go through and initialize them with history from master or
>>> one of the bigcouch and rcouch branches as appropriate.
>>> On Thu, Jan 16, 2014 at 2:12 PM, Paul Davis <> wrote:
>>>> Infrastructure ticket opened:
>>>> On Thu, Jan 16, 2014 at 1:42 PM, Jan Lehnardt <> wrote:
>>>>> On 16 Jan 2014, at 20:42, Paul Davis <> wrote:
>>>>>> It doesn't appear that this is objectionable to anyone. Does anyone
>>>>>> have an objection to us having infra/me create these repos to use
>>>>>> the bigcouch/rcouch merge work? This won't affect master or releases
>>>>>> until those merges finish.
>>>>> no objections.
>>>>> Jan
>>>>> --
>>>>>> On Tue, Jan 14, 2014 at 11:02 PM, Paul J Davis
>>>>>> <> wrote:
>>>>>>>> On Jan 14, 2014, at 8:37 PM, Benoit Chesneau <> wrote:
>>>>>>>> On Wed, Jan 15, 2014 at 12:22 AM, Paul Davis <> wrote:
>>>>>>>>> I've recently been having discussions about how to handle
>>>>>>>>> repository configuration for various bits of CouchDB post-merge.
>>>>>>>>> The work that Benoit has been doing on the rcouch merge branch
>>>>>>>>> has also touched on this topic.
>>>>>>>>> The background for those unfamiliar is that the standard
>>>>>>>>> procedure for Erlang is to have a single Erlang application per
>>>>>>>>> repository and then rely on rebar to fetch each dependency.
>>>>>>>>> Traditionally in CouchDB land we've always just included the
>>>>>>>>> source to all applications in a single monolithic repository and
>>>>>>>>> reimported changes from upstream dependencies.
>>>>>>>>> Recently rcouch changed from the monolithic repository to use
>>>>>>>>> external repositories for some dependencies. Originally BigCouch
>>>>>>>>> used an even more federated scheme that had each Erlang
>>>>>>>>> application in an external repository (and the core couch Erlang
>>>>>>>>> application was in the root repository). When Bob Newson and I
>>>>>>>>> did the initial hacking on the BigCouch merge we pulled those
>>>>>>>>> external dependencies into the root repository, reverting back
>>>>>>>>> to the large monolithic approach.
>>>>>>>>> After trying to deal with the merge and contemplating how
>>>>>>>>> various Erlang release things might work, it's become fairly
>>>>>>>>> apparent that the monolithic approach is a bit constrictive. For
>>>>>>>>> instance, part of rebar's versioning abilities lets you tag
>>>>>>>>> repositories to generate versions rather than manually updating
>>>>>>>>> versions in source.
>>>>>>>>> Another thing I've found on other projects is that having each
>>>>>>>>> application in a separate repository requires developers to
>>>>>>>>> think in a bit more detail about the public internal interfaces
>>>>>>>>> used throughout the system. We've done some work to this extent
>>>>>>>>> already by separating source directories, but forcing commits to
>>>>>>>>> multiple repositories shoots up a big red flag that maybe
>>>>>>>>> there's a high level of coupling between two bits of code.
>>>>>>>>> Another benefit of the multiple repository setup is that it
>>>>>>>>> possibly lends itself to being integrated with the proposed
>>>>>>>>> plugin system. It'd be fairly trivial to have a script that went
>>>>>>>>> and fetched plugins that aren't developed at Apache (as a
>>>>>>>>> ./configure time switch type of thing). Having a system like
>>>>>>>>> this would also allow groups focused on particular bits of
>>>>>>>>> development to not have to concern themselves with the unrelated
>>>>>>>>> parts of the system.
>>>>>>>>> Given all that, I'd like to propose that we move to having a
>>>>>>>>> repository for each application/dependency that we use to build
>>>>>>>>> CouchDB. Each repository would be hosted on ASF infra and
>>>>>>>>> mirrored to GitHub as expected. This means that we could have
>>>>>>>>> the root repository be a simple repo that contains
>>>>>>>>> packaging/release/build stuff that would enable lots of the
>>>>>>>>> ideas offered on configurable types of release generation. I've
>>>>>>>>> included an initial list of repositories at the end of this
>>>>>>>>> email. It's basically just the apps that have been split out in
>>>>>>>>> either rcouch or bigcouch plus a few other bits from CouchDB
>>>>>>>>> master.
>>>>>>>>> I would also point out that even though our main repo would need
>>>>>>>>> to fetch other dependencies from the internet to build the final
>>>>>>>>> output, we fully intend that our release tarballs would *not*
>>>>>>>>> have this requirement. I.e., when we go to cut a release, part
>>>>>>>>> of the process the RM runs would be to pull all of those
>>>>>>>>> dependencies before creating a tarball that would be wholly
>>>>>>>>> self-contained. Given an apache-couchdb-x.y.z.tar.gz release
>>>>>>>>> file, there won't be a requirement to have access to the ASF git
>>>>>>>>> repos.
>>>>>>>>> I'm not entirely sure how controversial this is for anyone. For
>>>>>>>>> the most part the reactions I remember hearing were more
>>>>>>>>> concerned with whether the infrastructure team would allow us to
>>>>>>>>> use this sort of configuration. I looked yesterday and asked,
>>>>>>>>> and apparently it's something we can request, but as always
>>>>>>>>> we'll want to verify again if we have consensus to move in this
>>>>>>>>> direction.
>>>>>>>>> Anyone have comments or flames? Right now I'm just interested in
>>>>>>>>> feeling out what sort of (lack of?) consensus there is on such a
>>>>>>>>> change. If there's general consensus I'd think we'd do a vote in
>>>>>>>>> a couple weeks and if that passes then start on down this road
>>>>>>>>> for the two merge projects, and then it would become part of
>>>>>>>>> master once those land (as opposed to doing this to master and
>>>>>>>>> then attempting to merge rcouch/bigcouch onto that somehow).
>>>>>>>>> This is a quick pass at listing what extra repositories I'd have
>>>>>>>>> created. Some of these applications only exist in the bigcouch
>>>>>>>>> and/or rcouch branches so that's where the unfamiliar
>>>>>>>>> application names are from. I'd also point out that the
>>>>>>>>> documentation and fauxton things are just on a whim in that we
>>>>>>>>> could decouple that development from the Erlang development. I
>>>>>>>>> can see arguments for and against those. I'm much less concerned
>>>>>>>>> with that aspect than the Erlang parts that are directly
>>>>>>>>> affected by rebar/Erlang conventions.
>>>>>>>>>  chttpd
>>>>>>>>>  config
>>>>>>>>>  couch
>>>>>>>>>  couch_collate
>>>>>>>>>  couch_dbupdates
>>>>>>>>>  couch_httpd
>>>>>>>>>  couch_index
>>>>>>>>>  couch_mrview
>>>>>>>>>  couch_plugins
>>>>>>>>>  couch_replicator
>>>>>>>>>  documentation
>>>>>>>>>  ddoc_cache
>>>>>>>>>  ets_lru
>>>>>>>>>  fabric
>>>>>>>>>  fauxton
>>>>>>>>>  ibrowse
>>>>>>>>>  jiffy
>>>>>>>>>  mem3
>>>>>>>>>  mochiweb
>>>>>>>>>  oauth
>>>>>>>>>  rebar
>>>>>>>>>  rexi
>>>>>>>>>  snappy
>>>>>>>>>  twig
>>>>>>>> I also contemplated this and I am generally +1 on this. And
>>>>>>>> definitely +1 to mirror them on the apache git if possible. I
>>>>>>>> have a couple of comments though.
>>>>>>>> Initially I also had everything separated in its own source
>>>>>>>> repository. One year ago I merged the couchdb erlang applications
>>>>>>>> back into one core repo and put all the dependencies in the
>>>>>>>> refuge repository, or in the refuge CDN for the spidermonkey and
>>>>>>>> ICU sources.
>>>>>>>> I merged the couchdb erlang applications back into one core repo
>>>>>>>> because they were a little too dependent on each other,
>>>>>>>> especially couch_httpd, couch_index and couch_mrview. These
>>>>>>>> applications cannot yet stand on their own.
>>>>>>>> Imo if we split everything into their own apps, then we should
>>>>>>>> make sure that couch_httpd can be used without couch_index and
>>>>>>>> couch_mrview (which means that "all_docs" is available in
>>>>>>>> couch_httpd). Also we should be able to just launch couch without
>>>>>>>> any of the above, and probably without the need of an ini. The
>>>>>>>> couch_query_server module thing is an interesting case.
>>>>>>>> bigcouch is also introducing `ddoc_cache`, and I am not sure why
>>>>>>>> it is provided as a standalone app. Does it mean it can be
>>>>>>>> replaced by another application eventually? Why not simply have
>>>>>>>> it in the couch application? Does it need to be updated
>>>>>>>> separately?
>>>>>>>> Also, all our base applications should be namespaced correctly so
>>>>>>>> they will be strictly identified as erlang modules: "config" is
>>>>>>>> too generic, "ddoc_cache" too. Others are probably OK.
>>>>>>>> There are probably other things that we could provide as separate
>>>>>>>> apps:
>>>>>>>> - couch_daemon
>>>>>>>> - couch_js
>>>>>>>> - couch_external
>>>>>>>> - couch_stats
>>>>>>>> - couch_compaction_daemon
>>>>>>>> - couch_httpd_proxy
>>>>>>>> Anyway, again I'm +1 for this move. I really think it's a good
>>>>>>>> idea.
>>>>>>>> - benoit
>>>>>>> I agree on most of this. Roughly I see three general points.
>>>>>>> First, deciding on whether some things are external deps is
>>>>>>> definitely up for discussion. Whether couch_mrview is a different
>>>>>>> app/repo is not necessarily clear cut. Personally I think I
>>>>>>> over-engineered couch_index, which blurs the lines a bit. If I
>>>>>>> could wave a wand I'd have just couch_mrview and it'd be separate.
>>>>>>> More importantly I think the separate repos make these things more
>>>>>>> apparent. The fact we're discussing this sort of architecture
>>>>>>> thing is suggestive that it's forcing us to think a bit harder.
>>>>>>> Second is the aspect of composability. For instance the mrview
>>>>>>> thing to me is obviously a different repo precisely so a user
>>>>>>> could import couch (_core?) directly without requiring the
>>>>>>> SpiderMonkey dependency. The monolithic repo doesn't allow this
>>>>>>> without some very non-standard tooling.
>>>>>>> Thirdly, app naming is always a contention. The config name was
>>>>>>> actually a hot code upgrade concern. We couldn't reuse
>>>>>>> couch_config directly at the time. And Adam was also hopeful we
>>>>>>> could turn it into a useful non-specific config app.
>>>>>>> Fourthly, and related to secondly, we'll also want to look at
>>>>>>> splitting other apps out as necessary. The ones you listed I think
>>>>>>> aren't controversial, it's just that no one has done it yet. My
>>>>>>> list was purely what existed so far without attempting to carve
>>>>>>> things up more. I definitely agree we should carve more; I just
>>>>>>> wanted to cover consensus that carving is the right direction.
>>>>>>> Fifthly, I'm done typing on my phone. I'll fill in more thoughts
>> --
>> Noah Slater
> -- 
> Noah Slater
