couchdb-dev mailing list archives

From Noah Slater <nsla...@apache.org>
Subject Re: [DISCUSS] Multiple Repositories for Erlang Apps and Dependencies
Date Fri, 17 Jan 2014 10:56:33 GMT
Awesome, thanks Paul.

Note to all devs: if you want your contributions to CouchDB to show up
on your GitHub profile, you have to star each of the repositories.
(That's just how GitHub mechanics work for repo mirrors.)

You can find them all here:

https://github.com/apache

On 17 January 2014 00:00, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> New repos are up: https://git-wip-us.apache.org/repos/asf?s=couchdb
>
> I'm gonna go through and initialize them with history from master or
> one of the bigcouch and rcouch branches as appropriate.
>
> On Thu, Jan 16, 2014 at 2:12 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>> Infrastructure ticket opened: https://issues.apache.org/jira/browse/INFRA-7203
>>
>> On Thu, Jan 16, 2014 at 1:42 PM, Jan Lehnardt <jan@apache.org> wrote:
>>>
>>> On 16 Jan 2014, at 20:42 , Paul Davis <paul.joseph.davis@gmail.com> wrote:
>>>
>>>> It doesn't appear that this is objectionable to anyone. Does anyone
>>>> have an objection to us having infra/me create these repos to use for
>>>> the bigcouch/rcouch merge work? This won't affect master or releases
>>>> until those merges finish.
>>>
>>> no objections.
>>>
>>> Jan
>>> --
>>>
>>>>
>>>> On Tue, Jan 14, 2014 at 11:02 PM, Paul J Davis
>>>> <paul.joseph.davis@gmail.com> wrote:
>>>>>
>>>>>
>>>>>> On Jan 14, 2014, at 8:37 PM, Benoit Chesneau <bchesneau@gmail.com> wrote:
>>>>>>
>>>>>> On Wed, Jan 15, 2014 at 12:22 AM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>>>>>>
>>>>>>> I've recently been having discussions about how to handle the
>>>>>>> repository configuration for various bits of CouchDB post-merge.
>>>>>>> The work that Benoit has been doing on the rcouch merge branch has
>>>>>>> also touched on this topic.
>>>>>>>
>>>>>>> The background for those unfamiliar is that the standard operating
>>>>>>> procedure for Erlang is to have a single Erlang application per
>>>>>>> repository and then rely on rebar to fetch each dependency.
>>>>>>> Traditionally in CouchDB land we've always just included the source
>>>>>>> to all applications in a single monolithic repository and
>>>>>>> periodically reimported changes from upstream dependencies.
>>>>>>>
>>>>>>> Recently rcouch changed from the monolithic repository to use
>>>>>>> external repositories for some dependencies. Originally BigCouch
>>>>>>> used an even more federated scheme that had each Erlang application
>>>>>>> in an external repository (and the core couch Erlang application
>>>>>>> was in the root repository). When Bob Newson and I did the initial
>>>>>>> hacking on the BigCouch merge we pulled those external dependencies
>>>>>>> into the root repository, reverting back to the large monolithic
>>>>>>> approach.
>>>>>>>
>>>>>>> After trying to deal with the merge and contemplating how various
>>>>>>> Erlang release things might work it's become fairly apparent that
>>>>>>> the monolithic approach is a bit constrictive. For instance, part
>>>>>>> of rebar's versioning abilities lets you tag repositories to
>>>>>>> generate versions rather than manually updating versions in source
>>>>>>> files. Another thing I've found on other projects is that having
>>>>>>> each application in a separate repository requires developers to
>>>>>>> think in a bit more detail about the public internal interfaces
>>>>>>> used throughout the system. We've done some work to this extent
>>>>>>> already with separating source directories but forcing commits to
>>>>>>> multiple repositories shoots up a big red flag that maybe there's
>>>>>>> a high level of coupling between two bits of code.
>>>>>>>
>>>>>>> Another benefit of the multiple repository setup is that it could
>>>>>>> lend itself to being integrated with the proposed plugin system.
>>>>>>> It'd be fairly trivial to have a script that went and fetched
>>>>>>> plugins that aren't developed at Apache (as a ./configure time
>>>>>>> switch type of thing). Having a system like this would also allow
>>>>>>> groups focused on particular bits of development to not have to
>>>>>>> concern themselves with the unrelated parts of the system.
>>>>>>>
>>>>>>> Given all that, I'd like to propose that we move to having a
>>>>>>> repository for each application/dependency that we use to build
>>>>>>> CouchDB. Each repository would be hosted on ASF infra and mirrored
>>>>>>> to GitHub as expected. This means that we could have the root
>>>>>>> repository be a simple repo that contains packaging/release/build
>>>>>>> stuff that would enable lots of the ideas offered on configurable
>>>>>>> types of release generation. I've included an initial list of
>>>>>>> repositories at the end of this email. It's basically just the apps
>>>>>>> that have been split out in either rcouch or bigcouch plus a few
>>>>>>> other bits from CouchDB master.
>>>>>>>
>>>>>>> I would also point out that even though our main repo would need
>>>>>>> to fetch other dependencies from the internet to build the final
>>>>>>> output, we fully intend that our release tarballs would *not* have
>>>>>>> this requirement. I.e., when we go to cut a release, part of the
>>>>>>> process the RM runs would be to pull all of those dependencies
>>>>>>> before creating a tarball that would be wholly self-contained.
>>>>>>> Given an apache-couchdb-x.y.z.tar.gz release file, there won't be a
>>>>>>> requirement to have access to the ASF git repos.
>>>>>>>
>>>>>>> I'm not entirely sure how controversial this is for anyone. For
>>>>>>> the most part the reactions I remember hearing were more concerned
>>>>>>> with whether the infrastructure team would allow us to use this
>>>>>>> sort of configuration. I looked yesterday and asked and apparently
>>>>>>> it's something we can request but as always we'll want to verify
>>>>>>> again if we have consensus to move in this direction.
>>>>>>>
>>>>>>> Anyone have comments or flames? Right now I'm just interested in
>>>>>>> feeling out what sort of (lack of?) consensus there is on such a
>>>>>>> change. If there's general consensus I'd think we'd do a vote in a
>>>>>>> couple weeks and if that passes then start on down this road for
>>>>>>> the two merge projects and then it would become part of master
>>>>>>> once those land (as opposed to doing this to master and then
>>>>>>> attempting to merge rcouch/bigcouch onto that somehow).
>>>>>>>
>>>>>>>
>>>>>>> This is a quick pass at listing what extra repositories I'd have
>>>>>>> created. Some of these applications only exist in the bigcouch
>>>>>>> and/or rcouch branches so that's where the unfamiliar application
>>>>>>> names are from. I'd also point out that the documentation and
>>>>>>> fauxton things are just on a whim in that we could decouple that
>>>>>>> development from the Erlang development. I can see arguments for
>>>>>>> and against those. I'm much less concerned about that aspect than
>>>>>>> the Erlang parts that are directly affected by rebar/Erlang
>>>>>>> conventions.
>>>>>>>
>>>>>>>   chttpd
>>>>>>>   config
>>>>>>>   couch
>>>>>>>   couch_collate
>>>>>>>   couch_dbupdates
>>>>>>>   couch_httpd
>>>>>>>   couch_index
>>>>>>>   couch_mrview
>>>>>>>   couch_plugins
>>>>>>>   couch_replicator
>>>>>>>   documentation
>>>>>>>   ddoc_cache
>>>>>>>   ets_lru
>>>>>>>   fabric
>>>>>>>   fauxton
>>>>>>>   ibrowse
>>>>>>>   jiffy
>>>>>>>   mem3
>>>>>>>   mochiweb
>>>>>>>   oauth
>>>>>>>   rebar
>>>>>>>   rexi
>>>>>>>   snappy
>>>>>>>   twig
>>>>>>
>>>>>>
>>>>>> I also contemplated this and I am generally +1 on this. And
>>>>>> definitely +1 to mirror them on the apache git if possible. I have
>>>>>> a couple of comments though.
>>>>>>
>>>>>> Initially I also had everything separated in its own source
>>>>>> repository. A year ago I merged the couchdb erlang applications back
>>>>>> into one core repo and put all the dependencies in the refuge
>>>>>> repository, or in the refuge CDN for the spidermonkey and ICU
>>>>>> sources.
>>>>>>
>>>>>> I merged the couchdb erlang applications back into one core repo
>>>>>> because they were a little too dependent on each other, especially
>>>>>> couch_httpd, couch_index and couch_mrview. These applications are
>>>>>> not yet able to stand by themselves.
>>>>>>
>>>>>> Imo if we split everything into its own app, then we should make
>>>>>> sure that couch_httpd can be used without couch_index and
>>>>>> couch_mrview (which means that "all_docs" is available in
>>>>>> couch_httpd). Also we should be able to just launch couch without
>>>>>> any of the above, and probably without the need of an ini. The
>>>>>> couch_query_server module thing is an interesting case. bigcouch is
>>>>>> also introducing `ddoc_cache`, and I am not sure why it is provided
>>>>>> as a standalone app. Does it mean it can be replaced by another
>>>>>> application eventually? Why not simply have it in the couch
>>>>>> application? Does it need to be updated separately?
>>>>>>
>>>>>> Also all our base applications should be namespaced correctly so
>>>>>> they will be strictly identified as erlang modules: "config" is too
>>>>>> generic, "ddoc_cache" too. Others are probably OK.
>>>>>>
>>>>>> There are probably other things that we could provide as apps:
>>>>>>
>>>>>> - couch_daemon,
>>>>>> - couch_js
>>>>>> - couch_external
>>>>>> - couch_stats
>>>>>> - couch_compaction_daemon
>>>>>> - couch_httpd_proxy
>>>>>>
>>>>>> Anyway, again I'm +1 for this move, I really think it's a good idea.
>>>>>>
>>>>>> - benoit
>>>>>
>>>>> I agree on most of this. Roughly I see three general points.
>>>>>
>>>>> First, deciding on whether some things are external deps is
>>>>> definitely up for discussion. Whether couch_mrview is a different
>>>>> app/repo is not necessarily clear cut. Personally I think I
>>>>> over-engineered couch_index which blurs the lines a bit. If I could
>>>>> wave a wand I'd have just couch_mrview and it'd be separate. More
>>>>> importantly I think the separate repos make these things more
>>>>> apparent. The fact we're discussing this sort of architecture thing
>>>>> is suggestive that it's forcing us to think a bit harder.
>>>>>
>>>>> Second is the aspect of composability. For instance the mrview thing
>>>>> to me is obviously a different repo precisely so a user could import
>>>>> couch (_core?) directly without requiring the SpiderMonkey
>>>>> dependency. The monolithic repo doesn't allow this without some very
>>>>> non-standard tooling.
>>>>>
>>>>> Thirdly, app naming is always a point of contention. The config name
>>>>> was actually a hot code upgrade concern. We couldn't reuse
>>>>> couch_config directly at the time. And Adam was also hopeful we
>>>>> could turn it into a useful non-specific config app.
>>>>>
>>>>> Fourthly, and related to secondly, we'll also want to look at
>>>>> splitting other apps out as necessary. The ones you listed I think
>>>>> aren't controversial; it's just that no one has done it yet. My list
>>>>> was purely what existed so far without attempting to carve things up
>>>>> more. I definitely agree we should carve more; I just wanted to cover
>>>>> consensus that carving is the right direction.
>>>>>
>>>>> Fifthly, I'm done typing on my phone. I'll fill in more thoughts tomorrow.
>>>>>
>>>



-- 
Noah Slater
https://twitter.com/nslater
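
For readers unfamiliar with the layout proposed in the quoted thread, the
following is a rough sketch of what a top-level rebar.config could look like
when each application lives in its own repository. It is not taken from the
CouchDB tree: it assumes rebar 2.x dependency syntax, and the repository URLs
and tag names are hypothetical placeholders chosen for illustration.

```erlang
%% rebar.config for a hypothetical top-level build repository.
%% Assumption: rebar 2.x deps syntax, {App, VsnRegex, Source}.
%% The URLs and tags below are placeholders, not real locations.
{deps, [
    %% Core application, pinned to a tag in its own repository.
    {couch, ".*",
        {git, "https://git.example.org/couchdb-couch.git", {tag, "0.1.0"}}},
    %% Clustering layer, also pinned to a tag.
    {fabric, ".*",
        {git, "https://git.example.org/couchdb-fabric.git", {tag, "0.1.0"}}},
    %% A third-party dependency tracked on a branch.
    {mochiweb, ".*",
        {git, "https://git.example.org/couchdb-mochiweb.git", {branch, "master"}}}
]}.
```

With a file along these lines, running `rebar get-deps` clones each listed
repository into the dependency directory before compilation. A release
manager could run that step before building the tarball, so the resulting
archive would not need network access.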
