couchdb-dev mailing list archives

From Benoit Chesneau <bchesn...@gmail.com>
Subject Re: [DISCUSS] Multiple Repositories for Erlang Apps and Dependencies
Date Fri, 17 Jan 2014 05:22:56 GMT
thanks!
On Jan 17, 2014 12:01 AM, "Paul Davis" <paul.joseph.davis@gmail.com> wrote:

> New repos are up: https://git-wip-us.apache.org/repos/asf?s=couchdb
>
> I'm gonna go through and initialize them with history from master or
> one of the bigcouch and rcouch branches as appropriate.
>
> On Thu, Jan 16, 2014 at 2:12 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> > Infrastructure ticket opened: https://issues.apache.org/jira/browse/INFRA-7203
> >
> > On Thu, Jan 16, 2014 at 1:42 PM, Jan Lehnardt <jan@apache.org> wrote:
> >>
> >> On 16 Jan 2014, at 20:42, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> >>
> >>> It doesn't appear that this is objectionable to anyone. Does anyone
> >>> have an objection to us having infra/me create these repos to use for
> >>> the bigcouch/rcouch merge work? This won't affect master or releases
> >>> until those merges finish.
> >>
> >> no objections.
> >>
> >> Jan
> >> --
> >>
> >>>
> >>> On Tue, Jan 14, 2014 at 11:02 PM, Paul J Davis
> >>> <paul.joseph.davis@gmail.com> wrote:
> >>>>
> >>>>
> >>>>> On Jan 14, 2014, at 8:37 PM, Benoit Chesneau <bchesneau@gmail.com> wrote:
> >>>>>
> >>>>> On Wed, Jan 15, 2014 at 12:22 AM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> >>>>>
> >>>>>> I've recently been having discussions about how to handle the
> >>>>>> repository configuration for various bits of CouchDB post-merge.
> >>>>>> The work that Benoit has been doing on the rcouch merge branch has
> >>>>>> also touched on this topic.
> >>>>>>
> >>>>>> The background for those unfamiliar is that the standard operating
> >>>>>> procedure for Erlang is to have a single Erlang application per
> >>>>>> repository and then rely on rebar to fetch each dependency.
> >>>>>> Traditionally in CouchDB land we've always just included the source
> >>>>>> of all applications in a single monolithic repository and
> >>>>>> periodically reimported changes from upstream dependencies.
> >>>>>>
> >>>>>> Recently rcouch changed from the monolithic repository to using
> >>>>>> external repositories for some dependencies. Originally BigCouch used
> >>>>>> an even more federated scheme that had each Erlang application in an
> >>>>>> external repository (and the core couch Erlang application was in the
> >>>>>> root repository). When Bob Newson and I did the initial hacking on
> >>>>>> the BigCouch merge we pulled those external dependencies into the
> >>>>>> root repository, reverting to the large monolithic approach.
> >>>>>>
> >>>>>> After trying to deal with the merge and contemplating how various
> >>>>>> Erlang release things might work, it's become fairly apparent that
> >>>>>> the monolithic approach is a bit constrictive. For instance, part of
> >>>>>> rebar's versioning abilities lets you tag repositories to generate
> >>>>>> versions rather than manually updating versions in source files.
> >>>>>> Another thing I've found on other projects is that having each
> >>>>>> application in a separate repository requires developers to think in
> >>>>>> a bit more detail about the public internal interfaces used
> >>>>>> throughout the system. We've done some work to this end already by
> >>>>>> separating source directories, but forcing commits to multiple
> >>>>>> repositories shoots up a big red flag that maybe there's a high
> >>>>>> level of coupling between two bits of code.
> >>>>>>
> >>>>>> Another benefit of the multiple repository setup is that it may
> >>>>>> lend itself to being integrated with the proposed plugin system.
> >>>>>> It'd be fairly trivial to have a script that went and fetched
> >>>>>> plugins that aren't developed at Apache (as a ./configure time
> >>>>>> switch type of thing). Having a system like this would also allow
> >>>>>> groups focused on particular bits of development to not have to
> >>>>>> concern themselves with the unrelated parts of the system.
> >>>>>>
> >>>>>> Given all that, I'd like to propose that we move to having a
> >>>>>> repository for each application/dependency that we use to build
> >>>>>> CouchDB. Each repository would be hosted on ASF infra and mirrored
> >>>>>> to GitHub as expected. This means that we could have the root
> >>>>>> repository be a simple repo that contains packaging/release/build
> >>>>>> stuff that would enable lots of the ideas offered on configurable
> >>>>>> types of release generation. I've included an initial list of
> >>>>>> repositories at the end of this email. It's basically just the apps
> >>>>>> that have been split out in either rcouch or bigcouch plus a few
> >>>>>> other bits from CouchDB master.
> >>>>>>
> >>>>>> I would also point out that even though our main repo would need to
> >>>>>> fetch other dependencies from the internet to build the final
> >>>>>> output, we fully intend that our release tarballs would *not* have
> >>>>>> this requirement. I.e., when we go to cut a release, part of the
> >>>>>> process the RM runs would be to pull all of those dependencies
> >>>>>> before creating a tarball that is wholly self-contained. Given an
> >>>>>> apache-couchdb-x.y.z.tar.gz release file, there won't be a
> >>>>>> requirement to have access to the ASF git repos.
> >>>>>>
> >>>>>> I'm not entirely sure how controversial this is for anyone. For the
> >>>>>> most part the reactions I remember hearing were more concerned with
> >>>>>> whether the infrastructure team would allow us to use this sort of
> >>>>>> configuration. I looked yesterday and asked, and apparently it's
> >>>>>> something we can request, but as always we'll want to verify again
> >>>>>> if we have consensus to move in this direction.
> >>>>>>
> >>>>>> Anyone have comments or flames? Right now I'm just interested in
> >>>>>> feeling out what sort of (lack of?) consensus there is on such a
> >>>>>> change. If there's general consensus I'd think we'd do a vote in a
> >>>>>> couple weeks and if that passes then start on down this road for the
> >>>>>> two merge projects and then it would become part of master once
> >>>>>> those land (as opposed to doing this to master and then attempting
> >>>>>> to merge rcouch/bigcouch onto that somehow).
> >>>>>>
> >>>>>>
> >>>>>> This is a quick pass at listing what extra repositories I'd have
> >>>>>> created. Some of these applications only exist in the bigcouch
> >>>>>> and/or rcouch branches so that's where the unfamiliar application
> >>>>>> names are from. I'd also point out that the documentation and
> >>>>>> fauxton things are just on a whim in that we could decouple that
> >>>>>> development from the erlang development. I can see arguments for and
> >>>>>> against those. I'm much less concerned about that aspect than about
> >>>>>> the Erlang parts that are directly affected by rebar/Erlang
> >>>>>> conventions.
> >>>>>>
> >>>>>>   chttpd
> >>>>>>   config
> >>>>>>   couch
> >>>>>>   couch_collate
> >>>>>>   couch_dbupdates
> >>>>>>   couch_httpd
> >>>>>>   couch_index
> >>>>>>   couch_mrview
> >>>>>>   couch_plugins
> >>>>>>   couch_replicator
> >>>>>>   documentation
> >>>>>>   ddoc_cache
> >>>>>>   ets_lru
> >>>>>>   fabric
> >>>>>>   fauxton
> >>>>>>   ibrowse
> >>>>>>   jiffy
> >>>>>>   mem3
> >>>>>>   mochiweb
> >>>>>>   oauth
> >>>>>>   rebar
> >>>>>>   rexi
> >>>>>>   snappy
> >>>>>>   twig
> >>>>>
> >>>>>
> >>>>> I also contemplated this and I am generally +1 on this. And
> >>>>> definitely +1 to mirror them on the apache git if possible. I have a
> >>>>> couple of comments though.
> >>>>>
> >>>>> Initially I also had everything separated in its own source
> >>>>> repository. A year ago I merged the couchdb erlang applications back
> >>>>> into one core repo and put all the dependencies in the refuge
> >>>>> repository, or in the refuge CDN for the spidermonkey and ICU
> >>>>> sources.
> >>>>>
> >>>>> I merged the couchdb erlang applications back into one core repo
> >>>>> because they were a little too dependent on each other, especially
> >>>>> couch_httpd, couch_index and couch_mrview. These applications do not
> >>>>> yet stand on their own.
> >>>>>
> >>>>> Imo if we split everything into their own apps, then we should make
> >>>>> sure that couch_httpd can be used without couch_index and
> >>>>> couch_mrview (which means that "all_docs" is available in
> >>>>> couch_httpd). Also we should be able to just launch couch without any
> >>>>> of the above, and probably without the need for an ini. The
> >>>>> couch_query_server module thing is an interesting case. bigcouch is
> >>>>> also introducing `ddoc_cache`, and I am not sure why it is provided
> >>>>> as a standalone app. Does it mean it can be replaced by another
> >>>>> application eventually? Why not simply have it in the couch
> >>>>> application? Does it need to be updated separately?
> >>>>>
> >>>>> Also all our base applications should be namespaced correctly so
> >>>>> they will be strictly identified as erlang modules: "config" is too
> >>>>> generic, "ddoc_cache" too. Others are probably OK.
> >>>>>
> >>>>> There are probably other things that we could provide as apps:
> >>>>>
> >>>>> - couch_daemon,
> >>>>> - couch_js
> >>>>> - couch_external
> >>>>> - couch_stats
> >>>>> - couch_compaction_daemon
> >>>>> - couch_httpd_proxy
> >>>>>
> >>>>> Anyway, again I'm +1 for this move, I really think it's a good idea.
> >>>>>
> >>>>> - benoit
> >>>>
> >>>> I agree on most of this. Roughly I see three general points.
> >>>>
> >>>> First, deciding on whether some things are external deps is
> >>>> definitely up for discussion. Whether couch_mrview is a different
> >>>> app/repo is not necessarily clear cut. Personally I think I
> >>>> over-engineered couch_index, which blurs the lines a bit. If I could
> >>>> wave a wand I'd have just couch_mrview and it'd be separate. More
> >>>> importantly I think the separate repos make these things more
> >>>> apparent. The fact we're discussing this sort of architecture thing
> >>>> is suggestive that it's forcing us to think a bit harder.
> >>>>
> >>>> Second is the aspect of composability. For instance the mrview thing
> >>>> to me is obviously a different repo precisely so a user could import
> >>>> couch (_core?) directly without requiring the spider monkey
> >>>> dependency. The monolithic repo doesn't allow this without some very
> >>>> non-standard tooling.
> >>>>
> >>>> Thirdly, app naming is always contentious. The config name was
> >>>> actually a hot code upgrade concern. We couldn't reuse couch_config
> >>>> directly at the time. And Adam was also hopeful we could turn it into
> >>>> a useful non-specific config app.
> >>>>
> >>>> Fourthly, and related to secondly, we'll also want to look at
> >>>> splitting other apps out as necessary. The ones you listed I think
> >>>> aren't controversial; it's just that no one has done it yet. My list
> >>>> was purely what existed so far without attempting to carve things up
> >>>> more. I definitely agree we should carve more; I just wanted to cover
> >>>> consensus that carving is the right direction.
> >>>>
> >>>> Fifthly, I'm done typing on my phone. I'll fill in more thoughts
> >>>> tomorrow.
> >>>>
> >>
>
