couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benoit Chesneau <bchesn...@gmail.com>
Subject Re: improve the bigcouch and rcouch merges
Date Wed, 22 Jan 2014 11:12:59 GMT
HI Paul,

Thanks for the clarifications.  I have some questions related to it:

- about the replicator_changes, do you mean it could right now replace the
current version we have without any more changes? (except those related to
twig).
- About fabric, and db name ("shards/.."), does it measn that fabric could
eventually be used for a standalone node?

fabric_rpc and couchc are quite similar but couchc was designed mostly as a
replacement of the non maintained hovercraft with the couchbeam API in mind.

About the priority, I agrree we don't need to have a really clean API at
first, it may takes a little time to have it. I proposed this priority so
we can continue to work in parallel on differents features which would be a
lot easier if we had a clean API. But we can work differently. For example
we can start by the following:

- integrate lager ( i will add it to the rcouch branch today)
- integrate couch_replicator
- make sure that the couch application in bigcouch has no cluster
dependencies (the ddoc loading for validate_update functions)

What do you think?

However I am thinking that the changes to the way the configuration is
handled, the AUTH and the way we pass the database hooks for validate
update functions, __replicator, _users on read and on update hooks should
be done ASAP. So we can have a clean and standalone couch application when
we release. I can work on this with Russel and I don't think the changes
are so big. I will create the tickets to track the changes asap.

- benoit







On Tue, Jan 21, 2014 at 12:39 PM, Paul Davis <paul.joseph.davis@gmail.com>wrote:

> Good points all around. Replying in line for context.
>
> On Tue, Jan 21, 2014 at 2:33 AM, Benoit Chesneau <bchesneau@gmail.com>
> wrote:
> > Hi all,
> >
> > I was reviewing the bigcouch and the rcouch branch this morning and I
> > think it is about time to start a real merge. Waiting that each merges
> > reach a working version before starting any work on the final product
> > is quite unproductive to say less.
> >
> > On the contrary what I see in the current branches let me think, it is a
> > good time to start some work to improve our code base in view of
> > achieving the 3 main goals we fixed a long time ago during the summit
> > and that were confirmed in December last year, ie.:
> >
> > - add a cluster facility to Apache CouchdDB
> > - allows Apache CouchDB to be embedded in other Erlang applications
> >   (just like mnesia the standard database library in Erlang)
> > - make Apache CouchDB more *OTP*ish
> >
> > The changes in bigcouch are indeed quite more monolithic than the rcouch
> > changes by adapting Apache Couchdb to be able to run in a cluster
> > environment. While the cluster management part is quite isolated (mem3,
> > rexi, chttpd), others parts of bigcouch are wrapping or modifying the
> > apache couchdb internals:
> >
> > - fabric is adding fabric_rpc and fabric_rpc2 wrapping some  modules and
> >   functions in the couchdb core to be able to call them on the cluster.
> > It should be noted that the cluster part is using fabric to do all calls
> > to each couchdb nodes on the cluster and return/merge the results.
>
> For clarification the duplication of these modules is a side effect of
> running hot code upgrades. Conceptually fabric_rpc2 is the same as
> fabric_rpc, we just use the second module to do RPC API upgrades on
> live clusters. Not something to worry about extensively.
>
> > - some changes has been done to optimise the core: removing ref
> >   counting, adding ets_lru to handle caching, db initialization has been
> > changed to look at the cluster to fetch ddocs to set validate funcs.
>
> The caching doesn't actually affect core. Its only used for clustered
> calls at the moment. The big thing here is definitely
> validate_doc_update functions being loaded lazily and the fact that
> single node code has to run clustered calls. I definitely agree that
> better internal APIs could probably clean this up.
>
> > - couch_replicator has been modified to work in a cluster
> > - bigcouch also changed the way the configuration is handled in couchdb,
> >   making it more evented. Which is better than the current version but
> > still rely on an INI file.
>
> We did change the API to enforce some basic patterns to avoid common
> pitfalls in hot code upgrades but the fundamental operation is still
> the same. Unfortunately, as you point out, this doesn't change our
> reliance on INI files.
>
> > - logs are now handled by twig a specific application created by
> >   Cloudant for that purpose.
> >
>
> We're looking to switch to lager as well. Twig was written pre-Lager
> and we haven't made the switch because it hasn't been a priority yet.
> We've got Lager queued up as a task to complete before the final
> merge.
>
> >
> > rcouch changes consist mostly in the following:
> >
> > - making Apache CouchDB an OTP release. It also extracts some code and
> >   revisits the supervision like  couch_httpd to transform it as a
> > standalone application, and make couch_replicator a full app
>
> For the record I removed all of the BigCouch stuff related to Erlang
> release handling because we started the merge when autotools was still
> a requirement. We do Erlang releases exclusively at Cloudant and
> switching to this is now going to be part of the merge. We do have
> some differences in some of the build tools but the rcouch build
> system isn't significantly different than our internal builds. We'll
> switch to something much more rcouch like before the merge.
>
> > - add some features as standalone Erlang apps
> > - adding view changes: which edit couch_mrview, couch_index,
> >   couch_replicator by adding new features without changing the current
> > one or internals
> > - changes the core by adding some optimizations and new features: remove
> >   ref counting, add new caching, add validate_doc_read. Actually
> > these optimizations have not been merge in view of using those in
> bigcouch.
> > The collation has been replaced by a nif available as a standalone
> > Erlang application couch_collate.
>
> Some of these I've skimmed and others I haven't spent any time looking
> at. The couch_collate change isn't a huge issue. I may suggest some
> optimizations in the future but the general idea is solid. The caching
> and validate_doc_read changes I'll have to review when I see them in
> commits. I didn't notice anything related on 1994-merge-rcouch when I
> reviewed it last week but I was mostly skimming for major things.
>
> > - A lot of changes have been done in the build process
>
> Quite a lot. We're both rebar centric so I'm not too concerned. I'd
> like to see a bit more thought on some of the ICU and SpiderMonkey
> aspects but nothing here is overly controversial I don't think. Maybe
> some UI things could use some polish.
>
> > - logs are handled by lager an application created by Basho commonly
> >   used these days in the Erlang world.
> >
>
> +1
>
> > So rcouch and bigcouch are conflicting on the following right now:
> >
> > - couch_replicator
>
> I'm not sure I've seen what the actual changes in rcouch are. I didn't
> look too hard, but for the most part Cloudant changes are bug fixes
> and making the _replicator db work in a cluster. I wouldn't think
> there'd be major conflicts here.
>
> > - couch_index, couch_mrview
>
> This isn't a conflict because we don't use these internally yet. Work
> to start using this is queued up to happen in the next few weeks.
>
> > - internals: some parts of the code. the collation, jiffy changes are
> >   not the same, ... .
>
> Collation and Jiffy are trivial changes. Jiffy I actually pulled out
> of BigCouch cause ejson was still a thing and I was trying to minimize
> the change. couch_collate looks mostly fine to me so I'm not overly
> concerned about that. The ref counting changes fou
>
> > - build process
> > - logging system
> >
> >
> > The goal of making couchdb embedabble in another Erlang application
> > is not achieved in bigcouch and still difficult with rcouch.
> >
> > - Difficult with rcouch because the configuration is actually strongly
> > based on settings passed via an ini file.
> > - Difficult with bigcouch  because of the internal changes implying the
> >   usage by default of a cluster (like loading ddocs at initialization).
> > Also it has the same problem with the ini files.
> > - Both handle authorization at the core level instead of handling
> >   it in a different layer mixing code between the internal api and the
> > HTTP API.
> >
> > If we are able to embed the couchdb core in any Erlang application, it
> > will considerably help us to merge bigcouch and rcouch quietly, easily,
> > with people working in parallel to achieve it. Building a plugin will be
> > a lot easier as well. The good thing is that both projects have the
> > roots to do it, it is a matter to merge some code and refactor some
> > internals:
> >
>
> I mostly agree with these comments. The reliance on INI files
> definitely prevents the ease of embedding the core storage engine.
>
> One thing, you misread the code for the default clustering a bit. The
> cluster assumption is only for database names that follow the pattern
> "shards/00000000-ffffffff/dbname.couch". Anything not following that
> pattern works just fine locally. Remember that we have the entire
> CouchDB single node test suite passing (minus a specific test or two
> for known reasons). We also use the single node APIs quite a bit for
> administration so its in our interest to make sure that continues to
> work.
>
> Definitely agree on the authorization issues. Russel Branca has
> recently been suggesting that we undo the core ties to user contexts
> for other reasons. I don't think this is an "if" so much as a "when"
> question.
>
> > - Have a clean internal API. Fabric the cluster library of bigcouch is
> >   adding 2  modules wrapping the couchdb API  (couch_db, couch_stream,
> > ...) `fabric_rpc` and `fabric_rpc2`.  couch_httpd* modules are also
> > wrapping it to send the result to the HTTP clients. Both are adding
> > a layer over the internals just to be able to use the core which is
> really
> > inefficient. It forces Erlang application that want to call directly
> > couchdb (like couchc [1]) to also wrap these internals to make them
> > useful. We should on the contrary offer a clean API at the core level
> > usable by fabric or any Erlang App. The API offered by the `fabric_rpc*`
> > or `couchc` should be our internals.
> >
>
> I'm not immediately familiar with couchc but fabric.erl mostly
> provides a clean API for everything needed for normal client access.
> There have been general discussions on writing a single node version
> of that module for exactly the same reasons you state. For instance,
> it'd be nice to have either chttpd or couch_httpd. Having both is just
> a waste. If we had two modules that would switch back and forth
> between single node vs. clustered would make that significantly
> easier.
>
> > So I propose as a first merge action to refactor the internal API so
> > it can be used directly by fabric and future Erlang applications
> > embedding or calling directly couchdb.
> >
>
> I disagree with this priority. We've already got enough going on and
> there's really not that much conflict (so far as I can tell) with
> internal APIs. Adding a "fix the thing we haven't fixed in years" step
> sounds like a recipe for disaster. This is definitely something we
> should do going forward I just don't think its a precondition for the
> merge work.
>
> > - Make the auth* apart - remove all `#user_ctx{}` calls from
> >   the couch_* modules- to handle the it in the transport or application
> > level. HTTP for now.By doing that we make the `couch` application
> > completely independent of the transport, so any erlang app can embed it
> > and propose its own API to enter the data. (just like mnesia). It will
> > also remove all the extra code we have to force the authorization by
> > faking the user context when it needs to use the internal api. The
> > auth* should be provided as an extension imo.
> >
>
> I agree and Russel has ideas for things quite like this. Again not
> something I think should be a precondition for the merge. (Not sure if
> you're proposing the precondition part here though).
>
> > - removing validate_* initialization in the core db level and let it
> >   to the transport or the application. For example it could be done when
> > opening the DB via the HTTP api by wrapping the open_db call. Some
> > applications may not need it at all. The core should still provide
> > hooks to do these actions, but the way the hooks are passed to the
> > database should be an application decision. couch_httpd is an
> > application exposing the couchdB API to other via HTTP.
> >
>
> I'll have to look closer at validate_doc_read to get an idea on this.
> At first I also think this is also in the "good idea but not a merge
> requirement" but that may change when reading the validate_doc_read
> implementation. We specifically rely on not requiring the functions
> for reads so I'll have to think harder on this.
>
> > - adding special databases hooks only on the application level. Right
> >   now we have specific databases handled in the core code (`_users` and
> > and `_replicator`). We should instead pass this hook on another level,
> > like when we initialize the couch_replicator application for example.
> >
>
> I'm not sure what you mean here. I 100% agree that we should aim to
> remove the special logic for these databases in core. I'd prefer to
> see their "database likenesses" removed entirely but that's a
> different story for a different thread.
>
> > - making the configuration created by an INI file optional. We could have
> >   a default configuration set via the API that can be feed by later by
> > the ini file or any other config system.
> >
>
> I definitely agree that there's config improvements to be made. One of
> the things I'd like to see is defaults in code along with the
> possibility for enforcing type constraints (ie, reject updates that
> change an integer to a string or vice versa).
>
> > - merging any optimizations at the same time (ref counting replacement).
> >
>
> These sorts of things I think we'll just have to look at on a
> case-by-case basis.  The fact that both rcouch and BigCouch needed to
> rewrite these bits shows that its an issue. Its just a matter of
> making sure that the merge supports all the required use cases.
>
> >
> > Imo  if we do the work above it will allows us to speed the merge of
> > both rcouch and bigcouch. The second step would be having the fabric api
> > to use this clean api, and modifying couch_htpd and couch_replicator to
> > use it. Bonus point, it would ease unitesting by isolating the cluster
> > part from the core.
> >
>
> While there are definitely some warts on the clustering APIs I would
> argue that 95% of the clustering code is isolated from core. There's
> definitely a lot more logic in fabric for clustering than the few
> places we need to call fabric from core. Though I totally agree that
> having 100% of core not care about fabric would be ideal.
>
> On the other hand, I disagree that any of these as a precondition for
> merging would speed the merge. They're all things we should be working
> towards and definitely things we should be thinking about while
> working on the merge, but nothing in here strikes me as a road block
> to merging. They help us clean up lots of things post-merge like
> having two extremely close HTTP layers but while that duplication
> stinks to high heavens, its worked well enough for 6+ years of
> production deployments.
>
> > I probably forgot some features that could also be done during the
> > merge, like merging couch_index and couch_mrview in one app or at least
> > having `_all_docs` handled by couch_httpd but these changes can be done
> > at another level imo. Also the `_all_dbs` is handled differently, in
> > bigcouh it is handled by recording all the dbs in a database, in couchdb
> > we are right now only relying only on the fs. The bigcouch solution
> > should be the default imo.
> >
>
> These are also good things to consider. I'm definitely +1 on the
> couch_index/couch_mrview refactoring as well as moving _all_docs
> around. That said its definitely not a priority for the merge. We have
> two major forks to merge back into mainline. I'd prefer we focus on
> the minimal amount of effort required to make that happen so that we
> can then focus on some of these other bigger things.
>
> RIght now I think the focus should be exclusively on getting the three
> code bases merged back to master in a fashion that allows us to start
> iterating on all of the things you've listed. That means taking the IP
> cleared code we have and doing the legwork to get it all onto master
> in such a fashion that we can continue making progress. Once that's
> done I think we can start moving towards making changes to simplify
> internals and speed development.
>
> Or to phrase that slightly differently, I think we should be focusing
> on minimizing the amount of effort needed to make the merge work. I
> think all of your proposals here are solid and something we should
> pursue but I think they would make the merge more complicated rather
> than simpler.
>
> >
> > Anyway, I wish we can start this work ASAP rather than just working on 2
> > branches in parallel.
> >
>
> I definitely agree. I think the multi-repo approach will help us focus
> on what's actually in conflict and then we can move towards resolving
> those changes. I know the code bases look quite different at the
> moment but from all that I've read so far, these are mostly
> superficial issues.
>
> For instance, the rcouch build system is significantly different than
> the bigcouch version but that's due to me wiring the bigcouch build
> into autotools. Here we'll mostly default towards the rcouch system.
> We'll have to iron out some details but its nothing insurmountable.
> The index work in rcouch will mostly be pulled directly into bigcouch.
> At the merge we may not have 100% support for the newer features but
> that just means we may cut another single node release before we cut
> one that has API parity between clustered and single node.
>
> > Any feedback is welcome,
> >
> > - benoit
> >
> > [1] http://github.com/benoitc/couchc
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message