couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Adam Kocoloski <kocol...@apache.org>
Subject Re: improve the bigcouch and rcouch merges
Date Wed, 22 Jan 2014 02:55:20 GMT
Agree, great discussion here.  I will say that I think the INI configuration system is quite
nice, and that I'd love to see couch_config / config gain wider adoption as an alternative
to using OTP application environments (of course we should still make the INI file itself
optional by embedding the defaults in the code).  I'd also say that the fabric.erl interface
has a number of warts that I'd love to clean up sometime.

In terms of priorities, I agree with Paul's summary:

> I think we should be focusing on minimizing the amount of effort needed to make the merge
work. I think all of your proposals here are solid and something we should pursue but I think
they would make the merge more complicated rather than simpler.

The sooner we can get some of these repos into master the sooner we can establish well-scoped
topic branches for Benoit's great recommendations.

Adam

On Jan 21, 2014, at 3:07 PM, Russell Branca <chewbranca@apache.org> wrote:

> Lot of great ideas in this thread. A number of the suggested items are
> feature and refactoring work, so I agree with Paul that we should separate
> those from the merges themselves, and concentrate on them after we've got
> the base pieces merged.
> 
> Fabric works well for an internal API on the clustered bits. It could use a
> little work to make it more isolated from chttpd so that it's more directly
> usable. I think it makes sense to discuss making a more uniform API that
> provides similar functionality for clustered vs non clustered operations,
> but this is outside the scope of the merge.
> 
> As Paul mentioned, I've got a number of ideas for ripping out auth into an
> isolated application that handles #user_ctx, roles, and user CRUD, which I
> want to dive into more after the merges.
> 
> 
> -Russell
> 
> 
> On Tue, Jan 21, 2014 at 5:41 AM, Robert Samuel Newson <rnewson@apache.org>wrote:
> 
>> 
>> To follow on from Paul’s responses, we will be deleting fabric_rpc2 before
>> the merge. That exists only temporarily at Cloudant to ease upgrades, as
>> Paul notes, it has no place in the final source tree.
>> 
>> B.
>> 
>> On 21 Jan 2014, at 11:39, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>> 
>>> Good points all around. Replying in line for context.
>>> 
>>> On Tue, Jan 21, 2014 at 2:33 AM, Benoit Chesneau <bchesneau@gmail.com>
>> wrote:
>>>> Hi all,
>>>> 
>>>> I was reviewing the bigcouch and the rcouch branch this morning and I
>>>> think it is about time to start a real merge. Waiting that each merges
>>>> reach a working version before starting any work on the final product
>>>> is quite unproductive to say less.
>>>> 
>>>> On the contrary what I see in the current branches let me think, it is a
>>>> good time to start some work to improve our code base in view of
>>>> achieving the 3 main goals we fixed a long time ago during the summit
>>>> and that were confirmed in December last year, ie.:
>>>> 
>>>> - add a cluster facility to Apache CouchdDB
>>>> - allows Apache CouchDB to be embedded in other Erlang applications
>>>> (just like mnesia the standard database library in Erlang)
>>>> - make Apache CouchDB more *OTP*ish
>>>> 
>>>> The changes in bigcouch are indeed quite more monolithic than the rcouch
>>>> changes by adapting Apache Couchdb to be able to run in a cluster
>>>> environment. While the cluster management part is quite isolated (mem3,
>>>> rexi, chttpd), others parts of bigcouch are wrapping or modifying the
>>>> apache couchdb internals:
>>>> 
>>>> - fabric is adding fabric_rpc and fabric_rpc2 wrapping some  modules and
>>>> functions in the couchdb core to be able to call them on the cluster.
>>>> It should be noted that the cluster part is using fabric to do all calls
>>>> to each couchdb nodes on the cluster and return/merge the results.
>>> 
>>> For clarification the duplication of these modules is a side effect of
>>> running hot code upgrades. Conceptually fabric_rpc2 is the same as
>>> fabric_rpc, we just use the second module to do RPC API upgrades on
>>> live clusters. Not something to worry about extensively.
>>> 
>>>> - some changes has been done to optimise the core: removing ref
>>>> counting, adding ets_lru to handle caching, db initialization has been
>>>> changed to look at the cluster to fetch ddocs to set validate funcs.
>>> 
>>> The caching doesn't actually affect core. Its only used for clustered
>>> calls at the moment. The big thing here is definitely
>>> validate_doc_update functions being loaded lazily and the fact that
>>> single node code has to run clustered calls. I definitely agree that
>>> better internal APIs could probably clean this up.
>>> 
>>>> - couch_replicator has been modified to work in a cluster
>>>> - bigcouch also changed the way the configuration is handled in couchdb,
>>>> making it more evented. Which is better than the current version but
>>>> still rely on an INI file.
>>> 
>>> We did change the API to enforce some basic patterns to avoid common
>>> pitfalls in hot code upgrades but the fundamental operation is still
>>> the same. Unfortunately, as you point out, this doesn't change our
>>> reliance on INI files.
>>> 
>>>> - logs are now handled by twig a specific application created by
>>>> Cloudant for that purpose.
>>>> 
>>> 
>>> We're looking to switch to lager as well. Twig was written pre-Lager
>>> and we haven't made the switch because it hasn't been a priority yet.
>>> We've got Lager queued up as a task to complete before the final
>>> merge.
>>> 
>>>> 
>>>> rcouch changes consist mostly in the following:
>>>> 
>>>> - making Apache CouchDB an OTP release. It also extracts some code and
>>>> revisits the supervision like  couch_httpd to transform it as a
>>>> standalone application, and make couch_replicator a full app
>>> 
>>> For the record I removed all of the BigCouch stuff related to Erlang
>>> release handling because we started the merge when autotools was still
>>> a requirement. We do Erlang releases exclusively at Cloudant and
>>> switching to this is now going to be part of the merge. We do have
>>> some differences in some of the build tools but the rcouch build
>>> system isn't significantly different than our internal builds. We'll
>>> switch to something much more rcouch like before the merge.
>>> 
>>>> - add some features as standalone Erlang apps
>>>> - adding view changes: which edit couch_mrview, couch_index,
>>>> couch_replicator by adding new features without changing the current
>>>> one or internals
>>>> - changes the core by adding some optimizations and new features: remove
>>>> ref counting, add new caching, add validate_doc_read. Actually
>>>> these optimizations have not been merge in view of using those in
>> bigcouch.
>>>> The collation has been replaced by a nif available as a standalone
>>>> Erlang application couch_collate.
>>> 
>>> Some of these I've skimmed and others I haven't spent any time looking
>>> at. The couch_collate change isn't a huge issue. I may suggest some
>>> optimizations in the future but the general idea is solid. The caching
>>> and validate_doc_read changes I'll have to review when I see them in
>>> commits. I didn't notice anything related on 1994-merge-rcouch when I
>>> reviewed it last week but I was mostly skimming for major things.
>>> 
>>>> - A lot of changes have been done in the build process
>>> 
>>> Quite a lot. We're both rebar centric so I'm not too concerned. I'd
>>> like to see a bit more thought on some of the ICU and SpiderMonkey
>>> aspects but nothing here is overly controversial I don't think. Maybe
>>> some UI things could use some polish.
>>> 
>>>> - logs are handled by lager an application created by Basho commonly
>>>> used these days in the Erlang world.
>>>> 
>>> 
>>> +1
>>> 
>>>> So rcouch and bigcouch are conflicting on the following right now:
>>>> 
>>>> - couch_replicator
>>> 
>>> I'm not sure I've seen what the actual changes in rcouch are. I didn't
>>> look too hard, but for the most part Cloudant changes are bug fixes
>>> and making the _replicator db work in a cluster. I wouldn't think
>>> there'd be major conflicts here.
>>> 
>>>> - couch_index, couch_mrview
>>> 
>>> This isn't a conflict because we don't use these internally yet. Work
>>> to start using this is queued up to happen in the next few weeks.
>>> 
>>>> - internals: some parts of the code. the collation, jiffy changes are
>>>> not the same, ... .
>>> 
>>> Collation and Jiffy are trivial changes. Jiffy I actually pulled out
>>> of BigCouch cause ejson was still a thing and I was trying to minimize
>>> the change. couch_collate looks mostly fine to me so I'm not overly
>>> concerned about that. The ref counting changes fou
>>> 
>>>> - build process
>>>> - logging system
>>>> 
>>>> 
>>>> The goal of making couchdb embedabble in another Erlang application
>>>> is not achieved in bigcouch and still difficult with rcouch.
>>>> 
>>>> - Difficult with rcouch because the configuration is actually strongly
>>>> based on settings passed via an ini file.
>>>> - Difficult with bigcouch  because of the internal changes implying the
>>>> usage by default of a cluster (like loading ddocs at initialization).
>>>> Also it has the same problem with the ini files.
>>>> - Both handle authorization at the core level instead of handling
>>>> it in a different layer mixing code between the internal api and the
>>>> HTTP API.
>>>> 
>>>> If we are able to embed the couchdb core in any Erlang application, it
>>>> will considerably help us to merge bigcouch and rcouch quietly, easily,
>>>> with people working in parallel to achieve it. Building a plugin will be
>>>> a lot easier as well. The good thing is that both projects have the
>>>> roots to do it, it is a matter to merge some code and refactor some
>>>> internals:
>>>> 
>>> 
>>> I mostly agree with these comments. The reliance on INI files
>>> definitely prevents the ease of embedding the core storage engine.
>>> 
>>> One thing, you misread the code for the default clustering a bit. The
>>> cluster assumption is only for database names that follow the pattern
>>> "shards/00000000-ffffffff/dbname.couch". Anything not following that
>>> pattern works just fine locally. Remember that we have the entire
>>> CouchDB single node test suite passing (minus a specific test or two
>>> for known reasons). We also use the single node APIs quite a bit for
>>> administration so its in our interest to make sure that continues to
>>> work.
>>> 
>>> Definitely agree on the authorization issues. Russel Branca has
>>> recently been suggesting that we undo the core ties to user contexts
>>> for other reasons. I don't think this is an "if" so much as a "when"
>>> question.
>>> 
>>>> - Have a clean internal API. Fabric the cluster library of bigcouch is
>>>> adding 2  modules wrapping the couchdb API  (couch_db, couch_stream,
>>>> ...) `fabric_rpc` and `fabric_rpc2`.  couch_httpd* modules are also
>>>> wrapping it to send the result to the HTTP clients. Both are adding
>>>> a layer over the internals just to be able to use the core which is
>> really
>>>> inefficient. It forces Erlang application that want to call directly
>>>> couchdb (like couchc [1]) to also wrap these internals to make them
>>>> useful. We should on the contrary offer a clean API at the core level
>>>> usable by fabric or any Erlang App. The API offered by the `fabric_rpc*`
>>>> or `couchc` should be our internals.
>>>> 
>>> 
>>> I'm not immediately familiar with couchc but fabric.erl mostly
>>> provides a clean API for everything needed for normal client access.
>>> There have been general discussions on writing a single node version
>>> of that module for exactly the same reasons you state. For instance,
>>> it'd be nice to have either chttpd or couch_httpd. Having both is just
>>> a waste. If we had two modules that would switch back and forth
>>> between single node vs. clustered would make that significantly
>>> easier.
>>> 
>>>> So I propose as a first merge action to refactor the internal API so
>>>> it can be used directly by fabric and future Erlang applications
>>>> embedding or calling directly couchdb.
>>>> 
>>> 
>>> I disagree with this priority. We've already got enough going on and
>>> there's really not that much conflict (so far as I can tell) with
>>> internal APIs. Adding a "fix the thing we haven't fixed in years" step
>>> sounds like a recipe for disaster. This is definitely something we
>>> should do going forward I just don't think its a precondition for the
>>> merge work.
>>> 
>>>> - Make the auth* apart - remove all `#user_ctx{}` calls from
>>>> the couch_* modules- to handle the it in the transport or application
>>>> level. HTTP for now.By doing that we make the `couch` application
>>>> completely independent of the transport, so any erlang app can embed it
>>>> and propose its own API to enter the data. (just like mnesia). It will
>>>> also remove all the extra code we have to force the authorization by
>>>> faking the user context when it needs to use the internal api. The
>>>> auth* should be provided as an extension imo.
>>>> 
>>> 
>>> I agree and Russel has ideas for things quite like this. Again not
>>> something I think should be a precondition for the merge. (Not sure if
>>> you're proposing the precondition part here though).
>>> 
>>>> - removing validate_* initialization in the core db level and let it
>>>> to the transport or the application. For example it could be done when
>>>> opening the DB via the HTTP api by wrapping the open_db call. Some
>>>> applications may not need it at all. The core should still provide
>>>> hooks to do these actions, but the way the hooks are passed to the
>>>> database should be an application decision. couch_httpd is an
>>>> application exposing the couchdB API to other via HTTP.
>>>> 
>>> 
>>> I'll have to look closer at validate_doc_read to get an idea on this.
>>> At first I also think this is also in the "good idea but not a merge
>>> requirement" but that may change when reading the validate_doc_read
>>> implementation. We specifically rely on not requiring the functions
>>> for reads so I'll have to think harder on this.
>>> 
>>>> - adding special databases hooks only on the application level. Right
>>>> now we have specific databases handled in the core code (`_users` and
>>>> and `_replicator`). We should instead pass this hook on another level,
>>>> like when we initialize the couch_replicator application for example.
>>>> 
>>> 
>>> I'm not sure what you mean here. I 100% agree that we should aim to
>>> remove the special logic for these databases in core. I'd prefer to
>>> see their "database likenesses" removed entirely but that's a
>>> different story for a different thread.
>>> 
>>>> - making the configuration created by an INI file optional. We could
>> have
>>>> a default configuration set via the API that can be feed by later by
>>>> the ini file or any other config system.
>>>> 
>>> 
>>> I definitely agree that there's config improvements to be made. One of
>>> the things I'd like to see is defaults in code along with the
>>> possibility for enforcing type constraints (ie, reject updates that
>>> change an integer to a string or vice versa).
>>> 
>>>> - merging any optimizations at the same time (ref counting replacement).
>>>> 
>>> 
>>> These sorts of things I think we'll just have to look at on a
>>> case-by-case basis.  The fact that both rcouch and BigCouch needed to
>>> rewrite these bits shows that its an issue. Its just a matter of
>>> making sure that the merge supports all the required use cases.
>>> 
>>>> 
>>>> Imo  if we do the work above it will allows us to speed the merge of
>>>> both rcouch and bigcouch. The second step would be having the fabric api
>>>> to use this clean api, and modifying couch_htpd and couch_replicator to
>>>> use it. Bonus point, it would ease unitesting by isolating the cluster
>>>> part from the core.
>>>> 
>>> 
>>> While there are definitely some warts on the clustering APIs I would
>>> argue that 95% of the clustering code is isolated from core. There's
>>> definitely a lot more logic in fabric for clustering than the few
>>> places we need to call fabric from core. Though I totally agree that
>>> having 100% of core not care about fabric would be ideal.
>>> 
>>> On the other hand, I disagree that any of these as a precondition for
>>> merging would speed the merge. They're all things we should be working
>>> towards and definitely things we should be thinking about while
>>> working on the merge, but nothing in here strikes me as a road block
>>> to merging. They help us clean up lots of things post-merge like
>>> having two extremely close HTTP layers but while that duplication
>>> stinks to high heavens, its worked well enough for 6+ years of
>>> production deployments.
>>> 
>>>> I probably forgot some features that could also be done during the
>>>> merge, like merging couch_index and couch_mrview in one app or at least
>>>> having `_all_docs` handled by couch_httpd but these changes can be done
>>>> at another level imo. Also the `_all_dbs` is handled differently, in
>>>> bigcouh it is handled by recording all the dbs in a database, in couchdb
>>>> we are right now only relying only on the fs. The bigcouch solution
>>>> should be the default imo.
>>>> 
>>> 
>>> These are also good things to consider. I'm definitely +1 on the
>>> couch_index/couch_mrview refactoring as well as moving _all_docs
>>> around. That said its definitely not a priority for the merge. We have
>>> two major forks to merge back into mainline. I'd prefer we focus on
>>> the minimal amount of effort required to make that happen so that we
>>> can then focus on some of these other bigger things.
>>> 
>>> RIght now I think the focus should be exclusively on getting the three
>>> code bases merged back to master in a fashion that allows us to start
>>> iterating on all of the things you've listed. That means taking the IP
>>> cleared code we have and doing the legwork to get it all onto master
>>> in such a fashion that we can continue making progress. Once that's
>>> done I think we can start moving towards making changes to simplify
>>> internals and speed development.
>>> 
>>> Or to phrase that slightly differently, I think we should be focusing
>>> on minimizing the amount of effort needed to make the merge work. I
>>> think all of your proposals here are solid and something we should
>>> pursue but I think they would make the merge more complicated rather
>>> than simpler.
>>> 
>>>> 
>>>> Anyway, I wish we can start this work ASAP rather than just working on 2
>>>> branches in parallel.
>>>> 
>>> 
>>> I definitely agree. I think the multi-repo approach will help us focus
>>> on what's actually in conflict and then we can move towards resolving
>>> those changes. I know the code bases look quite different at the
>>> moment but from all that I've read so far, these are mostly
>>> superficial issues.
>>> 
>>> For instance, the rcouch build system is significantly different than
>>> the bigcouch version but that's due to me wiring the bigcouch build
>>> into autotools. Here we'll mostly default towards the rcouch system.
>>> We'll have to iron out some details but its nothing insurmountable.
>>> The index work in rcouch will mostly be pulled directly into bigcouch.
>>> At the merge we may not have 100% support for the newer features but
>>> that just means we may cut another single node release before we cut
>>> one that has API parity between clustered and single node.
>>> 
>>>> Any feedback is welcome,
>>>> 
>>>> - benoit
>>>> 
>>>> [1] http://github.com/benoitc/couchc
>> 
>> 


Mime
View raw message