Reply-To: dev@couchdb.apache.org
Date: Fri, 31 May 2013 12:08:05 +0100
Subject: Re: IP clearance (Was: Re: [VOTE] Merge BigCouch)
From: Noah Slater
To: "dev@couchdb.apache.org"
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=047d7ba979780d304604de01a1bc

--047d7ba979780d304604de01a1bc
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hopping on to IRC.


On 31 May 2013 11:54, Robert Newson wrote:

> FYI: I've added the first version of couchdb-bigcouch.xml to the
> incubator site. I've been unable to publish it and I think I need some
> assistance from Noah or Jan to move on.
>
> B.
>
>
> On 16 May 2013 13:40, Noah Slater wrote:
> > You can take it up to step 5. I have to do step 6 and onwards.
> >
> >
> > On 16 May 2013 13:39, Robert Newson wrote:
> >
> >> nah, I'll do it straight.
> >>
> >> Can I do this? The docs say Officer or Member.
> >>
> >> On 16 May 2013 13:37, Noah Slater wrote:
> >> > git help svn
> >> >
> >> >
> >> > On 16 May 2013 13:13, Robert Newson wrote:
> >> >
> >> >> Righto. Now to remember how subversion works...
> >> >>
> >> >> On 15 May 2013 17:09, Noah Slater wrote:
> >> >> > Okay.
> >> >> >
> >> >> > Start here:
> >> >> >
> >> >> > http://incubator.apache.org/ip-clearance/
> >> >> >
> >> >> > Then make a copy of this file:
> >> >> >
> >> >> > http://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/ip-clearance-template.xml
> >> >> >
> >> >> > This file, when rendered to HTML will look like:
> >> >> >
> >> >> > http://incubator.apache.org/ip-clearance/ip-clearance-template.html
> >> >> >
> >> >> > In your local copy, cut everything from:
> >> >> >
> >> >> > -------8-<---- cut here-------8-<----
> >> >> >
> >> >> > To:
> >> >> >
> >> >> > -------8-<---- cut here-------8-<----
> >> >> >
> >> >> > Now, add your copy back to Subversion here:
> >> >> >
> >> >> > http://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/
> >> >> >
> >> >> > Call it "couchdb-bigcouch.xml".
> >> >> >
> >> >> > In a few minutes, this will appear here:
> >> >> >
> >> >> > http://incubator.apache.org/ip-clearance/couchdb-bigcouch.html
> >> >> >
> >> >> > Now, it should be a simple matter of going through the doc and
> >> >> > completing the checkpoints/sections.
> >> >> >
> >> >> > Here are the two previous ones we've done:
> >> >> >
> >> >> > http://incubator.apache.org/ip-clearance/couchdb-docs.html
> >> >> >
> >> >> > http://incubator.apache.org/ip-clearance/couchdb-fauxton.html
> >> >> >
> >> >> > Let me know if you get stuck on any of the checkpoints.
> >> >> >
> >> >> > Once you're done, let me know, and I will use my member karma to
> >> >> > push it through the Incubator.
> >> >> >
> >> >> > Benoit, you may as well start your rcouch stuff at the same time
> >> >> > using these instructions. Obviously, you should pick
> >> >> > "couchdb-rcouch.xml" instead. But other than that, it's the same
> >> >> > process.
> >> >> >
> >> >> > On 15 May 2013 16:24, Noah Slater wrote:
> >> >> >
> >> >> >> I can help! :)
> >> >> >>
> >> >> >>
> >> >> >> On 15 May 2013 16:23, Robert Newson wrote:
> >> >> >>
> >> >> >>> :)
> >> >> >>>
> >> >> >>> Jan, I think you said you'd help start the IP clearance bit?
> >> >> >>>
> >> >> >>> On 15 May 2013 15:03, Noah Slater wrote:
> >> >> >>> > PARTY TIME 🎉
> >> >> >>> >
> >> >> >>> >
> >> >> >>> > On 15 May 2013 10:40, Robert Newson wrote:
> >> >> >>> >
> >> >> >>> >> Thanks everyone.
> >> >> >>> >>
> >> >> >>> >> The tally is:
> >> >> >>> >>
> >> >> >>> >> 13 +1's
> >> >> >>> >>
> >> >> >>> >> The vote passes. We'll now move on to IP clearance. Once that's
> >> >> >>> >> done the work will arrive on a feature branch in our main git
> >> >> >>> >> repository.
> >> >> >>> >>
> >> >> >>> >> B.
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> On 13 May 2013 04:31, Jason Smith wrote:
> >> >> >>> >> > Sorry, just catching up.
> >> >> >>> >> >
> >> >> >>> >> > +1
> >> >> >>> >> >
> >> >> >>> >> > On Fri, May 10, 2013 at 4:29 PM, Jan Lehnardt <jan@apache.org> wrote:
> >> >> >>> >> >> +1
> >> >> >>> >> >>
> >> >> >>> >> >> Jan
> >> >> >>> >> >> --
> >> >> >>> >> >>
> >> >> >>> >> >> On May 7, 2013, at 21:34 , Robert Newson <rnewson@apache.org> wrote:
> >> >> >>> >> >>
> >> >> >>> >> >>> Hi All,
> >> >> >>> >> >>>
> >> >> >>> >> >>> I propose to merge in the following work,
> >> >> >>> >> >>> https://github.com/rnewson/couchdb/tree/nebraska-merge-candidate
> >> >> >>> >> >>> to the official Apache CouchDB repository to a new branch
> >> >> >>> >> >>> (i.e., *not* master). Once there, the full CouchDB developer
> >> >> >>> >> >>> community can begin the work to incorporate the code here into
> >> >> >>> >> >>> an official release.
> >> >> >>> >> >>>
> >> >> >>> >> >>> You do not need to respond if you are in agreement. If there is
> >> >> >>> >> >>> no response in 72 hours, I will assume lazy consensus. If we
> >> >> >>> >> >>> reach consensus, I will start the IP clearance process and then
> >> >> >>> >> >>> the merge.
> >> >> >>> >> >>>
> >> >> >>> >> >>> As most of you know, Paul Davis and I recently sequestered
> >> >> >>> >> >>> ourselves away from society (in a place called Nebraska) to make
> >> >> >>> >> >>> this merge happen. I want to clarify that this work is not the
> >> >> >>> >> >>> BigCouch code you can see on github.com/cloudant/bigcouch but
> >> >> >>> >> >>> the Cloudant platform from which BigCouch was made.
> >> >> >>> >> >>> This means it is bang up to date with all the bug fixes and
> >> >> >>> >> >>> feature enhancements we've made in the last eighteen months or
> >> >> >>> >> >>> more. With that clarification made, here are our notes about
> >> >> >>> >> >>> what we achieved, what it means to the project and what isn't
> >> >> >>> >> >>> yet done:
> >> >> >>> >> >>>
> >> >> >>> >> >>> Nebraska Merge Roundup
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> Stats:
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> 1402 - total new commits
> >> >> >>> >> >>>
> >> >> >>> >> >>> 312 - commits written during the merge (will be reduced
> >> >> >>> >> >>> substantially by squashing)
> >> >> >>> >> >>>
> >> >> >>> >> >>> 408 - number of files changed
> >> >> >>> >> >>>
> >> >> >>> >> >>> 21,897 - number of lines added
> >> >> >>> >> >>>
> >> >> >>> >> >>> 4,277 - number of lines removed
> >> >> >>> >> >>>
> >> >> >>> >> >>> A retrospective:
> >> >> >>> >> >>>
> >> >> >>> >> >>> Bob Newson and I have come to the end of our merge sprint on
> >> >> >>> >> >>> getting BigCouch merged into Apache CouchDB. It's been a
> >> >> >>> >> >>> productive ten days here in the midwest. I managed to get Bob
> >> >> >>> >> >>> out to a bowling alley and he managed to get me to a sushi
> >> >> >>> >> >>> restaurant. In between the cultural exchanges we’ve also managed
> >> >> >>> >> >>> to get a significant amount of work done on the merging as well.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> The current status of the merge is that we’ve managed to resolve
> >> >> >>> >> >>> the differences in the single node execution of CouchDB.
> >> >> >>> >> >>> Both the JavaScript and Erlang test suites run with only one
> >> >> >>> >> >>> failure in the Erlang test suite due to a (deliberately) missing
> >> >> >>> >> >>> constraint on the number of operating system processes. This
> >> >> >>> >> >>> should be a relatively straightforward fix but was not
> >> >> >>> >> >>> prioritized during our limited time to work on the larger
> >> >> >>> >> >>> issues.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> We merged a large number of performance and stability
> >> >> >>> >> >>> enhancements back into single node CouchDB as well as a number
> >> >> >>> >> >>> of pure bug fixes. The biggest highlight is a brand new
> >> >> >>> >> >>> compactor that is both faster and creates smaller and better
> >> >> >>> >> >>> organized post-compaction databases.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> The current status of the merge is that single node operations
> >> >> >>> >> >>> should be completely unaffected as demonstrated by the test
> >> >> >>> >> >>> suite passing. On the other hand we haven’t yet finished getting
> >> >> >>> >> >>> the clustered code merged to use some of the new changes in
> >> >> >>> >> >>> single node CouchDB. The single most significant portion of this
> >> >> >>> >> >>> work involves updates to the internal cluster API for views to
> >> >> >>> >> >>> use the recently rewritten indexer APIs. This should be a
> >> >> >>> >> >>> relatively straightforward bit of work that we’ll be finishing
> >> >> >>> >> >>> over the next few weeks.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> All in all the merge work done so far has been quite successful.
> >> >> >>> >> >>> We’ve met our primary goal of getting the code merged in a
> >> >> >>> >> >>> fashion that does not affect single node operation while
> >> >> >>> >> >>> providing a starting point for the larger community to start
> >> >> >>> >> >>> reviewing the more significant changes made. Given the size of
> >> >> >>> >> >>> the diff between the two code bases we never expected to have a
> >> >> >>> >> >>> fully working clustered solution after ten days of work but we
> >> >> >>> >> >>> have succeeded in providing a base of work that will allow us
> >> >> >>> >> >>> and new contributors to get up to speed quickly.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> This work, coupled with work by Dave Cottlehuber and Benoît
> >> >> >>> >> >>> Chesneau on updating the build system and various other internal
> >> >> >>> >> >>> updates, will provide a solid foundation for work going forward.
> >> >> >>> >> >>> It's an exciting time for CouchDB and anyone interested should
> >> >> >>> >> >>> keep an eye on the next few releases as we ramp up work on
> >> >> >>> >> >>> various core aspects of the database.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> We’ve had an exciting few days working to prepare the road for
> >> >> >>> >> >>> an exciting next twelve to eighteen months. We hope that
> >> >> >>> >> >>> everyone will feel as excited as we do about the next twelve to
> >> >> >>> >> >>> eighteen months for Apache CouchDB. It should be an exciting
> >> >> >>> >> >>> ride.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> Things we got done
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Large update to the source tree layout for Erlang applications.
> >> >> >>> >> >>> Each application now has a src/appname/(c_src|ebin|priv|src)
> >> >> >>> >> >>> structure. The build system has been updated.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Renamed src/couchdb to src/couch to match the Erlang
> >> >> >>> >> >>> convention of the top directory name matching the Erlang
> >> >> >>> >> >>> application name.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Imported Cloudant Erlang applications for clustered CouchDB.
> >> >> >>> >> >>> These are imported with their history by using git subtree and
> >> >> >>> >> >>> merging the top level commit. These are not external deps;
> >> >> >>> >> >>> development will happen within the CouchDB tree. The imported
> >> >> >>> >> >>> apps are:
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * config - A couch_config replacement (Behavior is mostly
> >> >> >>> >> >>> identical to couch_config except how we listen for configuration
> >> >> >>> >> >>> changes internally to allow for smooth hot code upgrade).
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * twig - An rsyslog source replacement for couch_log.
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * rexi - An RPC library. Replaces Erlang’s built-in rex
> >> >> >>> >> >>> application to avoid costly safety measures in the interest of
> >> >> >>> >> >>> performance and throughput.
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * mem3 - The “Dynamo” part of BigCouch responsible for
> >> >> >>> >> >>> managing cluster state
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * fabric - The internal cluster-aware CouchDB API
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * ets_lru - A small library application that provides an LRU
> >> >> >>> >> >>> implementation using a couple ets tables.
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * ddoc_cache - Caches design documents on each node for use
> >> >> >>> >> >>> in design handler functions. This uses an ets_lru cache with a
> >> >> >>> >> >>> very short TTL.
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * chttpd - The cluster aware HTTP layer
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> Each imported app also had its build system updated to use
> >> >> >>> >> >>> Autotools along with the necessary updates noted above for the
> >> >> >>> >> >>> new application layouts for existing CouchDB erlang apps.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Merged a large amount of updates and fixes to
> >> >> >>> >> >>> couch_replicator based on work done internally at Cloudant.
> >> >> >>> >> >>> Unfortunately due to an error when we created our internal clone
> >> >> >>> >> >>> we lost a bit of history in some of the initial merge and have a
> >> >> >>> >> >>> big commit that affects couch_replicator_manager mostly. There
> >> >> >>> >> >>> are a number of other commits related to couch_replicator that
> >> >> >>> >> >>> resolve the single node vs. clustered differences. Some
> >> >> >>> >> >>> noticeable couch_replicator features:
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * Optionally disable checkpoints so that replication can work
> >> >> >>> >> >>> when a source is read only. This should only be used for smaller
> >> >> >>> >> >>> databases as each replication call has to scan the entire source
> >> >> >>> >> >>> database on each invocation.
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * A new changes_pending field in the _active_tasks output
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * A fix to the continuous replication to automatically
> >> >> >>> >> >>> reconnect to a continuous changes feed when it sees a last_seq
> >> >> >>> >> >>> value. This allows for the source to selectively recycle the
> >> >> >>> >> >>> HTTP connections used which can be quite useful for “permanent”
> >> >> >>> >> >>> replications.
> >> >> >>> >> >>>
> >> >> >>> >> >>>   * A multitude of smaller bug fixes and stability enhancements.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> Updates to single node couch:
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> * We changed the by_seq tree to store a copy of the
> >> >> >>> >> >>> #full_doc_info{} record instead of the #doc_info{} record. This
> >> >> >>> >> >>> gives significant speed improvements for compaction and
> >> >> >>> >> >>> replication and generally anything that needs to walk the by_seq
> >> >> >>> >> >>> tree and access document bodies internally.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * We rewrote the compactor to be significantly faster as well
> >> >> >>> >> >>> as to provide significantly better compacted databases. The two
> >> >> >>> >> >>> main halves are to use a temp file and replace the use of btrees
> >> >> >>> >> >>> in the temp file. The temp file only contains a temporary copy
> >> >> >>> >> >>> of the document ids. At the end of a compaction run we then
> >> >> >>> >> >>> rebuild the by_id btree in the compaction file from this temp
> >> >> >>> >> >>> file.
> >> >> >>> >> >>> The reason this helps so much is that the compaction is based
> >> >> >>> >> >>> on the update_seq btree, which for most cases means that the id
> >> >> >>> >> >>> tree is updated in roughly random order which is very bad for
> >> >> >>> >> >>> our append only btrees. By using the tmp file we can stream it
> >> >> >>> >> >>> in order back into the compacted db file at the end of
> >> >> >>> >> >>> compacting, generating a minimum amount of garbage in the
> >> >> >>> >> >>> process. The other upgrade was to implement an external merge
> >> >> >>> >> >>> sort module (couch_emsort) that is used with this temporary
> >> >> >>> >> >>> file.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Reject updates to design docs that introduce updates that
> >> >> >>> >> >>> break compilation for source code. Currently we only check map
> >> >> >>> >> >>> and reduce calls as the other should provide user visible errors
> >> >> >>> >> >>> instead of inexplicably empty views.
> >> >> >>> >> >>>
> >> >> >>> >> >>> because my OCD kicked in and I was unable to resist.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Reverted a change made a long time ago that uses two file
> >> >> >>> >> >>> descriptors for each database. See the todo list.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * The reason to remove the second fd is so that we can rewrite
> >> >> >>> >> >>> ref counting. Better ref counting makes everyone happy, but the
> >> >> >>> >> >>> real reason is for this next bullet point:
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Optimize couch_server to not require a round trip message
> >> >> >>> >> >>> pass for opening a database that’s in the LRU. This is a
> >> >> >>> >> >>> significant performance boost for high concurrency access. We
> >> >> >>> >> >>> also optimized couch_server internals to not blow up when it’s
> >> >> >>> >> >>> under load.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Introduce a #leaf{} record into the revision trees. This is
> >> >> >>> >> >>> never written to disk but makes internal code a lot cleaner when
> >> >> >>> >> >>> dealing with multiple versions of rev tree values.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Some changes to couch_changes to enable clustered access.
> >> >> >>> >> >>> Also some general cleanup
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Internal changes to how CouchDB is booted in Erlang land. Not
> >> >> >>> >> >>> very sexy but this removes a lot of complicated un-Erlangy bits.
> >> >> >>> >> >>> We still have a bit of work left here.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * btree chunk sizes are now configurable which can allow people
> >> >> >>> >> >>> to adjust the RAM/speed tradeoffs a bit more.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * We now load update validation functions on the first write.
> >> >> >>> >> >>> This is a cluster-motivated change because the clustered version
> >> >> >>> >> >>> of this call is expensive and can lead to race conditions when
> >> >> >>> >> >>> opening a bunch of db shards simultaneously. This should be
> >> >> >>> >> >>> invisible to external clients.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Disabled conflict detection for local docs. They don’t
> >> >> >>> >> >>> replicate so there’s no point. This just led to clusters getting
> >> >> >>> >> >>> stuck and confused when there were lots of replications
> >> >> >>> >> >>> happening.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Changes to the multipart/mime parsing code. Necessary for
> >> >> >>> >> >>> clustered attachment uploads to split the incoming data stream
> >> >> >>> >> >>> into N copies.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Don’t use init:restart/0 when reloading the ICU driver. I
> >> >> >>> >> >>> think this has a bug. But we should rewrite this driver to be a
> >> >> >>> >> >>> NIF anyway.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * New couch OS process manager. Significantly faster access to
> >> >> >>> >> >>> OS processes under heavy load. This replaces the hard limit with
> >> >> >>> >> >>> a soft limit. Processes spawned over the soft limit will be used
> >> >> >>> >> >>> until they’ve sat idle for a few minutes and then be closed. We
> >> >> >>> >> >>> have a todo item to add the hard ceiling back in (while keeping
> >> >> >>> >> >>> the soft ceiling).
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Automatically replace some easily identifiable JS reductions
> >> >> >>> >> >>> with their builtin counterparts. Uses a regex to do the
> >> >> >>> >> >>> detection so it's not too smart.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Improved view updater write batch.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Updates to couchjs’ views.js to improve index update speeds
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Updates to the _stats builtin reduce to allow reduces to work
> >> >> >>> >> >>> over emitted stats objects. Sometimes clients have summary data
> >> >> >>> >> >>> in a doc, and this allows them to combine stats if they follow
> >> >> >>> >> >>> the same pattern as the builtin expects.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Added a config:reload() that is accessible by POST’ing to
> >> >> >>> >> >>> _config/_reload. Used by the JS tests to reset the config to
> >> >> >>> >> >>> what's on disk.
> >> >> >>> >> >>> This should prevent those test run failures where a test fails
> >> >> >>> >> >>> leaving the config in a bad state causing all subsequent tests
> >> >> >>> >> >>> to fail. I think. Maybe.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * Databases are deleted synchronously in the test suite. We may
> >> >> >>> >> >>> need to address this on Windows. But it does seem to reduce the
> >> >> >>> >> >>> number of “{error, file_exists}” failures.
> >> >> >>> >> >>>
> >> >> >>> >> >>> * I reimplemented the JS restartServer() function. There’s a
> >> >> >>> >> >>> new _restart/token URL that will give a unique value for each
> >> >> >>> >> >>> instance of the Erlang VM. To run a restart we grab the current
> >> >> >>> >> >>> token value, hit _restart, then wait till we get a successful
> >> >> >>> >> >>> response with a different token. This appears to have made the
> >> >> >>> >> >>> restart strategy more robust.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> Things that need doing
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> IP Clearance -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> We’ll need to track down if we have the CCLA as well as look at
> >> >> >>> >> >>> each source file added to make sure each one is strictly from
> >> >> >>> >> >>> Cloudant or has an amenable license. I’m pretty sure that the
> >> >> >>> >> >>> only one of interest is trunc_io.erl but we need to be thorough.
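
To illustrate the restart handshake described in the restartServer() item above (read the current token, trigger the restart, then poll until a different token comes back), here is a minimal Python sketch. It is not the actual JS test helper: the two callables `get_token` and `trigger_restart` are hypothetical stand-ins for whatever performs the HTTP requests against _restart/token and _restart, and the timeout and polling interval are assumed values.

```python
import time


def restart_and_wait(get_token, trigger_restart, timeout=30.0, interval=0.1,
                     sleep=time.sleep, clock=time.monotonic):
    """Restart a server and block until a new VM instance answers.

    get_token and trigger_restart are hypothetical callables supplied by the
    caller. get_token may raise OSError while the server is still down.
    """
    old_token = get_token()      # token identifying the currently running VM
    trigger_restart()            # ask the server to restart
    deadline = clock() + timeout
    while clock() < deadline:
        try:
            new_token = get_token()
        except OSError:
            new_token = None     # server is not answering yet
        # A successful response with a different token means a fresh VM is up.
        if new_token is not None and new_token != old_token:
            return new_token
        sleep(interval)
    raise TimeoutError("server did not come back with a new token")
```

Comparing tokens rather than merely waiting for any successful response avoids treating a reply from the old, not-yet-stopped VM as proof that the restart finished.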
> >> >> >>> >> >>>
> >> >> >>> >> >>> documentation -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> There shouldn’t be much here since the entire point of this
> >> >> >>> >> >>> merge was to not change the visible behavior of single node
> >> >> >>> >> >>> couch. A few things to add about the testing endpoints. Maybe an
> >> >> >>> >> >>> update to the compaction section to mention the two new file
> >> >> >>> >> >>> names used.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> Copyright notices -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> We need to strip out copyright notices from individual files
> >> >> >>> >> >>> and make sure all files have a standard Apache License v2
> >> >> >>> >> >>> header.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> clustered vhosts -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> We’ve never implemented this at Cloudant. We either need to
> >> >> >>> >> >>> write a cluster or go back and tell people to use HAProxy (or
> >> >> >>> >> >>> similar) for such things.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> twig -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> We need to add another output type to twig that is configurable
> >> >> >>> >> >>> in some manner. Right now we spit out entire rsyslog records
> >> >> >>> >> >>> which isn’t useful for most people. We’ll need to implement the
> >> >> >>> >> >>> file writer from couch_log as well as update the _log HTTP
> >> >> >>> >> >>> handler to know when it can and can’t expect to find data on
> >> >> >>> >> >>> disk.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> fabric -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> This is going to need a lot of work.
> >> >> >>> >> >>> Specifically view access is going to need to be updated to work
> >> >> >>> >> >>> with couch_mrview and friends.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> Boot a dev cluster -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> Once we fix up the clustering code we’ll need to write
> >> >> >>> >> >>> instructions and scripts for pulling up a dev cluster.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> OTP stuff -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> We’ve updated each app but we still need to pull some parts out
> >> >> >>> >> >>> of couchdb into their own application. Specifically the HTTP
> >> >> >>> >> >>> layer needs its own app. We could probably pull out the os
> >> >> >>> >> >>> process/query_servers as well as the os daemons and friends.
> >> >> >>> >> >>> Once done we need to update the supervision trees so we don’t
> >> >> >>> >> >>> have things like couch starting and managing the replication
> >> >> >>> >> >>> manager process.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> ddoc_cache -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> Wire this up in couch_httpd_db to actually be used. Right now
> >> >> >>> >> >>> it's only used in chttpd.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> couch_file upgrade -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> The revert to remove the second updater_fd from each #db{}
> >> >> >>> >> >>> record means that we’re back in the original position of files
> >> >> >>> >> >>> appearing to slow down significantly under load.
> >> >> >>> >> >>> Since the initial hammer approach of just adding a second fd
> >> >> >>> >> >>> we’ve discovered that the underlying bug is due to the way that
> >> >> >>> >> >>> message passing works combined with Erlang’s file io.
> >> >> >>> >> >>> Significant, though, is the fact that the fix is rather simple
> >> >> >>> >> >>> to implement. A first draft of this work is on an old branch of
> >> >> >>> >> >>> mine here:
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> https://github.com/davisp/couchdb/commit/d856878
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> finish the size calculating changes -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> The #leaf{} record change is to enable us to add more data size
> >> >> >>> >> >>> calculations. CouchDB master calculates a data size that
> >> >> >>> >> >>> accounts for all bytes that are active in a .couch file.
> >> >> >>> >> >>> Cloudant is interested in the total size of uncompressed docs
> >> >> >>> >> >>> and attachments minus the internal overhead of btrees. And
> >> >> >>> >> >>> there’s a fourth number to calculate based on the compression
> >> >> >>> >> >>> level used. Having each of these numbers will be useful as well
> >> >> >>> >> >>> as the calculations they’ll enable (i.e., dead bytes in file,
> >> >> >>> >> >>> bytes used for overhead, compression ratio achieved, etc.).
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> couch_proc_manager -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> We need to implement the hard ceiling for capping the number of
> >> >> >>> >> >>> OS processes. We’ve started seeing a need for this at Cloudant
> >> >> >>> >> >>> with some work loads so motivation to fix this is high.
> >> >> >>> >> >>> The only failing etap is the assertion of this ceiling.
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> Synchronous db delete on Windows -
> >> >> >>> >> >>>
> >> >> >>> >> >>>
> >> >> >>> >> >>> I did this because running the test suite was driving me
> >> >> >>> >> >>> bonkers. I need to ask Dave about how this behaves on Windows
> >> >> >>> >> >>> (my guess is not well) but I think we can close things up so
> >> >> >>> >> >>> that it works better than the status quo.
> >> >> >>> >> >>
> >> >> >>> >> >
> >> >> >>> >> > --
> >> >> >>> >> > Iris Couch
> >> >> >>> >>
> >> >> >>> >
> >> >> >>> > --
> >> >> >>> > NS
> >> >> >>>
> >> >> >>
> >> >> >> --
> >> >> >> NS
> >> >> >>
> >> >> >
> >> >> > --
> >> >> > NS
> >> >>
> >> >
> >> > --
> >> > NS
> >>
> >
> > --
> > NS
>

-- 
NS

--047d7ba979780d304604de01a1bc--
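
The soft limit and hard ceiling discussed for the OS process manager and couch_proc_manager in the quoted proposal can be pictured with a small toy model. The Python sketch below is not the Erlang implementation: the class name `SoftLimitPool`, its method names, the explicit `now` timestamps, and the exact reaping policy are all assumptions made for illustration. Workers may be spawned above the soft limit, idle surplus workers are closed after a timeout, and an optional hard ceiling refuses further spawns.

```python
import itertools


class SoftLimitPool:
    """Toy worker pool with a soft limit and an optional hard ceiling."""

    def __init__(self, soft_limit, hard_limit=None, idle_timeout=300.0):
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit      # None means no hard ceiling
        self.idle_timeout = idle_timeout  # seconds a surplus worker may idle
        self._ids = itertools.count(1)
        self._idle = {}                   # worker id -> time it became idle
        self._busy = set()

    def acquire(self):
        """Hand out an idle worker, or spawn one (even above the soft limit)."""
        if self._idle:
            wid = next(iter(self._idle))
            del self._idle[wid]
        else:
            if self.hard_limit is not None and len(self._busy) >= self.hard_limit:
                raise RuntimeError("hard ceiling reached")
            wid = next(self._ids)
        self._busy.add(wid)
        return wid

    def release(self, wid, now):
        """Return a worker to the idle set, remembering when it went idle."""
        self._busy.remove(wid)
        self._idle[wid] = now

    def reap(self, now):
        """Close idle workers while the pool is above the soft limit."""
        closed = []
        # Visit the longest-idle workers first.
        for wid, since in sorted(self._idle.items(), key=lambda kv: kv[1]):
            if len(self._busy) + len(self._idle) <= self.soft_limit:
                break
            if now - since >= self.idle_timeout:
                del self._idle[wid]
                closed.append(wid)
        return closed
```

With `hard_limit=None` the pool behaves like the soft-limit-only manager the proposal describes; setting it models the todo item of adding the ceiling back while keeping the soft limit.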