couchdb-dev mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: IP clearance (Was: Re: [VOTE] Merge BigCouch)
Date Fri, 31 May 2013 10:54:19 GMT
FYI: I've added the first version of couchdb-bigcouch.xml to the
incubator site. I've been unable to publish it and I think I need some
assistance from Noah or Jan to move on.

B.


On 16 May 2013 13:40, Noah Slater <nslater@apache.org> wrote:
> You can take it up to step 5. I have to do step 6 and onwards.
>
>
> On 16 May 2013 13:39, Robert Newson <rnewson@apache.org> wrote:
>
>> nah, I'll do it straight.
>>
>> Can I do this? The docs say Officer or Member.
>>
>> On 16 May 2013 13:37, Noah Slater <nslater@apache.org> wrote:
>> > git help svn
>> >
>> >
>> > On 16 May 2013 13:13, Robert Newson <rnewson@apache.org> wrote:
>> >
>> >> Righto. Now to remember how subversion works...
>> >>
>> >> On 15 May 2013 17:09, Noah Slater <nslater@apache.org> wrote:
>> >> > Okay.
>> >> >
>> >> > Start here:
>> >> >
>> >> > http://incubator.apache.org/ip-clearance/
>> >> >
>> >> > Then make a copy of this file:
>> >> >
>> >> >
>> >>
>> http://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/ip-clearance-template.xml
>> >> >
>> >> > This file, when rendered to HTML will look like:
>> >> >
>> >> > http://incubator.apache.org/ip-clearance/ip-clearance-template.html
>> >> >
>> >> > In your local copy, cut everything from:
>> >> >
>> >> >       <pre>-----8-&lt;---- cut here -------8-&lt;----
cut here
>> >> > -------8-&lt;---- cut here-------8-&lt;----</pre>
>> >> >
>> >> > To:
>> >> >
>> >> >       <pre>-----8-&lt;---- cut here -------8-&lt;----
cut here
>> >> > -------8-&lt;---- cut here-------8-&lt;----</pre>
>> >> >
>> >> > Now, add your copy back to Subversion here:
>> >> >
>> >> >
>> >>
>> http://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/
>> >> >
>> >> > Call it "couchdb-bigcouch.xml".
>> >> >
>> >> > In a few minutes, this will appear here:
>> >> >
>> >> > http://incubator.apache.org/ip-clearance/couchdb-bigcouch.html
>> >> >
>> >> > Now, it should be a simple matter of going through the doc and
>> >> > completing the checkpoints/sections.
>> >> >
>> >> > Here are the two previous ones we've done:
>> >> >
>> >> > http://incubator.apache.org/ip-clearance/couchdb-docs.html
>> >> >
>> >> > http://incubator.apache.org/ip-clearance/couchdb-fauxton.html
>> >> >
>> >> > Let me know if you get stuck on any of the checkpoints.
>> >> >
>> >> > Once you're done, let me know, and I will use my member karma to push
>> >> > it through the Incubator.
>> >> >
>> >> > Benoit, you may as well start your rcouch stuff at the same time using
>> >> > these instructions. Obviously, you should pick "couchdb-rcouch.xml" instead.
>> >> > But other than that, it's the same process.
>> >> >
>> >> > On 15 May 2013 16:24, Noah Slater <nslater@apache.org> wrote:
>> >> >
>> >> >> I can help! :)
>> >> >>
>> >> >>
>> >> >> On 15 May 2013 16:23, Robert Newson <rnewson@apache.org> wrote:
>> >> >>
>> >> >>> :)
>> >> >>>
>> >> >>> Jan, I think you said you'd help start the IP clearance bit?
>> >> >>>
>> >> >>> On 15 May 2013 15:03, Noah Slater <nslater@apache.org> wrote:
>> >> >>> > PARTY TIME 🎉
>> >> >>> >
>> >> >>> >
>> >> >>> > On 15 May 2013 10:40, Robert Newson <rnewson@apache.org> wrote:
>> >> >>> >
>> >> >>> >> Thanks everyone.
>> >> >>> >>
>> >> >>> >> The tally is;
>> >> >>> >>
>> >> >>> >> 13 +1's
>> >> >>> >>
>> >> >>> >> The vote passes. We'll now move on to IP clearance. Once that's done
>> >> >>> >> the work will arrive on a feature branch in our main git repository.
>> >> >>> >>
>> >> >>> >> B.
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> On 13 May 2013 04:31, Jason Smith <jhs@iriscouch.com> wrote:
>> >> >>> >> > Sorry, just catching up.
>> >> >>> >> >
>> >> >>> >> > +1
>> >> >>> >> >
>> >> >>> >> > On Fri, May 10, 2013 at 4:29 PM, Jan Lehnardt <jan@apache.org> wrote:
>> >> >>> >> >> +1
>> >> >>> >> >>
>> >> >>> >> >> Jan
>> >> >>> >> >> --
>> >> >>> >> >>
>> >> >>> >> >> On May 7, 2013, at 21:34 , Robert Newson <rnewson@apache.org> wrote:
>> >> >>> >> >>
>> >> >>> >> >>> Hi All,
>> >> >>> >> >>>
>> >> >>> >> >>> I propose to merge in the following work,
>> >> >>> >> >>> https://github.com/rnewson/couchdb/tree/nebraska-merge-candidate
>> >> >>> >> >>> to the official Apache CouchDB repository to a new branch (i.e.,
>> >> >>> >> >>> *not* master). Once there, the full CouchDB developer community can
>> >> >>> >> >>> begin the work to incorporate the code here into an official release.
>> >> >>> >> >>>
>> >> >>> >> >>> You do not need to respond if you are
in agreement. If there
>> is
>> >> no
>> >> >>> >> >>> response in 72 hours, I will assume lazy
consensus. If we
>> reach
>> >> >>> >> >>> consensus, I will start the IP clearance
process and then the
>> >> >>> merge.
>> >> >>> >> >>>
>> >> >>> >> >>> As most of you know, Paul Davis and I
recently sequestered
>> >> >>> ourselves
>> >> >>> >> >>> away from society (in a place called
Nebraska) to make this
>> >> merge
>> >> >>> >> >>> happen. I want to clarify that this work
is not the BigCouch
>> >> code
>> >> >>> you
>> >> >>> >> >>> can see on github.com/cloudant/bigcouch
but the Cloudant
>> >> platform
>> >> >>> from
>> >> >>> >> >>> which BigCouch was made. This means it
is bang up to date
>> with
>> >> all
>> >> >>> the
>> >> >>> >> >>> bug fixes and feature enhancements we've
made in the last
>> >> eighteen
>> >> >>> >> >>> months or more. With that clarification
made, here are our
>> notes
>> >> >>> about
>> >> >>> >> >>> what we achieved, what it means to the
project and what isn't
>> >> yet
>> >> >>> >> >>> done;
>> >> >>> >> >>>
>> >> >>> >> >>> Nebraska Merge Roundup
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> Stats:
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> 1402 - total new commits
>> >> >>> >> >>>
>> >> >>> >> >>> 312 - commits written during the merge
(will be reduced
>> >> >>> substantially
>> >> >>> >> >>> by squashing)
>> >> >>> >> >>>
>> >> >>> >> >>> 408 - number of files changed
>> >> >>> >> >>>
>> >> >>> >> >>> 21,897 - number of lines added
>> >> >>> >> >>>
>> >> >>> >> >>> 4,277 - number of lines removed
>> >> >>> >> >>>
>> >> >>> >> >>> A retrospective:
>> >> >>> >> >>>
>> >> >>> >> >>> Bob Newson and I have come to the end of our merge sprint on getting
>> >> >>> >> >>> BigCouch merged into Apache CouchDB. It’s been a productive ten days
>> >> >>> >> >>> here in the midwest. I managed to get Bob out to a bowling alley and
>> >> >>> >> >>> he managed to get me to a sushi restaurant. In between the cultural
>> >> >>> >> >>> exchanges we’ve also managed to get a significant amount of work done
>> >> >>> >> >>> on the merging as well.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> The current status of the merge is that
we’ve managed to
>> resolve
>> >> >>> the
>> >> >>> >> >>> differences in the single node execution
of CouchDB. Both the
>> >> >>> >> >>> JavaScript and Erlang test suites run
with only one failure
>> in
>> >> the
>> >> >>> >> >>> Erlang test suite due to a (deliberately)
missing constraint
>> on
>> >> the
>> >> >>> >> >>> number of operating system processes.
This should be a
>> >> relatively
>> >> >>> >> >>> straightforward fix but was not prioritized
during our
>> limited
>> >> >>> time to
>> >> >>> >> >>> work on the larger issues.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> We merged a large number of performance
and stability
>> >> enhancements
>> >> >>> >> >>> back into single node CouchDB as well
as a number of pure bug
>> >> >>> fixes.
>> >> >>> >> >>> The biggest highlight is a brand new
compactor that is both
>> >> faster
>> >> >>> and
>> >> >>> >> >>> creates smaller and better organized
post-compaction
>> databases.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> The current status of the merge is that
single node
>> operations
>> >> >>> should
>> >> >>> >> >>> be completely unaffected as demonstrated
by the test suite
>> >> >>> passing. On
>> >> >>> >> >>> the other hand we haven’t yet finished
getting the clustered
>> >> code
>> >> >>> >> >>> merged to use some of the new changes
in single node CouchDB.
>> >> The
>> >> >>> >> >>> single most significant portion of this
work involves
>> updates to
>> >> >>> the
>> >> >>> >> >>> internal cluster API for views to use
the recently rewritten
>> >> >>> indexer
>> >> >>> >> >>> APIs. This should be a relatively straightforward
bit of work
>> >> that
>> >> >>> >> >>> we’ll be finishing over the next few
weeks.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> All in all the merge work done so far
has been quite
>> successful.
>> >> >>> We’ve
>> >> >>> >> >>> met our primary goal of getting the code
merged in a fashion
>> >> that
>> >> >>> does
>> >> >>> >> >>> not affect single node operation while
providing a starting
>> >> point
>> >> >>> for
>> >> >>> >> >>> the larger community to start reviewing
the more significant
>> >> >>> changes
>> >> >>> >> >>> made. Given the size of the diff between
the two code bases
>> we
>> >> >>> never
>> >> >>> >> >>> expected to have a fully working clustered
solution after ten
>> >> days
>> >> >>> of
>> >> >>> >> >>> work but we have succeeded in providing
a base of work that
>> will
>> >> >>> allow
>> >> >>> >> >>> us and new contributors to get up to
speed quickly.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> This work, coupled with work by Dave Cottlehuber and Benoît Chesneau
>> >> >>> >> >>> on updating the build system and various other internal updates, will
>> >> >>> >> >>> provide a solid foundation for work going forward. It’s an exciting
>> >> >>> >> >>> time for CouchDB and anyone interested should keep an eye on the next
>> >> >>> >> >>> few releases as we ramp up work on various core aspects of the
>> >> >>> >> >>> database.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> We’ve had an exciting few days working
to prepare the road
>> for
>> >> an
>> >> >>> >> >>> exciting next twelve to eighteen months.
We hope that
>> everyone
>> >> will
>> >> >>> >> >>> feel as excited as we do about the next
twelve to eighteen
>> >> months
>> >> >>> for
>> >> >>> >> >>> Apache CouchDB. It should be an exciting
ride.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> Things we got done
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> * Large update to the source tree layout
for Erlang
>> >> applications.
>> >> >>> Each
>> >> >>> >> >>> application now has a src/appname/(c_src|ebin|priv|src)
>> >> structure.
>> >> >>> The
>> >> >>> >> >>> build system has been updated.
>> >> >>> >> >>>
>> >> >>> >> >>> * Renamed src/couchdb to src/couch to
match the Erlang
>> >> convention
>> >> >>> of
>> >> >>> >> >>> the top directory name matching the Erlang
application name.
>> >> >>> >> >>>
>> >> >>> >> >>> * Imported Cloudant Erlang applications
for clustered
>> CouchDB.
>> >> >>> These
>> >> >>> >> >>> are imported with their history by using
git subtree and
>> merging
>> >> >>> the
>> >> >>> >> >>> top level commit. These are not external
deps, development
>> will
>> >> >>> happen
>> >> >>> >> >>> within the CouchDB tree. The imported
apps are:
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>>   * config - A couch_config replacement
(Behavior is mostly
>> >> >>> identical
>> >> >>> >> >>> to couch_config except how we listen
for configuration
>> changes
>> >> >>> >> >>> internally to allow for smooth hot code
upgrade).
>> >> >>> >> >>>
>> >> >>> >> >>>   * twig - An rsyslog source replacement
for couch_log.
>> >> >>> >> >>>
>> >> >>> >> >>>   * rexi - An RPC library. Replaces Erlang’s
built-in rex
>> >> >>> application
>> >> >>> >> >>> to avoid costly safety measures in the
interest of
>> performance
>> >> and
>> >> >>> >> >>> throughput.
>> >> >>> >> >>>
>> >> >>> >> >>>   * mem3 - The “Dynamo” part of BigCouch
responsible for
>> >> managing
>> >> >>> >> cluster state
>> >> >>> >> >>>
>> >> >>> >> >>>   * fabric - The internal cluster-aware CouchDB API
>> >> >>> >> >>>
>> >> >>> >> >>>   * ets_lru - A small library application
that provides an
>> LRU
>> >> >>> >> >>> implementation using a couple ets tables.
>> >> >>> >> >>>
>> >> >>> >> >>>   * ddoc_cache - Caches design documents
on each node for
>> use in
>> >> >>> >> >>> design handler functions. This uses an
ets_lru cache with a
>> very
>> >> >>> short
>> >> >>> >> >>> TTL.
>> >> >>> >> >>>
>> >> >>> >> >>>   * chttpd - The cluster aware HTTP layer
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> Each imported app also had its build system updated to use Autotools
>> >> >>> >> >>> along with the necessary updates noted above for the new application
>> >> >>> >> >>> layouts for existing CouchDB Erlang apps.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> * Merged a large amount of updates and
fixes to
>> couch_replicator
>> >> >>> based
>> >> >>> >> >>> on work done internally at Cloudant.
Unfortunately due to an
>> >> error
>> >> >>> >> >>> when we created our internal clone we
lost a bit of history
>> in
>> >> >>> some of
>> >> >>> >> >>> the initial merge and have a big commit
that affects
>> >> >>> >> >>> couch_replicator_manager mostly. There
are a number of other
>> >> >>> commits
>> >> >>> >> >>> related to couch_replicator that resolve
the single node vs.
>> >> >>> clustered
>> >> >>> >> >>> differences. Some noticeable couch_replicator
features:
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>>   * Optionally disable checkpoints so
that replication can
>> work
>> >> >>> when
>> >> >>> >> >>> a source is read only. This should only
be used for smaller
>> >> >>> databases
>> >> >>> >> >>> as each replication call has to scan
the entire source
>> database
>> >> on
>> >> >>> >> >>> each invocation.
>> >> >>> >> >>>
>> >> >>> >> >>>   * A new changes_pending field in the
_active_tasks output
>> >> >>> >> >>>
>> >> >>> >> >>>   * A fix to the continuous replication
to automatically
>> >> reconnect
>> >> >>> to
>> >> >>> >> >>> a continuous changes feed when it sees
a last_seq value. This
>> >> >>> allows
>> >> >>> >> >>> for the source to selectively recycle
the HTTP connections
>> used
>> >> >>> which
>> >> >>> >> >>> can be quite useful for “permanent”
replications.
>> >> >>> >> >>>
>> >> >>> >> >>>   * A multitude of smaller bug fixes and stability enhancements.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> Updates to single node couch:
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> * We changed the by_seq tree to store
a copy of the
>> >> >>> #full_doc_info{}
>> >> >>> >> >>> record instead of the #doc_info{} record.
This gives
>> significant
>> >> >>> speed
>> >> >>> >> >>> improvements for compaction and replication
and generally
>> >> anything
>> >> >>> >> >>> that needs to walk the by_seq tree and
access document bodies
>> >> >>> >> >>> internally.
>> >> >>> >> >>>
>> >> >>> >> >>> * We rewrote the compactor to be significantly faster and to produce
>> >> >>> >> >>> significantly better compacted databases. The two main halves are to
>> >> >>> >> >>> use a temp file and replace the use of btrees in the temp file. The
>> >> >>> >> >>> temp file only contains a temporary copy of the document ids. At the
>> >> >>> >> >>> end of a compaction run we then rebuild the by_id btree in the
>> >> >>> >> >>> compaction file from this temp file. The reason this helps so much is
>> >> >>> >> >>> that the compaction is based on the update_seq btree, which for most
>> >> >>> >> >>> cases means that the id tree is updated in roughly random order, which
>> >> >>> >> >>> is very bad for our append-only btrees. By using the temp file we can
>> >> >>> >> >>> stream it in order back into the compacted db file at the end of
>> >> >>> >> >>> compacting, generating a minimum amount of garbage in the process. The
>> >> >>> >> >>> other upgrade was to implement an external merge sort module
>> >> >>> >> >>> (couch_emsort) that is used with this temporary file.
>> >> >>> >> >>>
>> >> >>> >> >>> * Reject updates to design docs that break compilation of their source
>> >> >>> >> >>> code. Currently we only check map and reduce calls as the others should
>> >> >>> >> >>> provide user visible errors instead of inexplicably empty views.
>> >> >>> >> >>>
>> >> >>> >> >>> because my OCD kicked in and I was unable
to resist.
>> >> >>> >> >>>
>> >> >>> >> >>> * Reverted a change made a long time
ago that uses two file
>> >> >>> >> >>> descriptors for each database. See the
todo list.
>> >> >>> >> >>>
>> >> >>> >> >>> * The reason to remove the second fd
is so that we can
>> rewrite
>> >> ref
>> >> >>> >> >>> counting. Better ref counting makes everyone
happy, but the
>> real
>> >> >>> >> >>> reason is for this next bullet point:
>> >> >>> >> >>>
>> >> >>> >> >>> * Optimize couch_server to not require
a round trip message
>> pass
>> >> >>> for
>> >> >>> >> >>> opening a database that’s in the LRU.
This is a significant
>> >> >>> >> >>> performance boost for high concurrency
access. We also
>> optimized
>> >> >>> >> >>> couch_server internals to not blow up
when it’s under load.
>> >> >>> >> >>>
>> >> >>> >> >>> * Introduce a #leaf{} record into the
revision trees. This is
>> >> never
>> >> >>> >> >>> written to disk but makes internal code
a lot cleaner when
>> >> dealing
>> >> >>> >> >>> with multiple versions of rev tree values.
>> >> >>> >> >>>
>> >> >>> >> >>> * Some changes to couch_changes to enable
clustered access.
>> Also
>> >> >>> some
>> >> >>> >> >>> general cleanup
>> >> >>> >> >>>
>> >> >>> >> >>> * Internal changes to how CouchDB is
booted in Erlang land.
>> Not
>> >> >>> very
>> >> >>> >> >>> sexy but this removes a lot of complicated
un-Erlangy bits.
>> We
>> >> >>> still
>> >> >>> >> >>> have a bit of work left here.
>> >> >>> >> >>>
>> >> >>> >> >>> * btree chunk sizes are now configurable
which can allow
>> people
>> >> to
>> >> >>> >> >>> adjust the RAM/speed tradeoffs a bit
more.
>> >> >>> >> >>>
>> >> >>> >> >>> * We now load update validation functions
on the first write.
>> >> This
>> >> >>> is
>> >> >>> >> >>> a cluster-motivated change because the
clustered version of
>> this
>> >> >>> call
>> >> >>> >> >>> is expensive and can lead to race conditions
when opening a
>> >> bunch
>> >> >>> of
>> >> >>> >> >>> db shards simultaneously. This should
be invisible to
>> external
>> >> >>> >> >>> clients.
>> >> >>> >> >>>
>> >> >>> >> >>> * Disabled conflict detection for local
docs. They don’t
>> >> replicate
>> >> >>> so
>> >> >>> >> >>> there’s no point. This just led to
clusters getting stuck and
>> >> >>> confused
>> >> >>> >> >>> when there were lots of replications
happening.
>> >> >>> >> >>>
>> >> >>> >> >>> * Changes to the multipart/mime parsing
code. Necessary for
>> >> >>> clustered
>> >> >>> >> >>> attachment uploads to split the incoming
data  stream into N
>> >> >>> copies.
>> >> >>> >> >>>
>> >> >>> >> >>> * Don’t use init:restart/0 when reloading
the ICU driver. I
>> >> think
>> >> >>> >> >>> this has a bug. But we should rewrite
this driver to be a NIF
>> >> >>> anyway.
>> >> >>> >> >>>
>> >> >>> >> >>> * New couch OS process manager. Significantly faster access to OS
>> >> >>> >> >>> processes under heavy load. This replaces the hard limit with a soft
>> >> >>> >> >>> limit. Processes spawned over the soft limit will be used until they’ve
>> >> >>> >> >>> sat idle for a few minutes and then be closed. We have a todo item to
>> >> >>> >> >>> add the hard ceiling back in (while keeping the soft ceiling).
>> >> >>> >> >>>
>> >> >>> >> >>> * Automatically replace some easily identifiable JS reductions with
>> >> >>> >> >>> their builtin counterparts. Uses a regex to do the detection so it’s
>> >> >>> >> >>> not too smart.
>> >> >>> >> >>>
>> >> >>> >> >>> * Improved view updater write batch.
>> >> >>> >> >>>
>> >> >>> >> >>> * Updates to couchjs’ views.js to improve
index update speeds
>> >> >>> >> >>>
>> >> >>> >> >>> * Updates to the _stats builtin reduce to allow reduces to work over
>> >> >>> >> >>> emitted stats objects. Sometimes clients have summary data in a doc,
>> >> >>> >> >>> and this allows them to combine stats if they follow the same pattern
>> >> >>> >> >>> as the builtin expects.
>> >> >>> >> >>>
>> >> >>> >> >>> * Added a config:reload() that is accessible
by POST’ing to
>> >> >>> >> >>> _config/_reload. Used by the JS tests
to reset the config to
>> >> >>> what's on
>> >> >>> >> >>> disk. This should prevent those test
run failures where a
>> test
>> >> >>> fails
>> >> >>> >> >>> leaving the config in a bad state causing
all subsequent
>> tests
>> >> to
>> >> >>> >> >>> fail. I think. Maybe.
>> >> >>> >> >>>
>> >> >>> >> >>> * Databases are deleted synchronously
in the test suite. We
>> may
>> >> >>> need
>> >> >>> >> >>> to address this on Windows. But it does
seem to reduce the
>> >> number
>> >> >>> of
>> >> >>> >> >>> “{error, file_exists}” failures.
>> >> >>> >> >>>
>> >> >>> >> >>> * I reimplemented the JS restartServer() function. There’s a new
>> >> >>> >> >>> _restart/token URL that will give a unique value for each instance of
>> >> >>> >> >>> the Erlang VM. To run a restart we grab the current token value, hit
>> >> >>> >> >>> _restart, then wait till we get a successful response with a different
>> >> >>> >> >>> token. This appears to have made the restart strategy more robust.
>> >> >>> >> >>>
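
The restartServer() strategy in the last bullet above can be sketched roughly as follows. This is a minimal illustration in Python, not the JS test-suite code: fetch_token and trigger_restart are hypothetical callables standing in for the HTTP requests to _restart/token and _restart, and the timeout and polling interval are assumed values.

```python
import time


def restart_and_wait(fetch_token, trigger_restart, timeout=30.0, interval=0.1,
                     sleep=time.sleep, clock=time.monotonic):
    """Restart a server and block until a new VM instance answers.

    fetch_token and trigger_restart are hypothetical callables standing in
    for the HTTP requests to _restart/token and _restart.
    """
    old_token = fetch_token()      # token of the VM instance running now
    trigger_restart()              # ask the server to restart
    deadline = clock() + timeout
    while clock() < deadline:
        try:
            new_token = fetch_token()
        except OSError:
            # The server is still down; connection errors are expected here.
            new_token = None
        if new_token is not None and new_token != old_token:
            return new_token       # a different token means a fresh VM is up
        sleep(interval)
    raise TimeoutError("server did not come back with a new token")
```

Comparing tokens rather than just waiting for any successful response avoids mistaking the old instance, which may still answer for a moment, for the restarted one.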
>> >> >>> >> >>>
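
The compactor bullet earlier in this list rests on one idea: ids arrive in update-sequence order (roughly random with respect to id), are spooled to a temp file, and are streamed back in sorted order at the end. Below is a minimal Python sketch of an external merge sort in that spirit. It is not couch_emsort: the function name, the run size and the one-id-per-line file format are assumptions made for illustration.

```python
import heapq
import tempfile


def sorted_ids_via_temp_files(ids_in_seq_order, run_size=1000):
    """Yield document ids in sorted order using bounded memory.

    Each chunk of run_size ids is sorted in memory and spilled to its own
    temp file (a "run"); the runs are then merged lazily. Ids are assumed
    to be strings that contain no newline characters.
    """
    runs = []
    chunk = []

    def spill():
        # newline="\n" disables newline translation on write and read.
        run = tempfile.TemporaryFile(mode="w+", encoding="utf-8", newline="\n")
        run.writelines(doc_id + "\n" for doc_id in sorted(chunk))
        run.seek(0)
        runs.append(run)
        chunk.clear()

    for doc_id in ids_in_seq_order:
        chunk.append(doc_id)
        if len(chunk) >= run_size:
            spill()
    if chunk:
        spill()

    try:
        # heapq.merge holds only one pending line per run in memory.
        # The key strips the trailing newline so lines compare as ids.
        for line in heapq.merge(*runs, key=lambda l: l[:-1]):
            yield line[:-1]
    finally:
        for run in runs:
            run.close()
```

Because the merged stream comes out in id order, whatever consumes it can build its index sequentially instead of in random order, which is the property the bullet credits for the reduced garbage.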
>> >> >>> >> >>>
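
The _stats bullet above describes reducing over stats objects that were already summarised in a doc. A minimal Python sketch of how two such objects can be combined, assuming the five fields sum, count, min, max and sumsqr. The function names are illustrative only, and this is not the builtin's Erlang implementation.

```python
def merge_stats(a, b):
    """Combine two pre-aggregated stats objects into one."""
    return {
        "sum": a["sum"] + b["sum"],
        "count": a["count"] + b["count"],
        "min": min(a["min"], b["min"]),
        "max": max(a["max"], b["max"]),
        "sumsqr": a["sumsqr"] + b["sumsqr"],
    }


def stats_of(values):
    """Build a stats object from raw numbers, for comparison."""
    return {
        "sum": sum(values),
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sumsqr": sum(v * v for v in values),
    }
```

Every field is either additive or a min/max, so merging stats_of([1, 2]) with stats_of([3]) gives the same object as stats_of([1, 2, 3]). That is why emitted summaries can be reduced without access to the raw values.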
>> >> >>> >> >>> Things that need doing
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> IP Clearance -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> We’ll need to track down if we have
the CCLA as well as look
>> at
>> >> >>> each
>> >> >>> >> >>> source file added to make sure each one
is strictly from
>> >> Cloudant
>> >> >>> or
>> >> >>> >> >>> has an amenable license. I’m pretty
sure that the only one of
>> >> >>> interest
>> >> >>> >> >>> is trunc_io.erl but we need to be thorough.
>> >> >>> >> >>>
>> >> >>> >> >>> documentation -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> There shouldn’t be much here since the entire point of this merge was
>> >> >>> >> >>> to not change the visible behavior of single node couch. A few things
>> >> >>> >> >>> to add about the testing endpoints. Maybe an update to the compaction
>> >> >>> >> >>> section to mention the two new file names used.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> Copyright notices -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> We need to strip out copyright notices
from individual files
>> and
>> >> >>> make
>> >> >>> >> >>> sure all files have a standard Apache
License v2 header.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> clustered vhosts -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> We’ve never implemented this at Cloudant.
We either need to
>> >> write a
>> >> >>> >> >>> cluster or go back and tell people to
use HAProxy (or
>> similar)
>> >> for
>> >> >>> >> >>> such things.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> twig -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> We need to add another output type to
twig that is
>> configurable
>> >> in
>> >> >>> >> >>> some manner. Right now we spit out entire
rsyslog records
>> which
>> >> >>> isn’t
>> >> >>> >> >>> useful for most people. We’ll need
to implement the file
>> writer
>> >> >>> from
>> >> >>> >> >>> couch_log as well as update the _log
HTTP handler to know
>> when
>> >> it
>> >> >>> can
>> >> >>> >> >>> and can’t expect to find data on disk.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> fabric -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> This is going to need a lot of work.
Specifically view
>> access is
>> >> >>> going
>> >> >>> >> >>> to need to be updated to work with couch_mrview
and friends.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> Boot a dev cluster -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> Once we fix up the clustering code we’ll
need to write
>> >> instructions
>> >> >>> >> >>> and scripts for pulling up a dev cluster.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> OTP stuff -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> We’ve updated each app but we still
need to pull some parts
>> out
>> >> of
>> >> >>> >> >>> couchdb into their own application. Specifically
the HTTP
>> layer
>> >> >>> needs
>> >> >>> >> >>> its own app. We could probably pull out
the os
>> >> >>> process/query_servers
>> >> >>> >> >>> as well as the os daemons and friends.
Once done we need to
>> >> update
>> >> >>> the
>> >> >>> >> >>> supervision trees so we don’t have
things like couch starting
>> >> and
>> >> >>> >> >>> managing the replication manager process.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> ddoc_cache -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> Wire this up in couch_httpd_db to actually be used. Right now it’s only
>> >> >>> >> >>> used in chttpd.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> couch_file upgrade -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> The revert to remove the second updater_fd from each #db{} record
>> >> >>> >> >>> means that we’re back in the original position of files appearing to
>> >> >>> >> >>> slow down significantly under load. Since the initial hammer approach
>> >> >>> >> >>> of just adding a second fd, we’ve discovered that the underlying bug
>> >> >>> >> >>> is due to the way that message passing works combined with Erlang’s
>> >> >>> >> >>> file io. More significantly, the fix is rather simple to implement. A
>> >> >>> >> >>> first draft of this work is on an old branch of mine here:
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>>   https://github.com/davisp/couchdb/commit/d856878
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> finish the size calculating changes -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> The #leaf{} record change is to enable us to add more data size
>> >> >>> >> >>> calculations. CouchDB master calculates a data size that accounts for
>> >> >>> >> >>> all bytes that are active in a .couch file. Cloudant is interested in
>> >> >>> >> >>> the total size of uncompressed docs and attachments minus the internal
>> >> >>> >> >>> overhead of btrees. And there’s a fourth number to calculate based on
>> >> >>> >> >>> the compression level used. Having each of these numbers will be
>> >> >>> >> >>> useful as well as the calculations they’ll enable (i.e., dead bytes in
>> >> >>> >> >>> file, bytes used for overhead, compression ratio achieved, etc).
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> couch_proc_manager -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> We need to implement the hard ceiling
for capping the number
>> of
>> >> OS
>> >> >>> >> >>> processes. We’ve started seeing a need
for this at Cloudant
>> with
>> >> >>> some
>> >> >>> >> >>> work loads so motivation to fix this
is high. The only
>> failing
>> >> >>> etap is
>> >> >>> >> >>> the assertion of this ceiling.
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> Synchronous db delete on Windows -
>> >> >>> >> >>>
>> >> >>> >> >>>
>> >> >>> >> >>> I did this because running the test suite
was driving me
>> >> bonkers. I
>> >> >>> >> >>> need to ask Dave about how this behaves
on Windows (my guess
>> is
>> >> not
>> >> >>> >> >>> well) but I think we can close things
up so that it works
>> better
>> >> >>> than
>> >> >>> >> >>> the status quo.
>> >> >>> >> >>
>> >> >>> >> >
>> >> >>> >> >
>> >> >>> >> >
>> >> >>> >> > --
>> >> >>> >> > Iris Couch
>> >> >>> >>
>> >> >>> >
>> >> >>> >
>> >> >>> >
>> >> >>> > --
>> >> >>> > NS
>> >> >>>
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> NS
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > NS
>> >>
>> >
>> >
>> >
>> > --
>> > NS
>>
>
>
>
> --
> NS
