couchdb-dev mailing list archives

From Noah Slater <nsla...@apache.org>
Subject Re: IP clearance (Was: Re: [VOTE] Merge BigCouch)
Date Thu, 16 May 2013 12:37:37 GMT
git help svn


On 16 May 2013 13:13, Robert Newson <rnewson@apache.org> wrote:

> Righto. Now to remember how subversion works...
>
> On 15 May 2013 17:09, Noah Slater <nslater@apache.org> wrote:
> > Okay.
> >
> > Start here:
> >
> > http://incubator.apache.org/ip-clearance/
> >
> > Then make a copy of this file:
> >
> >
> > http://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/ip-clearance-template.xml
> >
> > This file, when rendered to HTML will look like:
> >
> > http://incubator.apache.org/ip-clearance/ip-clearance-template.html
> >
> > In your local copy, cut everything from:
> >
> >       <pre>-----8-&lt;---- cut here -------8-&lt;---- cut here
> > -------8-&lt;---- cut here-------8-&lt;----</pre>
> >
> > To:
> >
> >       <pre>-----8-&lt;---- cut here -------8-&lt;---- cut here
> > -------8-&lt;---- cut here-------8-&lt;----</pre>
> >
> > Now, add your copy back to Subversion here:
> >
> >
> > http://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/
> >
> > Call it "couchdb-bigcouch.xml".
> >
> > In a few minutes, this will appear here:
> >
> > http://incubator.apache.org/ip-clearance/couchdb-bigcouch.html
> >
> > Now, it should be a simple matter of going through the doc and completing
> > the checkpoints/sections.
> >
> > Here are the two previous ones we've done:
> >
> > http://incubator.apache.org/ip-clearance/couchdb-docs.html
> >
> > http://incubator.apache.org/ip-clearance/couchdb-fauxton.html
> >
> > Let me know if you get stuck on any of the checkpoints.
> >
> > Once you're done, let me know, and I will use my member karma to push it
> > through the Incubator.
> >
> > Benoit, you may as well start your rcouch stuff at the same time using
> > these instructions. Obviously, you should pick "couchdb-rcouch.xml"
> > instead. But other than that, it's the same process.
> >
> > On 15 May 2013 16:24, Noah Slater <nslater@apache.org> wrote:
> >
> >> I can help! :)
> >>
> >>
> >> On 15 May 2013 16:23, Robert Newson <rnewson@apache.org> wrote:
> >>
> >>> :)
> >>>
> >>> Jan, I think you said you'd help start the IP clearance bit?
> >>>
> >>> On 15 May 2013 15:03, Noah Slater <nslater@apache.org> wrote:
> >>> > PARTY TIME 🎉
> >>> >
> >>> >
> >>> > On 15 May 2013 10:40, Robert Newson <rnewson@apache.org> wrote:
> >>> >
> >>> >> Thanks everyone.
> >>> >>
> >>> >> The tally is;
> >>> >>
> >>> >> 13 +1's
> >>> >>
> >>> >> The vote passes. We'll now move on to IP clearance. Once that's done
> >>> >> the work will arrive on a feature branch in our main git repository.
> >>> >>
> >>> >> B.
> >>> >>
> >>> >>
> >>> >> On 13 May 2013 04:31, Jason Smith <jhs@iriscouch.com> wrote:
> >>> >> > Sorry, just catching up.
> >>> >> >
> >>> >> > +1
> >>> >> >
> >>> >> > On Fri, May 10, 2013 at 4:29 PM, Jan Lehnardt <jan@apache.org> wrote:
> >>> >> >> +1
> >>> >> >>
> >>> >> >> Jan
> >>> >> >> --
> >>> >> >>
> >>> >> >> On May 7, 2013, at 21:34 , Robert Newson <rnewson@apache.org> wrote:
> >>> >> >>
> >>> >> >>> Hi All,
> >>> >> >>>
> >>> >> >>> I propose to merge in the following work,
> >>> >> >>> https://github.com/rnewson/couchdb/tree/nebraska-merge-candidate
> >>> >> >>> to the official Apache CouchDB repository to a new branch (i.e.,
> >>> >> >>> *not* master). Once there, the full CouchDB developer community
> >>> >> >>> can begin the work to incorporate the code here into an official
> >>> >> >>> release.
> >>> >> >>>
> >>> >> >>> You do not need to respond if you are in agreement. If there is
> >>> >> >>> no response in 72 hours, I will assume lazy consensus. If we
> >>> >> >>> reach consensus, I will start the IP clearance process and then
> >>> >> >>> the merge.
> >>> >> >>>
> >>> >> >>> As most of you know, Paul Davis and I recently sequestered
> >>> >> >>> ourselves away from society (in a place called Nebraska) to make
> >>> >> >>> this merge happen. I want to clarify that this work is not the
> >>> >> >>> BigCouch code you can see on github.com/cloudant/bigcouch but
> >>> >> >>> the Cloudant platform from which BigCouch was made. This means
> >>> >> >>> it is bang up to date with all the bug fixes and feature
> >>> >> >>> enhancements we've made in the last eighteen months or more.
> >>> >> >>> With that clarification made, here are our notes about what we
> >>> >> >>> achieved, what it means to the project and what isn't yet done;
> >>> >> >>>
> >>> >> >>> Nebraska Merge Roundup
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Stats:
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> 1402 - total new commits
> >>> >> >>>
> >>> >> >>> 312 - commits written during the merge (will be reduced
> >>> >> >>> substantially by squashing)
> >>> >> >>>
> >>> >> >>> 408 - number of files changed
> >>> >> >>>
> >>> >> >>> 21,897 - number of lines added
> >>> >> >>>
> >>> >> >>> 4,277 - number of lines removed
> >>> >> >>>
> >>> >> >>> A retrospective:
> >>> >> >>>
> >>> >> >>> Bob Newson and I have come to the end of our merge sprint on
> >>> >> >>> getting BigCouch merged into Apache CouchDB. It's been a
> >>> >> >>> productive ten days here in the midwest. I managed to get Bob
> >>> >> >>> out to a bowling alley and he managed to get me to a sushi
> >>> >> >>> restaurant. In between the cultural exchanges we’ve also managed
> >>> >> >>> to get a significant amount of work done on the merging as well.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> The current status of the merge is that we’ve managed to resolve
> >>> >> >>> the differences in the single node execution of CouchDB. Both
> >>> >> >>> the JavaScript and Erlang test suites run with only one failure
> >>> >> >>> in the Erlang test suite due to a (deliberately) missing
> >>> >> >>> constraint on the number of operating system processes. This
> >>> >> >>> should be a relatively straightforward fix but was not
> >>> >> >>> prioritized during our limited time to work on the larger
> >>> >> >>> issues.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> We merged a large number of performance and stability
> >>> >> >>> enhancements back into single node CouchDB as well as a number
> >>> >> >>> of pure bug fixes. The biggest highlight is a brand new
> >>> >> >>> compactor that is both faster and creates smaller and better
> >>> >> >>> organized post-compaction databases.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> The current status of the merge is that single node operations
> >>> >> >>> should be completely unaffected as demonstrated by the test
> >>> >> >>> suite passing. On the other hand we haven’t yet finished getting
> >>> >> >>> the clustered code merged to use some of the new changes in
> >>> >> >>> single node CouchDB. The single most significant portion of this
> >>> >> >>> work involves updates to the internal cluster API for views to
> >>> >> >>> use the recently rewritten indexer APIs. This should be a
> >>> >> >>> relatively straightforward bit of work that we’ll be finishing
> >>> >> >>> over the next few weeks.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> All in all the merge work done so far has been quite successful.
> >>> >> >>> We’ve met our primary goal of getting the code merged in a
> >>> >> >>> fashion that does not affect single node operation while
> >>> >> >>> providing a starting point for the larger community to start
> >>> >> >>> reviewing the more significant changes made. Given the size of
> >>> >> >>> the diff between the two code bases we never expected to have a
> >>> >> >>> fully working clustered solution after ten days of work but we
> >>> >> >>> have succeeded in providing a base of work that will allow us
> >>> >> >>> and new contributors to get up to speed quickly.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> This work, coupled with work by Dave Cottlehuber and Benoît
> >>> >> >>> Chesneau on updating the build system and various other internal
> >>> >> >>> updates, will provide a solid foundation for work going forward.
> >>> >> >>> It's an exciting time for CouchDB and anyone interested should
> >>> >> >>> keep an eye on the next few releases as we ramp up work on
> >>> >> >>> various core aspects of the database.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> We’ve had an exciting few days working to prepare the road for
> >>> >> >>> an exciting next twelve to eighteen months. We hope that
> >>> >> >>> everyone will feel as excited as we do about the next twelve to
> >>> >> >>> eighteen months for Apache CouchDB. It should be an exciting
> >>> >> >>> ride.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Things we got done
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> * Large update to the source tree layout for Erlang
> >>> >> >>> applications. Each application now has a
> >>> >> >>> src/appname/(c_src|ebin|priv|src) structure. The build system
> >>> >> >>> has been updated.
> >>> >> >>>
> >>> >> >>> * Renamed src/couchdb to src/couch to match the Erlang
> >>> >> >>> convention of the top directory name matching the Erlang
> >>> >> >>> application name.
> >>> >> >>>
> >>> >> >>> * Imported Cloudant Erlang applications for clustered CouchDB.
> >>> >> >>> These are imported with their history by using git subtree and
> >>> >> >>> merging the top level commit. These are not external deps,
> >>> >> >>> development will happen within the CouchDB tree. The imported
> >>> >> >>> apps are:
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>   * config - A couch_config replacement (Behavior is mostly
> >>> >> >>> identical to couch_config except how we listen for configuration
> >>> >> >>> changes internally to allow for smooth hot code upgrade).
> >>> >> >>>
> >>> >> >>>   * twig - An rsyslog source replacement for couch_log.
> >>> >> >>>
> >>> >> >>>   * rexi - An RPC library. Replaces Erlang’s built-in rex
> >>> >> >>> application to avoid costly safety measures in the interest of
> >>> >> >>> performance and throughput.
> >>> >> >>>
> >>> >> >>>   * mem3 - The “Dynamo” part of BigCouch responsible for
> >>> >> >>> managing cluster state
> >>> >> >>>
> >>> >> >>>   * fabric - The internal cluster-aware CouchDB API
> >>> >> >>>
> >>> >> >>>   * ets_lru - A small library application that provides an LRU
> >>> >> >>> implementation using a couple of ets tables.
> >>> >> >>>
> >>> >> >>>   * ddoc_cache - Caches design documents on each node for use in
> >>> >> >>> design handler functions. This uses an ets_lru cache with a very
> >>> >> >>> short TTL.
> >>> >> >>>
> >>> >> >>>   * chttpd - The cluster aware HTTP layer
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Each imported app also had its build system updated to use
> >>> >> >>> Autotools along with the necessary updates noted above for the
> >>> >> >>> new application layouts for existing CouchDB Erlang apps.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> * Merged a large amount of updates and fixes to
> >>> >> >>> couch_replicator based on work done internally at Cloudant.
> >>> >> >>> Unfortunately due to an error when we created our internal clone
> >>> >> >>> we lost a bit of history in some of the initial merge and have a
> >>> >> >>> big commit that affects couch_replicator_manager mostly. There
> >>> >> >>> are a number of other commits related to couch_replicator that
> >>> >> >>> resolve the single node vs. clustered differences. Some
> >>> >> >>> noticeable couch_replicator features:
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>   * Optionally disable checkpoints so that replication can work
> >>> >> >>> when a source is read only. This should only be used for smaller
> >>> >> >>> databases as each replication call has to scan the entire source
> >>> >> >>> database on each invocation.
> >>> >> >>>
> >>> >> >>>   * A new changes_pending field in the _active_tasks output
> >>> >> >>>
> >>> >> >>>   * A fix to the continuous replication to automatically
> >>> >> >>> reconnect to a continuous changes feed when it sees a last_seq
> >>> >> >>> value. This allows for the source to selectively recycle the
> >>> >> >>> HTTP connections used which can be quite useful for “permanent”
> >>> >> >>> replications.
> >>> >> >>>
> >>> >> >>>   * A multitude of smaller bug fixes and stability enhancements.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Updates to single node couch:
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> * We changed the by_seq tree to store a copy of the
> >>> >> >>> #full_doc_info{} record instead of the #doc_info{} record. This
> >>> >> >>> gives significant speed improvements for compaction and
> >>> >> >>> replication and generally anything that needs to walk the by_seq
> >>> >> >>> tree and access document bodies internally.
> >>> >> >>>
> >>> >> >>> * We rewrote the compactor to be significantly faster and to
> >>> >> >>> produce significantly better compacted databases. The two main
> >>> >> >>> halves are to use a temp file and replace the use of btrees in
> >>> >> >>> the temp file. The temp file only contains a temporary copy of
> >>> >> >>> the document ids. At the end of a compaction run we then rebuild
> >>> >> >>> the by_id btree in the compaction file from this temp file. The
> >>> >> >>> reason this helps so much is that the compaction is based on the
> >>> >> >>> update_seq btree, which for most cases means that the id tree is
> >>> >> >>> updated in roughly random order which is very bad for our append
> >>> >> >>> only btrees. By using the tmp file we can stream it in order
> >>> >> >>> back into the compacted db file at the end of compacting,
> >>> >> >>> generating a minimum amount of garbage in the process. The other
> >>> >> >>> upgrade was to implement an external merge sort module
> >>> >> >>> (couch_emsort) that is used with this temporary file.
> >>> >> >>> * Reject updates to design docs that introduce updates that
> >>> >> >>> break compilation for source code. Currently we only check map
> >>> >> >>> and reduce calls as the other should provide user visible errors
> >>> >> >>> instead of inexplicably empty views.
> >>> >> >>>
> >>> >> >>> because my OCD kicked in and I was unable to resist.
> >>> >> >>>
> >>> >> >>> * Reverted a change made a long time ago that uses two file
> >>> >> >>> descriptors for each database. See the todo list.
> >>> >> >>>
> >>> >> >>> * The reason to remove the second fd is so that we can rewrite
> >>> >> >>> ref counting. Better ref counting makes everyone happy, but the
> >>> >> >>> real reason is for this next bullet point:
> >>> >> >>>
> >>> >> >>> * Optimize couch_server to not require a round trip message pass
> >>> >> >>> for opening a database that’s in the LRU. This is a significant
> >>> >> >>> performance boost for high concurrency access. We also optimized
> >>> >> >>> couch_server internals to not blow up when it’s under load.
> >>> >> >>>
> >>> >> >>> * Introduce a #leaf{} record into the revision trees. This is
> >>> >> >>> never written to disk but makes internal code a lot cleaner when
> >>> >> >>> dealing with multiple versions of rev tree values.
> >>> >> >>>
> >>> >> >>> * Some changes to couch_changes to enable clustered access. Also
> >>> >> >>> some general cleanup
> >>> >> >>>
> >>> >> >>> * Internal changes to how CouchDB is booted in Erlang land. Not
> >>> >> >>> very sexy but this removes a lot of complicated un-Erlangy bits.
> >>> >> >>> We still have a bit of work left here.
> >>> >> >>>
> >>> >> >>> * btree chunk sizes are now configurable which can allow people
> >>> >> >>> to adjust the RAM/speed tradeoffs a bit more.
> >>> >> >>>
> >>> >> >>> * We now load update validation functions on the first write.
> >>> >> >>> This is a cluster-motivated change because the clustered version
> >>> >> >>> of this call is expensive and can lead to race conditions when
> >>> >> >>> opening a bunch of db shards simultaneously. This should be
> >>> >> >>> invisible to external clients.
> >>> >> >>>
> >>> >> >>> * Disabled conflict detection for local docs. They don’t
> >>> >> >>> replicate so there’s no point. This just led to clusters getting
> >>> >> >>> stuck and confused when there were lots of replications
> >>> >> >>> happening.
> >>> >> >>>
> >>> >> >>> * Changes to the multipart/mime parsing code. Necessary for
> >>> >> >>> clustered attachment uploads to split the incoming data stream
> >>> >> >>> into N copies.
> >>> >> >>>
> >>> >> >>> * Don’t use init:restart/0 when reloading the ICU driver. I
> >>> >> >>> think this has a bug. But we should rewrite this driver to be a
> >>> >> >>> NIF anyway.
> >>> >> >>>
> >>> >> >>> * New couch OS process manager. Significantly faster access to
> >>> >> >>> OS processes under heavy load. This replaces the hard limit with
> >>> >> >>> a soft limit. Processes spawned over the soft limit will be used
> >>> >> >>> until they’ve sat idle for a few minutes and then be closed. We
> >>> >> >>> have a todo item to add the hard ceiling back in (while keeping
> >>> >> >>> the soft ceiling).
> >>> >> >>>
> >>> >> >>> * Automatically replace some easily identifiable JS reductions
> >>> >> >>> with their builtin counterparts. Uses a regex to do the
> >>> >> >>> detection so it's not too smart.
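
As a rough picture of what regex-based detection of this kind can look like, here is a small Python sketch. The pattern and the maybe_builtin helper are hypothetical and written only for this note; the actual expressions used in the merged code are not shown in this thread. The sketch recognises only the plainest spelling of a summing reduce and maps it to the _sum builtin.

```python
import re

# Hypothetical pattern: matches e.g. "function(keys, values) { return sum(values); }"
# with an optional third (rereduce) argument. Anything fancier is left untouched.
_SUM_RE = re.compile(
    r"^\s*function\s*\(\s*\w+\s*,\s*(\w+)\s*(?:,\s*\w+\s*)?\)\s*"
    r"\{\s*return\s+sum\s*\(\s*\1\s*\)\s*;?\s*\}\s*$"
)


def maybe_builtin(reduce_src: str) -> str:
    """Return "_sum" for a trivially recognisable JS sum reduce, else the source unchanged."""
    if _SUM_RE.match(reduce_src):
        return "_sum"
    return reduce_src
```

A purely textual match like this misses any equivalent function written differently, which fits the "not too smart" caveat above.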
> >>> >> >>>
> >>> >> >>> * Improved view updater write batch.
> >>> >> >>>
> >>> >> >>> * Updates to couchjs’ views.js to improve index update speeds
> >>> >> >>>
> >>> >> >>> * Updates to the _stats builtin reduce to allow reduces to work
> >>> >> >>> over emitted stats objects. Sometimes clients have summary data
> >>> >> >>> in a doc, and this allows them to combine stats if they follow
> >>> >> >>> the same pattern as the builtin expects.
> >>> >> >>>
> >>> >> >>> * Added a config:reload() that is accessible by POST’ing to
> >>> >> >>> _config/_reload. Used by the JS tests to reset the config to
> >>> >> >>> what's on disk. This should prevent those test run failures
> >>> >> >>> where a test fails leaving the config in a bad state causing all
> >>> >> >>> subsequent tests to fail. I think. Maybe.
> >>> >> >>>
> >>> >> >>> * Databases are deleted synchronously in the test suite. We may
> >>> >> >>> need to address this on Windows. But it does seem to reduce the
> >>> >> >>> number of “{error, file_exists}” failures.
> >>> >> >>>
> >>> >> >>> * I reimplemented the JS restartServer() function. There’s a new
> >>> >> >>> _restart/token URL that will give a unique value for each
> >>> >> >>> instance of the Erlang VM. To run a restart we grab the current
> >>> >> >>> token value, hit _restart, then wait till we get a successful
> >>> >> >>> response with a different token. This appears to have made the
> >>> >> >>> restart strategy more robust.
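
The restart strategy in the bullet above (read the token, trigger the restart, poll until a different token comes back) can be sketched in Python as follows. This is not the JS test-suite code; restart_and_wait, its parameters, and the choice to pass the two HTTP calls in as plain callables are assumptions made so the sketch stays self-contained. A real caller would have get_token fetch _restart/token and trigger_restart hit _restart.

```python
import time
from typing import Callable


def restart_and_wait(get_token: Callable[[], str],
                     trigger_restart: Callable[[], None],
                     timeout: float = 30.0,
                     interval: float = 0.1) -> str:
    """Restart the server and block until a new VM instance answers.

    get_token and trigger_restart are hypothetical callables supplied by
    the caller; get_token may raise OSError while the server is down.
    """
    old_token = get_token()      # token of the VM that is about to go away
    trigger_restart()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            new_token = get_token()
        except OSError:
            # Connection refused or reset during the restart; keep polling.
            new_token = old_token
        if new_token != old_token:
            return new_token     # a different token means a fresh VM instance
        time.sleep(interval)
    raise TimeoutError("server did not come back with a new token")
```

Comparing tokens, rather than only waiting for a successful response, avoids mistaking a reply from the old instance that has not yet shut down for a completed restart.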
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Things that need doing
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> IP Clearance -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> We’ll need to track down if we have the CCLA as well as look at
> >>> >> >>> each source file added to make sure each one is strictly from
> >>> >> >>> Cloudant or has an amenable license. I’m pretty sure that the
> >>> >> >>> only one of interest is trunc_io.erl but we need to be thorough.
> >>> >> >>>
> >>> >> >>> documentation -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> There shouldn’t be much here since the entire point of this
> >>> >> >>> merge was to not change the visible behavior of single node
> >>> >> >>> couch. A few things to add about the testing endpoints. Maybe an
> >>> >> >>> update to the compaction section to mention the two new file
> >>> >> >>> names used.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Copyright notices -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> We need to strip out copyright notices from individual files and
> >>> >> >>> make sure all files have a standard Apache License v2 header.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> clustered vhosts -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> We’ve never implemented this at Cloudant. We either need to
> >>> >> >>> write a cluster or go back and tell people to use HAProxy (or
> >>> >> >>> similar) for such things.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> twig -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> We need to add another output type to twig that is configurable
> >>> >> >>> in some manner. Right now we spit out entire rsyslog records
> >>> >> >>> which isn’t useful for most people. We’ll need to implement the
> >>> >> >>> file writer from couch_log as well as update the _log HTTP
> >>> >> >>> handler to know when it can and can’t expect to find data on
> >>> >> >>> disk.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> fabric -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> This is going to need a lot of work. Specifically view access is
> >>> >> >>> going to need to be updated to work with couch_mrview and
> >>> >> >>> friends.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Boot a dev cluster -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Once we fix up the clustering code we’ll need to write
> >>> >> >>> instructions and scripts for pulling up a dev cluster.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> OTP stuff -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> We’ve updated each app but we still need to pull some parts out
> >>> >> >>> of couchdb into their own application. Specifically the HTTP
> >>> >> >>> layer needs its own app. We could probably pull out the os
> >>> >> >>> process/query_servers as well as the os daemons and friends.
> >>> >> >>> Once done we need to update the supervision trees so we don’t
> >>> >> >>> have things like couch starting and managing the replication
> >>> >> >>> manager process.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> ddoc_cache -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Wire this up in couch_httpd_db to actually be used. Right now
> >>> >> >>> it's only used in chttpd.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> couch_file upgrade -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> The revert to remove the second updater_fd from each #db{}
> >>> >> >>> record means that we’re back in the original position of files
> >>> >> >>> appearing to slow down significantly under load. Since the
> >>> >> >>> initial hammer approach of just adding a second fd we’ve since
> >>> >> >>> discovered that the underlying bug is due to the way that
> >>> >> >>> message passing works combined with Erlang’s file io.
> >>> >> >>> Significant, though, is the fact that the fix is rather simple
> >>> >> >>> to implement. A first draft of this work is on an old branch of
> >>> >> >>> mine here:
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>   https://github.com/davisp/couchdb/commit/d856878
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> finish the size calculating changes -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> The #leaf{} record change is to enable us to add more data size
> >>> >> >>> calculations. CouchDB master calculates a data size that
> >>> >> >>> accounts for all bytes that are active in a .couch file.
> >>> >> >>> Cloudant is interested in the total size of uncompressed docs
> >>> >> >>> and attachments minus the internal overhead of btrees. And
> >>> >> >>> there’s a fourth number to calculate based on the compression
> >>> >> >>> level used. Having each of these numbers will be useful as well
> >>> >> >>> as the calculations they’ll enable (i.e., dead bytes in file,
> >>> >> >>> bytes used for overhead, compression ratio achieved, etc).
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> couch_proc_manager -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> We need to implement the hard ceiling for capping the number of
> >>> >> >>> OS processes. We’ve started seeing a need for this at Cloudant
> >>> >> >>> with some work loads so motivation to fix this is high. The only
> >>> >> >>> failing etap is the assertion of this ceiling.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Synchronous db delete on Windows -
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> I did this because running the test suite was driving me
> >>> >> >>> bonkers. I need to ask Dave about how this behaves on Windows
> >>> >> >>> (my guess is not well) but I think we can close things up so
> >>> >> >>> that it works better than the status quo.
> >>> >> >>
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> > --
> >>> >> > Iris Couch
> >>> >>
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > NS
> >>>
> >>
> >>
> >>
> >> --
> >> NS
> >>
> >
> >
> >
> > --
> > NS
>



-- 
NS
