couchdb-commits mailing list archives

From vatam...@apache.org
Subject [couchdb] 03/04: [fixup] Fix README typos
Date Tue, 18 Apr 2017 06:29:12 GMT
This is an automated email from the ASF dual-hosted git repository.

vatamane pushed a commit to branch 63012-scheduler
in repository https://gitbox.apache.org/repos/asf/couchdb.git

commit 9aaa959e02e2fa46e93fda8d155aa5645afb0200
Author: Nick Vatamaniuc <vatamane@apache.org>
AuthorDate: Tue Apr 18 02:24:31 2017 -0400

    [fixup] Fix README typos
    
     Thanks, Paul and Ben!
---
 src/couch_replicator/README.md | 229 ++++++++++++++++++++---------------------
 1 file changed, 113 insertions(+), 116 deletions(-)

diff --git a/src/couch_replicator/README.md b/src/couch_replicator/README.md
index 32a58be..f08ff35 100644
--- a/src/couch_replicator/README.md
+++ b/src/couch_replicator/README.md
@@ -5,18 +5,18 @@ This description of scheduling replicator's functionality is mainly geared to
 CouchDB developers. It dives a bit into the internal and explains how
 everything is connected together.
 
-A natural place to start is the top applicatin supervisor:
-`couch_replicator_sup`. It's a `rest_for_one` so if a child process
-terminates, the rest of the childred in the hierarchy following it are also
-terminated. This structure implies a useful constraint -- children lower in
-the list can safely call their siblings which are higher in the list.
+A natural place to start is the top application supervisor:
+`couch_replicator_sup`. It's a `rest_for_one` so if a child process terminates,
+the rest of the children in the hierarchy following it are also terminated.
+This structure implies a useful constraint -- children lower in the list can
+safely call their siblings which are higher in the list.
 
 A description of each child:
 
  * `couch_replication_event`: Starts a gen_event publication bus to handle some
     replication related events. This used for example, to publish cluster
     membership changes by the `couch_replicator_clustering` process. But is
-    also used in replication tests to minotor for replication events.
+    also used in replication tests to monitor for replication events.
     Notification is performed via the `couch_replicator_notifier:notify/1`
     function. It's the first (left-most) child because
     `couch_replicator_clustering` uses it.
@@ -43,10 +43,11 @@ A description of each child:
     `replicator.connection_close_interval` milliseconds in case another
     replication task wants to re-use it. It is worth pointing out how linking
     and monitoring is handled: Workers are linked to the connection pool when
-    they are created. If they crash connection pool listens for the EXIT event
-    and cleans up. Connection pool also monitors owners (by monitoring the the
-    `Pid` from the `From` argument in the call to `acquire/1`) and cleans up if
-    owner dies. Another interesting thing is that connection establishment
+    they are created. If they crash, the connection pool will receive an 'EXIT'
+    event and clean up after the worker. The connection pool also monitors
+    owners (by monitoring the `Pid` from the `From` argument in the call to
+    `acquire/1`) and cleans up if owner dies, and the pool receives a 'DOWN'
+    message. Another interesting thing is that connection establishment
     (creation) happens in the owner process so the pool is not blocked on it.
 
  * `couch_replicator_rate_limiter` : Implements a rate limiter to handle
@@ -55,47 +56,44 @@ A description of each child:
     control algorithm to converge on the channel capacity. Implemented using a
     16-way sharded ETS table to maintain connection state. The table sharding
     code is split out to `couch_replicator_rate_limiter_tables` module. The
-    main idea of the module it so maintain and continually estimate an interval
-    for each connection represented by the `{Method, Url}`. The interval is
-    updated accordingly on each call to `failure/1` or `success/1` calls. A
-    `failure/1` is supposed to be called after a 429 is received and
-    `success/1` when a successful request has been made. Also when no failures
-    are happening the code is ensuring the ETS tables are empty in order to
-    have a lower impact on a running system.
-
- * `couch_replicator_scheduler` : Scheduler is the core component of the
-    scheduling replicator. It allows handling a larger number of jobs than
-    might be possible to actively run on the cluster. It accomplishes this by
-    switching between jobs (stopping some and starting others) to ensure all
-    make progress. Replication jobs which fail are penalized using exponential
-    backoff. That is, each consecutive failure will double the time penalty.
-    This frees up system resources for more useful work than just continuously
-    trying to run the same subset of failing jobs.
-
-    The main API function is `add_job/1`. Its argument is an instance of
-    `#rep{}` record, which could also be the result of a document update from a
-    _replicator db or it could be the result of a POST to `_replicate`
-    endpoint. Once the replication job is added to the scheduler it doesn't
-    matter much where it originated.
+    purpose of the module is to maintain and continually estimate sleep
+    intervals for each connection represented as a `{Method, Url}` pair. The
+    interval is updated accordingly on each call to `failure/1` or `success/1`.
+    For a successful request, a client should call `success/1`. Whenever
+    a 429 response is received the client should call `failure/1`. When no
+    failures are happening the code is ensuring the ETS tables are empty in
+    order to have a lower impact on a running system.
+
+ * `couch_replicator_scheduler` : This is the core component of the scheduling
+    replicator. Its main task is to switch between replication jobs, by
+    stopping some and starting others to ensure all of them make progress.
+    Replication jobs which fail are penalized using an exponential backoff.
+    That is, each consecutive failure will double the time penalty. This frees
+    up system resources for more useful work than just continuously trying to
+    run the same subset of failing jobs.
+
+    The main API function is `add_job/1`. Its argument is an instance of the
+    `#rep{}` record, which could be the result of a document update from a
+    `_replicator` db or the result of a POST to `_replicate` endpoint.
 
     Each job internally is represented by the `#job{}` record. It contains the
-    original `#rep{}` but also, among a few other things, maintain an event
-    history. The history maintains a sequence of events of each job. These are
-    timestamped and ordered such that the most recent event is at the head.
-    History length is limited based on the `replicator.max_history` config
-    value. The default is 20 entries. History events types are:
+    original `#rep{}` but also maintains an event history. The history is a
+    sequence of past events for each job. These are timestamped and ordered
+    such that the most recent event is at the head. History length is limited
+    based on the `replicator.max_history` configuration value. The default is
+    20 entries. History event types are:
 
     * `added` : job was just added to the scheduler. This is the first event.
     * `started` : job was started. This was an attempt to run the job.
     * `stopped` : job was stopped by the scheduler.
     * `crashed` : job has crashed (instead of stopping cleanly).
 
-    The core of the algorithm is the `reschedule/1` function. That function is
-    called every `replicator.interval` milliseconds (default is 60000 i.e. a
-    minute). During each call scheduler will try to stop some jobs, start some
-    new ones and will also try to keep the maximum amount of jobs running less
-    than `replicator.max_jobs` (deafult is 500). So the functions does these
-    operations (actual code paste):
+    The core of the scheduling algorithm is the `reschedule/1` function. This
+    function is called every `replicator.interval` milliseconds (default is
+    60000 i.e. a minute). During each call the scheduler will try to stop some
+    jobs, start some new ones and will also try to keep the maximum number of
+    jobs running less than `replicator.max_jobs` (default 500). So the
+    function does these operations (actual code paste):
 
     ```
     Running = running_job_count(),
@@ -106,17 +104,17 @@ A description of each child:
     update_running_jobs_stats(State#state.stats_pid)
     ```
 
-    `Running` is gathering the total number of currently runnig jobs. `Pending`
-    is the total number of jobs waiting to be run. `stop_excess_jobs` will stop
-    any exceeding `replicator.max_jobs` configured limit. This code takes
-    effect if user reduces `max_jobs` configuration value. `start_pending_jobs`
-    will start any jobs if there is more room available. This will take effect
-    on startup or when user increases `max_jobs` configuration value.
-    `rotate_jobs` is where all the action happens. There scheduler picks
-    `replicator.max_churn` running jobs to stop and then picks the same number
-    of pending jobs to start. The default value of `max_churn` is 20. So by
-    default every minute, 20 running jobs are stopped, and 20 new pending jobs
-    are started.
+    `Running` is the total number of currently running jobs. `Pending` is the
+    total number of jobs waiting to be run. `stop_excess_jobs` will stop any
+    exceeding the `replicator.max_jobs` configured limit. This code takes
+    effect if user reduces the `max_jobs` configuration value.
+    `start_pending_jobs` will start any jobs if there is more room available.
+    This will take effect on startup or when user increases the `max_jobs`
+    configuration value. `rotate_jobs` is where all the action happens. The
+    scheduler picks `replicator.max_churn` running jobs to stop and then picks
+    the same number of pending jobs to start. The default value of `max_churn`
+    is 20. So by default every minute, 20 running jobs are stopped, and 20 new
+    pending jobs are started.
 
     Before moving on it is worth pointing out that scheduler treats continuous
     and non-continuous replications differently. Normal (non-continuous)
@@ -124,11 +122,11 @@ A description of each child:
     behavior is to preserve their semantics of replicating a snapshot of the
     source database to the target. For example if new documents are added to
     the source after the replication are started, those updates should not show
-    up on the target database. Stopping and restring a normal replication would
-    violate that constraint. The only exception to the rule is the user
+    up on the target database. Stopping and restarting a normal replication
+    would violate that constraint. The only exception to the rule is the user
     explicitly reduces `replicator.max_jobs` configuration value. Even then
     scheduler will first attempt to stop as many continuous jobs as possible
-    and only if it has no choice left, it will stop normal jobs.
+    and only if it has no choice left will it stop normal jobs.
 
     Keeping that in mind and going back to the scheduling algorithm, the next
     interesting part is how the scheduler picks which jobs to stop and which
@@ -138,9 +136,9 @@ A description of each child:
       running continuous jobs first. The sorting callback function to get the
       longest running jobs is unsurprisingly called `longest_running/2`. To
       pick the longest running jobs it looks at the most recent `started`
-      event. After it gets a sorted list by longest running, it simply
-      picks first few depending on the value of `max_churn` using
-      `lists:sublist/2`. Then those jobs are stopped.
+      event. After it gets a sorted list by longest running, it simply picks
+      first few depending on the value of `max_churn` using `lists:sublist/2`.
+      Then those jobs are stopped.
 
     * Starting: When starting the scheduler will pick the jobs which have been
       waiting the longest. Surprisingly, in this case it also looks at the
@@ -176,74 +174,75 @@ A description of each child:
 
     There is subtlety when calculating consecutive crashes and that is deciding
     when the sequence stops. That is, figuring out when a job becomes healthy
-    again. Scheduler considers a job healthy again if it started and hasn't
+    again. The scheduler considers a job healthy again if it started and hasn't
     crashed in a while. The "in a while" part is a configuration parameter
     `replicator.health_threshold` defaulting to 2 minutes. This means if job
     has been crashing, for example 5 times in a row, but then on the 6th
     attempt it started and ran for more than 2 minutes then it is considered
-    healthy again. Next time it crashes its sequence of consecutive crashes
+    healthy again. The next time it crashes its sequence of consecutive crashes
     will restart at 1.
 
  * `couch_replicator_scheduler_sup`: This module is a supervisor for running
    replication tasks. The most interesting thing about it is perhaps that it is
-   not used to restart children. Scheduler handles restarts and error handling
-   backoffs.
+   not used to restart children. The scheduler handles restarts and error
+   handling backoffs.
 
- * `couch_replicator_doc_processor`: Doc procesoor component is in charge of
-   processing replication document updates, turning them into replication jobs
-   and adding those jobs to the scheduler. Unfortunately the only reason there
-   is even a `couch_replicator_doc_processor` gen_server, instead of
+ * `couch_replicator_doc_processor`: The doc processor component is in charge
+   of processing replication document updates, turning them into replication
+   jobs and adding those jobs to the scheduler. Unfortunately the only reason
+   there is even a `couch_replicator_doc_processor` gen_server, instead of
    replication documents being turned to jobs and inserted into the scheduler
-   directly, is because of one corner case - filtered replications using custom
-   (Javascript mostly) filter. More about it later. It is better to start with
-   how updates flow through the doc processor:
+   directly, is because of one corner case -- filtered replications using
+   custom (Javascript mostly) filters. More about this later. It is better to
+   start with how updates flow through the doc processor:
 
-   Document updates are coming via the `db_change/3` callback from
+   Document updates come via the `db_change/3` callback from
    `couch_multidb_changes`, then go to the `process_change/2` function.
 
    In `process_change/2` a few decisions are made regarding how to proceed. The
-   first is "ownership" checking. That is a check if replication document
-   belongs on the current node. If not, then it is ignored. Another check is to
-   see if the update has arrived during a time when the cluster is considered
-   "unstable". If so, it is ignored, because soon enough a rescan will be
-   launched and all the documents will be reprocessed anyway. Another
-   noteworthy thing in `process_change/2` is handling of upgrades from the
-   previous version of the replicator when transient states were written to the
-   documents. Two such states were `triggered` and `error`. Both of those
-   states are removed from the document then update proceeds in the regular
-   fashion. `failed` documents are also ignored here. `failed` is a terminal
-   state which indicates the document was somehow unsuitable to become a
-   replication job (it was malforemd or a duplicate). Otherwise the state
-   update proceeds to `process_updated/2`.
+   first is "ownership" check. That is a check if the replication document
+   belongs on the current node. If not, then it is ignored. In a cluster, in
+   general there would be N copies of a document change and we only want to run
+   the replication once. Another check is to see if the update has arrived
+   during a time when the cluster is considered "unstable". If so, it is
+   ignored, because soon enough a rescan will be launched and all the documents
+   will be reprocessed anyway. Another noteworthy thing in `process_change/2`
+   is handling of upgrades from the previous version of the replicator when
+   transient states were written to the documents. Two such states were
+   `triggered` and `error`. Both of those states are removed from the document
+   and then the update proceeds in the regular fashion. `failed` documents are
+   also ignored here. `failed` is a terminal state which indicates the document
+   was somehow unsuitable to become a replication job (it was malformed or a
+   duplicate). Otherwise the state update proceeds to `process_updated/2`.
 
    `process_updated/2` is where replication document updates are parsed and
    translated to `#rep{}` records. The interesting part here is that the
    replication ID isn't calculated yet. Unsurprisingly the parsing function
    used is called `parse_rep_doc_without_id/1`. Also note that up until now
-   everything is still running in the context of the `db_change/3`
-   callback. After replication filter type is determined, finally the update
-   gets passed to the `couch_replicator_doc_processor` gen_server.
+   everything is still running in the context of the `db_change/3` callback.
+   After replication filter type is determined the update gets passed to the
+   `couch_replicator_doc_processor` gen_server.
 
-   `couch_replicator_doc_processor` gen_server's main role is to try to
+   The `couch_replicator_doc_processor` gen_server's main role is to try to
    calculate replication IDs for each `#rep{}` record passed to it, then add
    that as a scheduler job. As noted before, `#rep{}` records parsed up until
    this point lack a replication ID. The reason is replication ID calculation
-   include a hash of the filter code. And because user defined replication
+   includes a hash of the filter code. And because user defined replication
    filters live in the source DB, which most likely involves a remote network
-   fetch, there is a possibility of blocking, and a need to handle various
+   fetch there is a possibility of blocking and a need to handle various
    network failures and retries. Because of that `replication_doc_processor`
-   dispatchies most of that blocking and retrying to a separate `worker`
-   process (`couch_replicator_doc_processor_worker` module).
+   dispatches all of that blocking and retrying to a separate `worker` process
+   (`couch_replicator_doc_processor_worker` module).
 
-   `couch_replicator_doc_processor_worker` is where a replication IDs are
+   `couch_replicator_doc_processor_worker` is where replication IDs are
    calculated for each individual doc update. There are two separate modules
    which contain utilities related to replication ID calculation:
    `couch_replicator_ids` and `couch_replicator_filters`. The first one
    contains ID calculation algorithms and the second one knows how to parse and
-   fetch user filters from remote source DB. One interesting thing about the
+   fetch user filters from a remote source DB. One interesting thing about the
    worker is that it is time-bounded and is guaranteed to not be stuck forever.
-   That's why it spawn an extra process with `spawn_monitor`, just so it can do
-   an `after` clause in receive and bound the maximum time this workerw will
+   That's why it spawns an extra process with `spawn_monitor`, just so it can
+   do an `after` clause in receive and bound the maximum time this worker will
    take.
 
    A doc processor worker will either succeed or fail but never block for too
@@ -257,26 +256,25 @@ A description of each child:
 
      1. Filter fetching code has failed. In that case worker returns an error.
         But because the error could be a transient network error, another
-        worker is started to try again. It could fail and return error again,
-        then another one is started and so on. However each consecutive worker
-        will do an exponential backoff, not unlike the scheduler code.
+        worker is started to try again. It could fail and return an error
+        again, then another one is started and so on. However each consecutive
+        worker will do an exponential backoff, not unlike the scheduler code.
         `error_backoff/1` is where the backoff period is calculated.
         Consecutive errors are held in the `errcnt` field in the ETS table.
 
      2. Fetchig filter code succeeds, replication ID is calculated and job is
         added to the scheduler. However, because this is a filtered replication
-        source database could get an updated filter. Which means replication ID
-        should change again. So a worker is spawned again even if worker just
-        successfully returned successfully. The purpose is to check the filter
-        and see if it changed. So in other words doc processor will to do the
-        work of checking of filtered replications, get an updated filter and
-        will then refresh the replication job (remove the old one and add a new
-        one with a different ID). Filter checking interval is determined by the
-        `filter_backoff` function. An unusual thing about that function is that
-        it calculates the period based on the size of the ETS table. The
-        intuition is when there are few replications in a cluster, it's ok
-        to check the filter for changes often. When there are lots of
-        replications running, having each one checking their filter often is
+        the source database could get an updated filter. Which means
+        replication ID could change again. So the worker is spawned to
+        periodically check the filter and see if it changed. In other words doc
+        processor will do the work of checking for filtered replications, get
+        an updated filter and will then refresh the replication job (remove the
+        old one and add a new one with a different ID). The filter checking
+        interval is determined by the `filter_backoff` function. An unusual
+        thing about that function is it calculates the period based on the size
+        of the ETS table. The idea there is for a few replications in a
+        cluster, it's ok to check filter changes often. But when there are lots
+        of replications running, having each one checking their filter often is
         not a good idea.
 
  * `couch_replicator`: This is an unusual but useful pattern. This child is not
@@ -285,11 +283,10 @@ A description of each child:
    supervisor in the correct order (and monitored for crashes). This ensures
    the local replicator db exists, then returns `ignore`. This pattern is
    useful for doing setup-like things at the top level and in the correct order
-   regaring the rest of the children in the supervisor.
+   regarding the rest of the children in the supervisor.
 
- * `couch_replicator_db_changes`: This process specializes and configure
+ * `couch_replicator_db_changes`: This process specializes and configures
    `couch_multidb_changes` so that it looks for `_replicator` suffixed shards
-   and makes sure to restart when cluster configuration changes. This restart
-   on cluster membership changes is often referred to as a "rescan".
+   and makes sure to restart it when node membership changes.
 
 
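The README text in this patch describes the `reschedule/1` cycle only in prose, so a small model may help. The following is a minimal Python sketch, not the Erlang implementation: the `Job` class, its fields, and the `now` argument are hypothetical, and capping the churn by the number of pending jobs is an assumption. It follows the described order (`stop_excess_jobs`, `start_pending_jobs`, `rotate_jobs`), stops continuous jobs before normal ones, and never rotates out normal jobs.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    continuous: bool
    running: bool = False
    last_started: float = 0.0  # time of the most recent `started` event


def _running(jobs):
    return [j for j in jobs if j.running]


def _pending(jobs):
    # Pending jobs, with the one started longest ago (or never) first.
    return sorted((j for j in jobs if not j.running), key=lambda j: j.last_started)


def stop_excess_jobs(jobs, max_jobs):
    excess = len(_running(jobs)) - max_jobs
    if excess <= 0:
        return
    # Continuous jobs sort before normal ones; within each group the
    # longest-running job (earliest start) comes first.
    victims = sorted(_running(jobs), key=lambda j: (not j.continuous, j.last_started))
    for job in victims[:excess]:
        job.running = False


def start_pending_jobs(jobs, max_jobs, now):
    room = max_jobs - len(_running(jobs))
    for job in _pending(jobs)[:max(room, 0)]:
        job.running, job.last_started = True, now


def rotate_jobs(jobs, max_churn, now):
    # Only continuous jobs are candidates for rotation.
    candidates = sorted(
        (j for j in jobs if j.running and j.continuous),
        key=lambda j: j.last_started,
    )
    pending = _pending(jobs)  # computed before stopping, so stopped jobs are not restarted
    churn = min(max_churn, len(candidates), len(pending))  # assumption: capped by both lists
    for job in candidates[:churn]:
        job.running = False
    for job in pending[:churn]:
        job.running, job.last_started = True, now


def reschedule(jobs, now, max_jobs=500, max_churn=20):
    # Defaults mirror the README: max_jobs = 500, max_churn = 20.
    stop_excess_jobs(jobs, max_jobs)
    start_pending_jobs(jobs, max_jobs, now)
    rotate_jobs(jobs, max_churn, now)
```

Calling `reschedule(jobs, now)` once per `replicator.interval` would then approximate the behavior described above: with the defaults, up to 20 long-running continuous jobs give way to 20 pending ones on each cycle.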

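The job history, the health threshold and the doubling penalty described in the README can also be modelled briefly. This is a hedged Python sketch rather than the scheduler's actual code: the function names, the tuple layout, the `now` argument and the `base_penalty` value are all hypothetical. Only the 20-entry history limit, the 2-minute `replicator.health_threshold` and the doubling rule come from the text. Events other than `started` and `crashed` are simply skipped, which is an assumption.

```python
def add_event(history, event, ts, max_history=20):
    """Return a new history with the newest event at the head, trimmed to max_history."""
    return ([(event, ts)] + history)[:max_history]


def consecutive_crashes(history, now, health_threshold=120):
    """Count crashes since the job last ran for longer than health_threshold seconds.

    history is a list of (event, timestamp) pairs, most recent first.
    """
    crashes = 0
    later_ts = now  # timestamp of whatever happened after the current event
    for event, ts in history:
        if event == "crashed":
            crashes += 1
        elif event == "started" and later_ts - ts > health_threshold:
            break  # the job ran long enough to count as healthy again
        later_ts = ts
    return crashes


def backoff_penalty(crashes, base_penalty=30):
    """Double the penalty for each consecutive crash; base_penalty is a made-up value."""
    if crashes == 0:
        return 0
    return base_penalty * 2 ** (crashes - 1)
```

For the README's example, a job that crashed 5 times in a row and then ran for more than 2 minutes on its 6th attempt would get a count of 1 from `consecutive_crashes` on its next crash, so its penalty would drop back to the base value.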
