couchdb-user mailing list archives

From Timothy McKernan <timbitsandby...@gmail.com>
Subject active tasks, docs read, docs written
Date Fri, 13 Oct 2017 16:49:37 GMT
I'm using 1.6.1 and am looking for a way to gather replication-related
statuses.

docs_read and docs_written seemed to be what I want because, according to
the documentation, these vars tell us how many docs were read from the
source db and how many have already been written to the target's db.
http://docs.couchdb.org/en/1.6.1/api/server/common.html#replicate

However:
 - I'm doing continuous replication,
 - I've been inspecting the docs_read, docs_written and progress from the
_active_tasks request,
 - I've also inspected the history returned from
<source_db>/_local/<rep_id>,
 - What I find is the docs_read and docs_written are always equal to one
another, even if progress says "75" or "37".
 - Whenever progress isn't 100% I expected to find docs_read to be greater
than docs_written.
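
The _active_tasks check described above can be sketched roughly as follows. This is only a minimal sketch: the address http://localhost:5984, the helper names and the chosen field list are my own assumptions, and _active_tasks normally needs admin credentials, which are left out here.

```python
import json
from urllib.request import urlopen

# Assumption: a local CouchDB node at this address; admin credentials,
# which _active_tasks normally requires, are omitted in this sketch.
COUCH_URL = "http://localhost:5984"

# Counters of interest, all taken from the _active_tasks replication output.
REPLICATION_FIELDS = (
    "docs_read",
    "docs_written",
    "doc_write_failures",
    "progress",
    "source_seq",
    "checkpointed_source_seq",
)


def fetch_active_tasks(base_url=COUCH_URL):
    """GET /_active_tasks and return the decoded JSON list."""
    with urlopen(base_url + "/_active_tasks") as resp:
        return json.loads(resp.read().decode("utf-8"))


def replication_summaries(tasks):
    """Keep only replication tasks and pick out the counters of interest."""
    return [
        {name: task.get(name) for name in REPLICATION_FIELDS}
        for task in tasks
        if task.get("type") == "replication"
    ]


# Usage (needs a reachable server, so it is left commented out):
# for summary in replication_summaries(fetch_active_tasks()):
#     print(summary)
```

Polling this every few seconds is how I compared docs_read against docs_written while the replication was running.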

Am I missing some documentation or discussion on these?

Below is example output from a replication log where 2000 docs are in the
process of being replicated to a remote server that has a slow network
connection.

{
  "_id": "_local\/e32a163a5904548e3d8abcf2919da029",
  "_rev": "0-133",
  "session_id": "6b0ee086205e33aaa693d53b7f0a7469",
  "source_last_seq": 4172,
  "replication_id_version": 3,
  "history": [
    {
      "session_id": "6b0ee086205e33aaa693d53b7f0a7469",
      "start_time": "Fri, 13 Oct 2017 13:46:59 GMT",
      "end_time": "Fri, 13 Oct 2017 13:50:01 GMT",
      "start_last_seq": 3354,
      "end_last_seq": 4172,
      "recorded_seq": 4172,
      "missing_checked": 818,
      "missing_found": 818,
      "docs_read": 818,
      "docs_written": 818,
      "doc_write_failures": 0
    },
    ...
  ]
}

Once the 2000 docs are fully transferred, the history looks like this:

    {
      "session_id": "6b0ee086205e33aaa693d53b7f0a7469",
      "start_time": "Fri, 13 Oct 2017 13:46:59 GMT",
      "end_time": "Fri, 13 Oct 2017 13:52:55 GMT",
      "start_last_seq": 3354,
      "end_last_seq": 5354,
      "recorded_seq": 5354,
      "missing_checked": 2000,
      "missing_found": 2000,
      "docs_read": 2000,
      "docs_written": 2000,
      "doc_write_failures": 0
    }

So missing_checked, missing_found, docs_read and docs_written are always
equal to each other, yet it's clear that 2000 docs were written to the
source fairly quickly and it took several minutes to get them all
transferred to the target.
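
Even with the counters equal, the two history snapshots above still show the transfer advancing over time. A small sketch that works this out from start_time, end_time and docs_written (the function name is made up for illustration, and only those three fields from each entry are copied in):

```python
from email.utils import parsedate_to_datetime


def session_throughput(entry):
    """Return (elapsed seconds, docs written per second) for a history entry."""
    start = parsedate_to_datetime(entry["start_time"])
    end = parsedate_to_datetime(entry["end_time"])
    elapsed = (end - start).total_seconds()
    rate = entry["docs_written"] / elapsed if elapsed > 0 else 0.0
    return elapsed, rate


# The two history entries quoted above, reduced to the fields used here.
in_progress = {
    "start_time": "Fri, 13 Oct 2017 13:46:59 GMT",
    "end_time": "Fri, 13 Oct 2017 13:50:01 GMT",
    "docs_written": 818,
}
finished = {
    "start_time": "Fri, 13 Oct 2017 13:46:59 GMT",
    "end_time": "Fri, 13 Oct 2017 13:52:55 GMT",
    "docs_written": 2000,
}

for label, entry in (("in progress", in_progress), ("finished", finished)):
    elapsed, rate = session_throughput(entry)
    print(f"{label}: {elapsed:.0f} s elapsed, {rate:.2f} docs/s")
```

That gives 182 seconds for the first snapshot and 356 seconds for the second, so the checkpoints do reflect the slow link even though docs_read never runs ahead of docs_written.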

Meanwhile, one of the active tasks from that time looks like this:
    {
        "pid": "<0.437.55>",
        "checkpoint_interval": 5000,
        "checkpointed_source_seq": 2358,
        "continuous": true,
        "doc_id": "9f1ef162bc57792cb9d895c95a279452",
        "doc_write_failures": 0,
        "docs_read": 1656,
        "docs_written": 1656,
        "missing_revisions_found": 1656,
        "progress": 70,
        "replication_id": "df844cbf4fc4de4ee58546fcbce12b7a+continuous+create_target",
        "revisions_checked": 2152,
        "source": "highrate1",
        "source_seq": 3354,
        "started_on": 1507833076,
        "target": "https://[user]:*****@[hostname]/highrate1/",
        "type": "replication",
        "updated_on": 1507900317
    }
How should I be reading this? Is this saying there are 1656 docs on the
source and only 70% have been transferred?
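
One reading that fits this single sample is that progress tracks update sequence numbers rather than document counts, since checkpointed_source_seq divided by source_seq comes out at 70 percent. This is an assumption drawn from the numbers above, not something I found in the documentation, and the helper name is made up for illustration:

```python
def seq_progress(through_seq, source_seq):
    """Integer percentage of source update sequences processed so far.

    Assumed formula, inferred from one sample; not taken from the docs.
    """
    if source_seq == 0:
        return 0
    return (through_seq * 100) // source_seq


# Values copied from the active task shown above.
task = {"checkpointed_source_seq": 2358, "source_seq": 3354, "progress": 70}

guess = seq_progress(task["checkpointed_source_seq"], task["source_seq"])
print(guess, guess == task["progress"])  # prints: 70 True
```

If that reading is right, docs_read and docs_written would be counting something different from what progress measures, which might explain why they never diverge.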

Also, I'll be moving to 2.x in the future, so if this behavior is
different there, or if there are different APIs available for getting
replication status, that would be just as good an answer for me.

Thanks for your help,
Tim
