couchdb-dev mailing list archives

From "Dave Cottlehuber (JIRA)" <>
Subject [jira] [Commented] (COUCHDB-1946) Trying to replicate NPM grinds to a halt after 40GB
Date Mon, 09 Dec 2013 09:55:07 GMT


Dave Cottlehuber commented on COUCHDB-1946:

[~stelcheck] agreed

There's something about replicating this specific doc that seems to trigger issues. Here's
what I used to identify it (call the source db and use since=<checkpoint - 1>): \?limit\=2\&since\=701251
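The URL in the command above was lost in the archive, so here is a minimal sketch of how such a query could be built. It assumes the request went to the source database's _changes feed, which is where CouchDB accepts the limit and since parameters. The function name changes_url and its arguments are hypothetical, not from the original message.

```shell
#!/bin/sh
# Sketch only: build a _changes URL that starts one sequence before the
# checkpoint where replication got stuck.
# Assumption: the stripped URL pointed at the source database's _changes feed.
changes_url() {
    db_url=$1          # source database URL (placeholder, supply your own)
    checkpoint=$2      # last checkpointed sequence reported by the replicator
    limit=${3:-2}      # number of changes to fetch, 2 as in the message
    since=$((checkpoint - 1))
    printf '%s/_changes?limit=%s&since=%s\n' "$db_url" "$limit" "$since"
}

# Usage (needs network access, so it is left commented out):
# curl -s "$(changes_url "$SOURCE_DB" "$CHECKPOINT")"
```

With a checkpoint of 701252 this yields since=701251, matching the query string quoted above. The rows returned should name the document the replicator was working on when it stalled.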

Here are some things you can try:

# option 1

- delete all existing replications
- compact your DB if there's a big difference between data size and on-disk size. jq is awesome
for this.

curl -s http://localhost:5984/registry | jq ' (.disk_size| tonumber) - (.data_size |tonumber)'

This is a good spot to copy the registry.couch file if you have space, in case you need to
revert back to it.

-  replicate the single failing document by POSTing this to _replicator. This could take a

   "source": "",
   "target": "registry",
   "doc_ids": [
   "owner": "admin",

- this is simply replicating the single stuck document. If you do this, I would love an ngrep
or tcpdump of the traffic to see what happens on the wire during these stuck transfers

- once this is completed, you can then run the normal replication again.
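The steps in option 1 can be sketched roughly as below. This is not the exact request from the message: the replication document there was truncated in the archive, so the source URL and document ID are left as placeholders you must supply. The helper names reclaimable_bytes and replication_doc are hypothetical. The original document also carried an "owner" field, which is omitted here.

```shell
#!/bin/sh
# Sketch only: helpers for the compaction check and the single-document
# replication described in option 1.

# Bytes that compaction could reclaim: disk_size minus data_size.
reclaimable_bytes() {
    echo $(( $1 - $2 ))
}

# Build a _replicator document that copies one document by ID.
# Assumption: none of the arguments contain quotes or backslashes,
# since no JSON escaping is done here.
replication_doc() {
    printf '{"source": "%s", "target": "%s", "doc_ids": ["%s"]}\n' "$1" "$2" "$3"
}

# Usage (needs a running CouchDB and probably admin credentials,
# so it is left commented out):
# curl -s -X POST -H 'Content-Type: application/json' \
#      http://localhost:5984/registry/_compact
# replication_doc "$SOURCE_DB" registry "$STUCK_DOC_ID" |
#     curl -s -X POST -H 'Content-Type: application/json' -d @- \
#          http://localhost:5984/_replicator
```

Building the JSON with printf keeps the sketch dependency-free. For document IDs containing special characters, a JSON-aware tool such as jq would be the safer choice.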

# option 2

Install an older release of CouchDB and see if it doesn't get stuck here:

If you *can* please try the R15B03-1 release first, report back, and then the R14B04 one.
It's not yet clear to me if the issue we are seeing is also related to garbage collection
differences in Erlang/OTP between releases, or solely within CouchDB.

# option 3

Sometime later (hopefully today), I should have a BitTorrent-accessible version of npm. I
need to update & compact first, this is pretty much IO limited :-).

> Trying to replicate NPM grinds to a halt after 40GB
> ---------------------------------------------------
>                 Key: COUCHDB-1946
>                 URL:
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>            Reporter: Marc Trudel
>         Attachments: couch.log
> I have been able to replicate the Node.js NPM database until 40G or so, then I get this:
> In one case I have gotten a flat-out OOM error, but I didn't take a dump of the log output
> at the time.
> CentOS6.4 with CouchDB 1.5 (also tried 1.3.1, but to no avail). Also tried to restart
> replication from scratch - twice - both cases stalling at 40GB.

This message was sent by Atlassian JIRA
