couchdb-user mailing list archives

From Nathan Vander Wilt <nate-li...@calftrail.com>
Subject Re: couchdb replication memory consumption
Date Mon, 18 Nov 2013 23:06:22 GMT
Sorry to hear you are having trouble.

It's odd that this only started in 1.5; otherwise I'd wonder whether it was related to
https://issues.apache.org/jira/browse/COUCHDB-1874, which is an older bug. I've also had
trouble with large replications hanging lately, but haven't been able to track it down.

hth,
-nvw



On Nov 17, 2013, at 2:13 PM, Edward Levin <edward.levin@makerstudios.com> wrote:

> Hi,
> 
> I recently compiled and installed CouchDB 1.5.0 on Ubuntu 12.04 with the
> intention of creating my own replica of the NPM registry. I followed the
> official NPM instructions (https://github.com/isaacs/npmjs.org) and kicked
> off replication of the NPM database (50 GB+):
> 
> curl -X POST http://@10.0.0.34:5984/_replicate \
>   -H "Content-Type: application/json" \
>   -d '{"source":"http://isaacs.iriscouch.com/registry/", "target":"registry", "continuous":true, "create_target":true}'
> 
> Replication was proceeding smoothly for a few days until the database
> reached 48gb. From this point forward replication crashes every time after
> a few minutes with the following error:
> 
> 
> [Sun, 17 Nov 2013 22:10:25 GMT] [error] [<0.609.0>] ** Generic server
> <0.609.0> terminating
> ** Last message in was {#Port<0.3446>,{exit_status,137}}
> ** When Server state == {os_proc,"/usr/local/bin/couchjs
> /usr/local/share/couchdb/server/main.js",
>                                 #Port<0.3446>,
>                                 #Fun<couch_os_process.2.132569728>,
>                                 #Fun<couch_os_process.3.35601548>,5000}
> ** Reason for termination ==
> ** {exit_status,137}
> 
> [Sun, 17 Nov 2013 22:10:29 GMT] [error] [<0.609.0>] {error_report,<0.31.0>,
>                        {<0.609.0>,crash_report,
>                         [[{initial_call,
>                               {couch_os_process,init,['Argument__1']}},
>                           {pid,<0.609.0>},
>                           {registered_name,[]},
>                           {error_info,
>                               {exit,
>                                   {exit_status,137},
>                                   [{gen_server,terminate,6},
>                                    {proc_lib,init_p_do_apply,3}]}},
>                           {ancestors,
>                               [couch_query_servers,couch_secondary_services,
>                                couch_server_sup,<0.32.0>]},
>                           {messages,[]},
>                           {links,[<0.104.0>]},
>                           {dictionary,[]},
>                           {trap_exit,false},
>                           {status,running},
>                           {heap_size,6765},
>                           {stack_size,24},
>                           {reductions,12152}],
>                          []]}}
> 
> 
> 
> Further investigation revealed that once replication is started, the process
> 
> /usr/lib/erlang/erts-5.8.5/bin/beam
> 
> consumes all available system memory (1.5 GB RAM + 1 GB swap) within a few
> minutes, until it crashes with the OOM error above.
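
The `{exit_status,137}` in the log is consistent with an out-of-memory kill. Erlang ports on Unix report a signal death using the shell convention of 128 + signal number, so 137 would be signal 9 (SIGKILL), which is what the Linux OOM killer sends. A minimal Python sketch of that arithmetic follows; the helper name `decode_exit_status` is made up for illustration, and the signal name lookup assumes a Unix-like system.

```python
import signal


def decode_exit_status(status: int) -> str:
    """Interpret an exit status using the shell's 128 + signal convention."""
    if status > 128:
        signum = status - 128
        try:
            # Map the signal number to its symbolic name (platform dependent).
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"


# Status taken from the crash report quoted above.
print(decode_exit_status(137))
```

Note that the killed process in the log is couchjs, not beam itself; the kernel log (for example `dmesg`) would show which process the OOM killer actually picked.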
> 
> Result from _active_tasks prior to crash:
> 
> [{"pid":"<0.397.0>","checkpointed_source_seq":729415,"continuous":true,"doc_id":null,"doc_write_failures":0,"docs_read":0,"docs_written":0,"missing_revisions_found":0,"progress":92,"replication_id":"42d81068841a085e7120226f0a010519+continuous+create_target","revisions_checked":982,"source":"http://isaacs.iriscouch.com/registry/","source_seq":787280,"started_on":1384725566,"target":"registry","type":"replication","updated_on":1384725571}]
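
To read that status more easily, here is a small Python sketch that pulls out the numbers that matter. It uses only a subset of the fields from the output above; the `summarize` helper is hypothetical, and the assumption that `progress` is the checkpointed sequence as a whole-number percentage of the source sequence is inferred from these values rather than taken from the CouchDB source.

```python
import json

# Subset of the _active_tasks entry quoted above.
ACTIVE_TASK = (
    '{"checkpointed_source_seq":729415,"source_seq":787280,"progress":92,'
    '"docs_read":0,"docs_written":0,"revisions_checked":982,'
    '"started_on":1384725566,"updated_on":1384725571}'
)


def summarize(task: dict) -> dict:
    """Derive a few headline numbers from one replication task entry."""
    done = task["checkpointed_source_seq"]
    total = task["source_seq"]
    return {
        "remaining_seqs": total - done,
        # Integer percentage, assumed to mirror the reported "progress".
        "percent_checkpointed": done * 100 // total,
        "seconds_running": task["updated_on"] - task["started_on"],
        "docs_written": task["docs_written"],
    }


summary = summarize(json.loads(ACTIVE_TASK))
print(summary)
```

Read this way, the snapshot was taken about five seconds after the replication restarted, with roughly 58,000 source sequences still to go and no documents read or written yet.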
> 
> I tried reducing worker_processes to 1 and worker_batch_size to 100, with
> no effect.
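
The message doesn't say whether those settings were changed in the server config or per replication. As a sketch of the per-replication route, the Python below builds (but does not send) the same `_replicate` request with the two worker settings added to the JSON body. That the `_replicate` body accepts `worker_processes` and `worker_batch_size` is my assumption here and should be checked against the replicator documentation for your version; the function name is made up, and the stray `@` from the original URL is left out.

```python
import json
import urllib.request


def build_replication_request(base_url, source, target,
                              worker_processes=None, worker_batch_size=None):
    """Build a POST /_replicate request without sending it."""
    body = {
        "source": source,
        "target": target,
        "continuous": True,
        "create_target": True,
    }
    # Assumed per-replication overrides; omitted when not given.
    if worker_processes is not None:
        body["worker_processes"] = worker_processes
    if worker_batch_size is not None:
        body["worker_batch_size"] = worker_batch_size
    return urllib.request.Request(
        base_url.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_replication_request(
    "http://10.0.0.34:5984",
    "http://isaacs.iriscouch.com/registry/",
    "registry",
    worker_processes=1,
    worker_batch_size=100,
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually start the replication, so the sketch stops short of that.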
> 
> At this point I'm not sure whether this behavior is due to a memory leak,
> insufficient resources, or a misconfiguration.
> 
> Any help would be appreciated,
> 
> Ed

