incubator-couchdb-user mailing list archives

From Edward Levin <edward.le...@makerstudios.com>
Subject Re: couchdb replication memory consumption
Date Wed, 20 Nov 2013 18:01:46 GMT
So far I have only tried 1.5, and I was somewhat reluctant to start from
scratch with an earlier version since it already took several days to sync
50 GB, but at this point I am out of options, so I will give 1.2 a shot.

Thanks,

Ed


On Mon, Nov 18, 2013 at 3:06 PM, Nathan Vander Wilt <nate-lists@calftrail.com> wrote:

> Sorry to hear you are having trouble.
>
> It's odd this started only in 1.5, otherwise I'd wonder if it were related
> to https://issues.apache.org/jira/browse/COUCHDB-1874 — that's an older
> bug. I know I've had trouble with large replications hanging lately too,
> but haven't been able to track it down.
>
> hth,
> -nvw
>
>
>
> On Nov 17, 2013, at 2:13 PM, Edward Levin <edward.levin@makerstudios.com>
> wrote:
>
> > Hi,
> >
> > I recently compiled and installed CouchDB 1.5.0 on Ubuntu 12.04 with the
> > intention of creating my own replica of the NPM registry. I followed the
> > official NPM instructions (https://github.com/isaacs/npmjs.org) and kicked
> > off replication of the NPM database (50 GB+):
> >
> > curl -X POST http://@10.0.0.34:5984/_replicate \
> >   -H "Content-Type: application/json" \
> >   -d '{"source":"http://isaacs.iriscouch.com/registry/",
> >        "target":"registry", "continuous":true, "create_target":true}'
> >
> > Replication proceeded smoothly for a few days until the database
> > reached 48 GB. From that point on, replication crashes every time after
> > a few minutes with the following error:
> >
> >
> > [Sun, 17 Nov 2013 22:10:25 GMT] [error] [<0.609.0>] ** Generic server
> > <0.609.0> terminating
> > ** Last message in was {#Port<0.3446>,{exit_status,137}}
> > ** When Server state == {os_proc,"/usr/local/bin/couchjs
> > /usr/local/share/couchdb/server/main.js",
> >                                 #Port<0.3446>,
> >                                 #Fun<couch_os_process.2.132569728>,
> >                                 #Fun<couch_os_process.3.35601548>,5000}
> > ** Reason for termination ==
> > ** {exit_status,137}
> >
> > [Sun, 17 Nov 2013 22:10:29 GMT] [error] [<0.609.0>]
> {error_report,<0.31.0>,
> >                        {<0.609.0>,crash_report,
> >                         [[{initial_call,
> >                               {couch_os_process,init,['Argument__1']}},
> >                           {pid,<0.609.0>},
> >                           {registered_name,[]},
> >                           {error_info,
> >                               {exit,
> >                                   {exit_status,137},
> >                                   [{gen_server,terminate,6},
> >                                    {proc_lib,init_p_do_apply,3}]}},
> >                           {ancestors,
> >
> > [couch_query_servers,couch_secondary_services,
> >                                couch_server_sup,<0.32.0>]},
> >                           {messages,[]},
> >                           {links,[<0.104.0>]},
> >                           {dictionary,[]},
> >                           {trap_exit,false},
> >                           {status,running},
> >                           {heap_size,6765},
> >                           {stack_size,24},
> >                           {reductions,12152}],
> >                          []]}}
> >
> >
> >
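[Editor's note: the exit status in the log above points at the likely cause. An Erlang port reports 128 + signal number, so `{exit_status,137}` means the couchjs process was killed by signal 9 (SIGKILL), which is what the Linux OOM killer sends. A quick way to check (a sketch; the grep patterns are assumptions, since exact kernel-log wording varies by distro and kernel version):]

```shell
# 137 = 128 + 9, i.e. the process was killed by SIGKILL.
echo $((137 - 128))

# Look for OOM-killer entries in the kernel log (may need root;
# guarded with || true so it is safe to run anywhere).
dmesg 2>/dev/null | grep -iE 'out of memory|killed process' | tail || true
```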
> > Further investigation revealed that after replication is invoked, the
> > process
> >
> > /usr/lib/erlang/erts-5.8.5/bin/beam
> >
> > starts consuming all available system memory (1.5 GB RAM + 1 GB swap)
> > within a few minutes, until it crashes with the OOM error shown above.
> >
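[Editor's note: to watch the Erlang VM's memory growth while replication runs, its resident set can be sampled with procps; a generic sketch, not CouchDB-specific (the process name is "beam" or "beam.smp" depending on the build):]

```shell
# Show PID, resident memory (KB), virtual size, and command name for
# the Erlang VM; -C matches by command name (procps). Guarded so the
# command succeeds even when no such process is running.
ps -o pid,rss,vsz,comm -C beam,beam.smp 2>/dev/null || true

# Or sample every 5 seconds:
# watch -n 5 'ps -o pid,rss,comm -C beam,beam.smp'
```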
> > Result from _active_tasks prior to crash:
> >
> > [{"pid": "<0.397.0>",
> >   "checkpointed_source_seq": 729415,
> >   "continuous": true,
> >   "doc_id": null,
> >   "doc_write_failures": 0,
> >   "docs_read": 0,
> >   "docs_written": 0,
> >   "missing_revisions_found": 0,
> >   "progress": 92,
> >   "replication_id": "42d81068841a085e7120226f0a010519+continuous+create_target",
> >   "revisions_checked": 982,
> >   "source": "http://isaacs.iriscouch.com/registry/",
> >   "source_seq": 787280,
> >   "started_on": 1384725566,
> >   "target": "registry",
> >   "type": "replication",
> >   "updated_on": 1384725571}]
> >
> > I tried reducing worker_processes to 1 and worker_batch_size to 100,
> > with no effect.
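[Editor's note: for reference, those two knobs live in the [replicator] section of local.ini. A sketch with the values tried above, plus two related connection-limit options from the 1.x replicator config; the values shown are illustrative:]

```ini
; local.ini -- replicator tuning
[replicator]
worker_processes = 1
worker_batch_size = 100
; cap concurrent HTTP connections used per replication
http_connections = 5
; per-connection timeout in milliseconds
connection_timeout = 30000
```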
> >
> > At this point I am not sure whether this behavior is due to a memory
> > leak, insufficient resources, or a misconfiguration.
> >
> > Any help would be appreciated,
> >
> > Ed
>
>
