incubator-couchdb-user mailing list archives

From Randall Leeds <randall.le...@gmail.com>
Subject Re: Need help diagnosing an error setting up continuous replication
Date Wed, 02 Jun 2010 18:13:22 GMT
MochiWeb has a hard-coded limit of 2048 concurrent connections.
I reported this issue with a fix that adds a configuration option:
https://issues.apache.org/jira/browse/COUCHDB-705

Also be sure to check the operating system limits (ulimit command) and
make sure that the Erlang VM has enough file descriptors to handle the
connections.
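
As a rough illustration of the file descriptor check, here is a minimal Python sketch, not something from CouchDB or MochiWeb. It reads the soft "open files" limit (the value `ulimit -n` reports) and estimates the headroom left after a given number of connections. The helper names and the `reserve` value of 256 are assumptions made for the example. The script reports the limit of its own process, so it only reflects the Erlang VM's limit when run as the same user from the same shell that starts CouchDB. It does not address the MochiWeb connection limit.

```python
import resource


def soft_fd_limit():
    """Return the soft open-files limit of this process, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return None if soft == resource.RLIM_INFINITY else soft


def fd_headroom(soft_limit, connections, reserve=256):
    """Descriptors left after `connections` sockets plus `reserve` other files.

    `reserve` is an illustrative guess for database files and logs,
    not a figure taken from CouchDB.
    """
    return soft_limit - (connections + reserve)


if __name__ == "__main__":
    limit = soft_fd_limit()
    if limit is None:
        print("open files: unlimited")
    else:
        print("open files soft limit:", limit)
        print("headroom for 2048 connections:", fd_headroom(limit, 2048))
```

A negative headroom would suggest the descriptor limit, rather than the web server, is the first thing to raise.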

On Wed, Jun 2, 2010 at 11:06, Mark Anderson <mark@opscode.com> wrote:
> I've been testing how well couchdb scales with lots of continuous
> replications, and I'm hitting some problems around 2045 simultaneous
> replications.
>
> Specifically, I'm getting a timeout on the destination server. Posting:
> {"continuous":true,"target":"http://127.0.0.1:5984/testdb_2631","source":"http://10.194.143.191:5984/testdb_2631"}
>
> appears to time out in the couch_rep_changes_feed.erl init routine.
>
> The source server doesn't seem to log anything more than a changes feed request.
>
> Any suggestions on how to debug this? I'm an erlang newbie, so I'm
> still climbing the learning curve on figuring stuff out. It seems to
> happen more or less at the same point in the sequence of setting up
> the replications.
>
> Environment:
> Couch 0.11.0
>
> local.ini changes:
> Added
> max_dbs_open = 10000
>
> The erlang process is running with
>
> % ulimit -a
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 20
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 16382
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1000000
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) unlimited
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> % export ERL_MAX_PORTS=100000
> % export ERL_MAX_ETS_TABLES=10000
>
> Two ec2 'large' instances running ubuntu 10.04 are being used, each
> with a copy of couchdb.
>
> Log messages from the source couch logfile:
>
> [Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.25572.0>] 10.194.186.95 - - 'GET' /testdb_2631/ 200
> [Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.25574.0>] 10.194.186.95 - - 'GET' /testdb_2631/ 200
>
> [Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.25575.0>] 10.194.186.95 - - 'GET' /testdb_2631/_local%2Fe8eee148acfbe51c4ada82f7b4cc473a 404
>
> [Wed, 02 Jun 2010 00:57:40 GMT] [info] [<0.25606.0>] 10.194.186.95 - - 'GET' /testdb_2631/_changes?style=all_docs&heartbeat=10000&since=0&feed=continuous 200
>
> Log messages from the destination couch logfile:
>
> [Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.31607.2>] 10.194.143.191 - - 'PUT' /testdb_2631 412
>
> [Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.31609.2>] 127.0.0.1 - - 'GET' /testdb_2631/ 200
>
> [Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.31641.2>] 127.0.0.1 - - 'GET' /testdb_2631/ 200
>
> [Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.31658.2>] 127.0.0.1 - - 'GET' /testdb_2631/_local%2Fe8eee148acfbe51c4ada82f7b4cc473a 404
>
> [Wed, 02 Jun 2010 00:57:20 GMT] [error] [<0.31676.2>] {error_report,<0.31.0>,
>     {<0.31676.2>,crash_report,
>      [[{initial_call,{couch_rep_changes_feed,init,['Argument__1']}},
>        {pid,<0.31676.2>},
>        {registered_name,[]},
>        {error_info,
>            {exit,changes_timeout,
>                [{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},
>        {ancestors,
>            [<0.31624.2>,couch_rep_sup,couch_primary_services,couch_server_sup,
>             <0.32.0>]},
>        {messages,[]},
>        {links,[<0.31624.2>,<0.31677.2>]},
>        {dictionary,[]},
>        {trap_exit,true},
>        {status,running},
>        {heap_size,610},
>        {stack_size,24},
>        {reductions,711}],
>       [{neighbour,
>            [{pid,<0.31677.2>},
>             {registered_name,[]},
>             {initial_call,{ibrowse_http_client,init,['Argument__1']}},
>             {current_function,{gen_server,loop,6}},
>             {ancestors,
>                 [<0.31676.2>,<0.31624.2>,couch_rep_sup,couch_primary_services,
>                  couch_server_sup,<0.32.0>]},
>             {messages,[]},
>             {links,[<0.31676.2>,#Port<0.17054>]},
>             {dictionary,
>                 [{my_trace_flag,false},
>                  {ibrowse_trace_token,["10.194.143.191",58,"5984"]}]},
>             {trap_exit,false},
>             {status,waiting},
>             {heap_size,987},
>             {stack_size,9},
>             {reductions,461}]}]]}}
>
> [Wed, 02 Jun 2010 00:57:20 GMT] [error] [<0.31624.2>] {error_report,<0.31.0>,
>               {<0.31624.2>,crash_report,
>                [[{initial_call,{couch_rep,init,['Argument__1']}},
>                  {pid,<0.31624.2>},
>                  {registered_name,[]},
>                  {error_info,{exit,{{badmatch,{error,changes_timeout}},
>                                     [{couch_rep,do_init,1},
>                                      {couch_rep,init,1},
>                                      {gen_server,init_it,6},
>                                      {proc_lib,init_p_do_apply,3}]},
>                                    [{gen_server,init_it,6},
>                                     {proc_lib,init_p_do_apply,3}]}},
>                  {ancestors,[couch_rep_sup,couch_primary_services,
>                              couch_server_sup,<0.32.0>]},
>                  {messages,[{'EXIT',<0.31676.2>,changes_timeout}]},
>                  {links,[<0.81.0>]},
>                  {dictionary,[]},
>                  {trap_exit,true},
>                  {status,running},
>                  {heap_size,1597},
>                  {stack_size,24},
>                  {reductions,3240}],
>                 []]}}
>
> [Wed, 02 Jun 2010 00:57:20 GMT] [error] [<0.31608.2>] Uncaught error in HTTP request: {error,
>                                  {case_clause,
>                                   {error,
>                                    {{{badmatch,{error,changes_timeout}},
>                                      [{couch_rep,do_init,1},
>                                       {couch_rep,init,1},
>                                       {gen_server,init_it,6},
>                                       {proc_lib,init_p_do_apply,3}]},
>                                     {child,undefined,
>                                      "e8eee148acfbe51c4ada82f7b4cc473a+continuous",
>                                      {gen_server,start_link,
>                                       [couch_rep,
>                                        ["e8eee148acfbe51c4ada82f7b4cc473a",
>                                         {[{<<"continuous">>,true},
>                                           {<<"target">>,
>                                            <<"http://127.0.0.1:5984/testdb_2631">>},
>                                           {<<"source">>,
>                                            <<"http://10.194.143.191:5984/testdb_2631">>}]},
>                                         {user_ctx,null,
>                                          [<<"_admin">>],
>                                          <<"{couch_httpd_auth, default_authentication_handler}">>}],
>                                        []]},
>                                      temporary,1,worker,
>                                      [couch_rep]}}}}}
>
> [Wed, 02 Jun 2010 00:57:20 GMT] [info] [<0.31608.2>] Stacktrace:
> [{couch_rep,start_replication_server,1},
>              {couch_rep,replicate,2},
>              {couch_httpd_misc_handlers,handle_replicate_req,1},
>              {couch_httpd,handle_request_int,5},
>              {mochiweb_http,headers,5},
>              {proc_lib,init_p_do_apply,3}]
>