incubator-couchdb-user mailing list archives

From Mark Anderson <m...@opscode.com>
Subject Need help diagnosing an error setting up continuous replication
Date Wed, 02 Jun 2010 18:06:52 GMT
I've been testing how well CouchDB scales with lots of continuous
replications, and I start hitting problems at around 2045 simultaneous
replications.

Specifically, I'm getting a timeout on the destination server. Posting:
{"continuous":true,"target":"http://127.0.0.1:5984/testdb_2631","source":"http://10.194.143.191:5984/testdb_2631"}

appears to time out in the couch_rep_changes_feed.erl init routine.

The source server doesn't seem to log anything more than a changes feed request.
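
For readers who want to reproduce the setup, the request above can be generated in a loop roughly as follows. This is a minimal sketch, not the script used for the test described here: the function names, the `testdb_` name prefix as a loop pattern, and the 60-second client timeout are assumptions. The body is assumed to be POSTed to the `_replicate` endpoint of the destination server, which is consistent with the handle_replicate_req frame in the destination log.

```python
import json
import urllib.request


def replication_body(db_name, source_base, target_base, continuous=True):
    """Build the JSON body for one replication request (hypothetical helper)."""
    return {
        "source": f"{source_base}/{db_name}",
        "target": f"{target_base}/{db_name}",
        "continuous": continuous,
    }


def start_replication(server_base, body, timeout=60):
    """POST one replication request to server_base/_replicate."""
    req = urllib.request.Request(
        f"{server_base}/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, json.loads(resp.read().decode("utf-8"))


def start_many(server_base, source_base, target_base, count, prefix="testdb_"):
    """Start `count` replications in order.

    Returns None if all requests succeed, otherwise a tuple of
    (index of the first failing database, the exception raised).
    """
    for i in range(count):
        body = replication_body(f"{prefix}{i}", source_base, target_base)
        try:
            start_replication(server_base, body)
        except OSError as exc:  # URLError, HTTPError and socket timeouts
            return i, exc
    return None
```

Calling `start_many("http://127.0.0.1:5984", "http://10.194.143.191:5984", "http://127.0.0.1:5984", 3000)` on the destination host would report the index at which the first request fails, which makes it easy to check whether the failure point is stable across runs.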

Any suggestions on how to debug this? I'm an Erlang newbie, so I'm
still climbing the learning curve on figuring stuff out. It seems to
happen more or less at the same point in the sequence of setting up
the replications.

Environment:
Couch 0.11.0

local.ini changes:
Added
max_dbs_open = 10000

The Erlang process is running with the following limits and environment:

% ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1000000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

% export ERL_MAX_PORTS=100000
% export ERL_MAX_ETS_TABLES=10000

Two EC2 'large' instances running Ubuntu 10.04 are being used, each
with a copy of CouchDB.

Log messages from the source couch logfile:

[Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.25572.0>] 10.194.186.95 - - 'GET' /testdb_2631/ 200
[Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.25574.0>] 10.194.186.95 - - 'GET' /testdb_2631/ 200

[Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.25575.0>] 10.194.186.95 - - 'GET' /testdb_2631/_local%2Fe8eee148acfbe51c4ada82f7b4cc473a 404

[Wed, 02 Jun 2010 00:57:40 GMT] [info] [<0.25606.0>] 10.194.186.95 - - 'GET' /testdb_2631/_changes?style=all_docs&heartbeat=10000&since=0&feed=continuous 200

Log messages from the destination couch logfile:

[Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.31607.2>] 10.194.143.191 - - 'PUT' /testdb_2631 412

[Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.31609.2>] 127.0.0.1 - - 'GET' /testdb_2631/ 200

[Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.31641.2>] 127.0.0.1 - - 'GET' /testdb_2631/ 200

[Wed, 02 Jun 2010 00:57:10 GMT] [info] [<0.31658.2>] 127.0.0.1 - - 'GET' /testdb_2631/_local%2Fe8eee148acfbe51c4ada82f7b4cc473a 404

[Wed, 02 Jun 2010 00:57:20 GMT] [error] [<0.31676.2>] {error_report,<0.31.0>,
    {<0.31676.2>,crash_report,
     [[{initial_call,{couch_rep_changes_feed,init,['Argument__1']}},
       {pid,<0.31676.2>},
       {registered_name,[]},
       {error_info,
           {exit,changes_timeout,
               [{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},
       {ancestors,
           [<0.31624.2>,couch_rep_sup,couch_primary_services,couch_server_sup,
            <0.32.0>]},
       {messages,[]},
       {links,[<0.31624.2>,<0.31677.2>]},
       {dictionary,[]},
       {trap_exit,true},
       {status,running},
       {heap_size,610},
       {stack_size,24},
       {reductions,711}],
      [{neighbour,
           [{pid,<0.31677.2>},
            {registered_name,[]},
            {initial_call,{ibrowse_http_client,init,['Argument__1']}},
            {current_function,{gen_server,loop,6}},
            {ancestors,
                [<0.31676.2>,<0.31624.2>,couch_rep_sup,couch_primary_services,
                 couch_server_sup,<0.32.0>]},
            {messages,[]},
            {links,[<0.31676.2>,#Port<0.17054>]},
            {dictionary,
                [{my_trace_flag,false},
                 {ibrowse_trace_token,["10.194.143.191",58,"5984"]}]},
            {trap_exit,false},
            {status,waiting},
            {heap_size,987},
            {stack_size,9},
            {reductions,461}]}]]}}

[Wed, 02 Jun 2010 00:57:20 GMT] [error] [<0.31624.2>] {error_report,<0.31.0>,
              {<0.31624.2>,crash_report,
               [[{initial_call,{couch_rep,init,['Argument__1']}},
                 {pid,<0.31624.2>},
                 {registered_name,[]},
                 {error_info,{exit,{{badmatch,{error,changes_timeout}},
                                    [{couch_rep,do_init,1},
                                     {couch_rep,init,1},
                                     {gen_server,init_it,6},
                                     {proc_lib,init_p_do_apply,3}]},
                                   [{gen_server,init_it,6},
                                    {proc_lib,init_p_do_apply,3}]}},
                 {ancestors,[couch_rep_sup,couch_primary_services,
                             couch_server_sup,<0.32.0>]},
                 {messages,[{'EXIT',<0.31676.2>,changes_timeout}]},
                 {links,[<0.81.0>]},
                 {dictionary,[]},
                 {trap_exit,true},
                 {status,running},
                 {heap_size,1597},
                 {stack_size,24},
                 {reductions,3240}],
                []]}}

[Wed, 02 Jun 2010 00:57:20 GMT] [error] [<0.31608.2>] Uncaught error in HTTP request: {error,
                                 {case_clause,
                                  {error,
                                   {{{badmatch,{error,changes_timeout}},
                                     [{couch_rep,do_init,1},
                                      {couch_rep,init,1},
                                      {gen_server,init_it,6},
                                      {proc_lib,init_p_do_apply,3}]},
                                    {child,undefined,
                                     "e8eee148acfbe51c4ada82f7b4cc473a+continuous",
                                     {gen_server,start_link,
                                      [couch_rep,
                                       ["e8eee148acfbe51c4ada82f7b4cc473a",
                                        {[{<<"continuous">>,true},
                                          {<<"target">>,
                                           <<"http://127.0.0.1:5984/testdb_2631">>},
                                          {<<"source">>,
                                           <<"http://10.194.143.191:5984/testdb_2631">>}]},
                                        {user_ctx,null,
                                         [<<"_admin">>],
                                         <<"{couch_httpd_auth, default_authentication_handler}">>}],
                                       []]},
                                     temporary,1,worker,
                                     [couch_rep]}}}}}

[Wed, 02 Jun 2010 00:57:20 GMT] [info] [<0.31608.2>] Stacktrace: [{couch_rep,start_replication_server,1},
             {couch_rep,replicate,2},
             {couch_httpd_misc_handlers,handle_replicate_req,1},
             {couch_httpd,handle_request_int,5},
             {mochiweb_http,headers,5},
             {proc_lib,init_p_do_apply,3}]
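
When comparing the two logfiles, it can help to compute the gap between entries mechanically. In the destination excerpt above, the last request is logged at 00:57:10 and the crash report at 00:57:20. The following is a minimal sketch of a helper for that comparison; the function names and the regular expression are assumptions based only on the line prefix format visible in these excerpts, not on any CouchDB tooling.

```python
import re
from email.utils import parsedate_to_datetime

# Matches the "[timestamp] [level] [pid]" prefix seen in the excerpts above.
LOG_PREFIX = re.compile(r"^\[([^\]]+)\] \[(\w+)\] \[(<[^>]+>)\]")


def parse_log_prefix(line):
    """Return (datetime, level, pid) for a log line, or None if no prefix."""
    match = LOG_PREFIX.match(line)
    if match is None:
        return None
    stamp, level, pid = match.groups()
    # The timestamps use the RFC 2822 date format, e.g.
    # "Wed, 02 Jun 2010 00:57:10 GMT".
    return parsedate_to_datetime(stamp), level, pid


def seconds_between(first_line, second_line):
    """Seconds elapsed from first_line to second_line."""
    first = parse_log_prefix(first_line)
    second = parse_log_prefix(second_line)
    if first is None or second is None:
        raise ValueError("both lines must start with a log prefix")
    return (second[0] - first[0]).total_seconds()
```

Continuation lines of a multi-line crash report have no prefix, so `parse_log_prefix` returns None for them and they can be skipped when scanning a whole file.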